Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: "Eric A. Booth" <eric.a.booth@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How do I split a string variable without spaces by capital letters?
Date: Mon, 19 Aug 2013 10:36:13 -0500
<>
Agreed, -moss- is great for this, but you can also do this with built-in
string functions if you are interested. Example:

*****************!
clear all
inp str13(v1)
"TestOne"
"ThisistestTwo"
"AndThree"
end
* search the reversed string so that strpos() counts from the end
g v2 = reverse(v1)
g pos = .
g l = length(v1)
foreach x in `c(ALPHA)' {
    * keep looking while no capital has been found (pos is . or 0)
    * or only the leading capital has been found (pos == l)
    replace pos = strpos(v2, "`x'") if inlist(pos, ., 0, l)
}
drop v2
* split just before the capital that was found
* (this assumes a single capital after the first character)
g first = substr(v1, 1, l-pos)
g second = substr(v1, l-pos+1, l)
list
*****************!

EAB

On Mon, Aug 19, 2013 at 10:31 AM, Robert Picard <picard@netbox.com> wrote:
> You can use -moss- (available from SSC) to handle this problem. The
> following works with your example:
>
> moss v1, match("([A-Z][^A-Z]*)") regex
>
> The pattern indicates that you are looking for substrings that start
> with a capital letter (i.e., [A-Z]) followed by zero or more non-capital
> letters (i.e., [^A-Z]*).
>
> On Mon, Aug 19, 2013 at 10:06 AM, Andrew Dickens <adickens@econ.yorku.ca> wrote:
>> Hi all,
>>
>> I'm currently running Stata 10, and I'm having a problem splitting a string
>> variable by capital letters. Elena Vidal posted something under a similar
>> title, http://www.stata.com/statalist/archive/2011-11/msg01195.html, but
>> her problem is somewhat different than mine and I was unable to
>> troubleshoot.
>>
>> An example of my data is as follows:
>>
>> clear all
>> inp str13(v1)
>> "TestOne"
>> "ThisistestTwo"
>> "AndThree"
>> end
>>
>> The problem is the capital letter I wish to split each cell by is not
>> consistently placed.
>>
>> I tried splitting using this code:
>>
>> split v1, p(upper(a-z))
>> or
>> split v1, p(upper(.))
>>
>> but this just generates an identical variable to v1.
>>
>> What I would like to do is create two new variables, so the first
>> observation of my example would have "Test" in the first new variable and
>> "One" in the second new variable. Suggestions would be greatly appreciated.
>>
>> Thank you for your consideration.
>>
>> Andrew
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/