Statalist



st: RE: Splitting string variables without parse strings


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Splitting string variables without parse strings
Date   Fri, 8 May 2009 14:33:38 +0100

Note that the manual entry [D] split does point quite explicitly to -substr()- for such problems. 

Nick 


-----Original Message-----
From: Nick Cox 
Sent: 08 May 2009 14:29
To: '[email protected]'
Subject: RE: Splitting string variables without parse strings

You are correct about -split-. As the original author of -split- I can comment. 

It was designed specifically to cope with strings containing one or more parse strings, typically but not necessarily single characters such as spaces or commas. 
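For contrast, here is a minimal sketch of the kind of problem -split- is meant for, where a parse string does separate the pieces. The variable name codes, the stub part, and the toy data are assumptions made up for illustration only.

```stata
* toy data: pieces separated by commas (hypothetical example)
clear
input str12 codes
"10,3,A"
"0,9,XX"
end

* cut at each comma; new string variables are named part1, part2, part3
split codes, parse(",") generate(part)

list
```

With no parse string present in the data, -split- has nothing to cut on, which is why it does not help with fixed-position codes.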

I thought quite a bit about extending it to your kind of problem, but could see no easy way to do so that (a) did not complicate the syntax mightily and (b) was an improvement on direct use of -substr()-. 

-substr()- is, and has long been, the method of choice for your kind of problem. 

forval i = 1/4 { 
	gen sitc_`i' = substr(sitc, `i', 1) 
} 

is a solution to your problem, modulo your exact variable names. Thus it isn't very tricky at all. 
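If the codes are not all four characters long, the same loop can take its upper limit from the data. What follows is only a sketch under assumptions: the toy data are the two example codes from the question, the temporary variable len is a made-up name, and strlen() is the current name of the string-length function (older Stata versions call it length()).

```stata
* toy data taken from the example in the question
clear
input str4 sitc
"103A"
"09XX"
end

* the longest code in the data sets the loop bound
gen byte len = strlen(sitc)
summarize len, meanonly
local maxlen = r(max)
drop len

* one new string variable per character position
forvalues i = 1/`maxlen' {
    gen sitc_`i' = substr(sitc, `i', 1)
}

list
```

For codes shorter than the maximum, -substr()- returns an empty string for positions beyond the end of the string, so the later variables are simply empty in those observations.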

Nick 

Ben Carpenter

I have got a problem splitting one string variable into four new string variables, each containing one part (i.e. one digit) of the original 4-digit string variable.

My strings look like "103A" or "009X" (without the quotes), i.e. I have got 4-digit codes which contain numbers and letters.

I want to generate four new variables, each consisting of one digit of the original 4-digit string variable.



An Example:

sitc (variable name of the 4-digit string variable):
103A
09XX

After the split the data should look like:

sitc_1st_digit	sitc_2nd_digit	sitc_3rd_digit	sitc_4th_digit
1		0		3		A
0		9		X		X

As far as I know, the split command needs characters as parse strings to know where to "cut" the string. But I don't have any parse strings within my 4-digit string variable. How can I cope with that? Is there another command (or combination of commands) apart from "split" which can deal with the problem?

I would very much appreciate help with this tricky problem.
