st: editing string variables to remove letters and keep only numbers

From   Michael McCulloch
Subject   st: editing string variables to remove letters and keep only numbers
Date   Mon, 17 Jun 2013 15:53:41 -0700

I have a variable, ID, in my dataset that (due to changes in data-entry practices over time) contains values in several styles:

	- a number (e.g. 164)
	- a letter-number combination (e.g. e64)
	- a comma-separated letter-number combination (e.g. e64,e65) 

To (A) remove the letters and (B) split the comma-separated values into two variables, ID1 and ID2, I wrote the following code:

. split ID, p(",")
. gen str id1_new = ""		// new variable to hold ID1 without the leading "e"
. replace id1_new = substr(ID1,2,3)	// takes 3 characters starting at position 2

This successfully splits ID into ID1 and ID2.

The substr() step also works when a value has a preceding letter (e64 is changed to 64).
However, for a 3-digit value WITHOUT A PRECEDING LETTER, the first digit is removed (164 is changed to 64).
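
The first digit is lost because substr() with a start position of 2 always skips the first character, whether it is a letter or a digit. As a minimal sketch of one possible workaround (not a tested solution for the real data), the first character could be dropped only when it is not a digit. The toy data and the variable names ID1_new and ID2_new below are assumptions made for illustration:

```stata
* Hypothetical toy data standing in for the real dataset
clear
input str10 ID
"164"
"e64"
"e64,e65"
end

* Split on commas; this creates ID1 and ID2
split ID, parse(",")

* Copy each piece, then drop the first character only if it is not a digit
foreach v of varlist ID1 ID2 {
    generate `v'_new = `v'
    replace `v'_new = substr(`v', 2, .) if !inrange(substr(`v', 1, 1), "0", "9")
}

list ID ID1_new ID2_new
```

Under these assumptions, 164 stays 164, e64 becomes 64, and e64,e65 becomes 64 and 65. Passing a missing value (.) as the third argument of substr() returns the rest of the string, so the result does not depend on the ID having exactly three digits. IDs with more than one leading letter would need a different approach, such as a regular expression.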

Any suggestions would be appreciated.

Best wishes,
Michael McCulloch, LAc MPH PhD

Pine Street Foundation, since 1989
124 Pine Street | San Anselmo | California | 94960-2674  
P: (415) 407-1357 | F: (206) 338-2391
