Re: st: using the command substr in stata

From: Svend Juul
Subject: Re: st: using the command substr in stata
Date: Wed, 30 Jul 2008 09:44:37 +0200

```
Carmen wrote:

I have a string variable called disease_ICD (oldvar) which has
the values of "International Statistical Classification of
Diseases and Related Health Problems - ICD 9 and ICD 10"

I need to create a new variable disease_ICDgroup (newvar)
containing grouped values of disease_ICD (oldvar). The
equivalent in EPI INFO is:

. Define newvar TEXTINPUT

. IF(substring(oldvar,1,3)>="001"AND substring(oldvar,1,3)<"799"THEN
ASSIGN newvar ="1" END

. IF(substring(oldvar,1,3)>="A00"AND substring(oldvar,1,3)<"U99"THEN
ASSIGN newvar ="1" END

. IF(substring(oldvar,1,3)>="800"AND substring(oldvar,1,3)<"999"THEN
ASSIGN newvar ="2" END

. IF(substring(oldvar,1,3)>="V00"AND substring(oldvar,1,3)<"Y99"THEN
ASSIGN newvar ="2" END

Note: The first three characters from oldvar are the same
in all banks (more than 20 banks) which allowed me to create
ranges and commands that can be used in all banks.

How do I do this in STATA?

===============================================================

First, I would generate a helper variable -old3-, since it is used
repeatedly:

generate str old3 = substr(oldvar,1,3)

Next, create the grouped variable:

generate str newvar = ""
replace newvar = "1" if old3>="001" & old3<"799"
replace newvar = "1" if old3>="A00" & old3<"U99"
replace newvar = "2" if old3>="800" & old3<"999"
replace newvar = "2" if old3>="V00" & old3<"Y99"

I wonder, however, if you want -newvar- to be string; numeric
variables are handier. Instead of the string version above:

generate newvar = .
replace newvar = 1 if old3>="001" & old3<"799"
replace newvar = 1 if old3>="A00" & old3<"U99"
replace newvar = 2 if old3>="800" & old3<"999"
replace newvar = 2 if old3>="V00" & old3<"Y99"

Note that you may use the relational operators > and < with
strings. The rule is that strings follow dictionary sequence;
however, all uppercase letters come before lowercase, numbers
come before letters, and spaces or blanks come before
anything else. So:

" " < "12" < "2" < "A" < "AA" < "Z" < "a"

You could have found information about the substr() function by:

findit substring

Hope this helps
Svend
__________________________________________

Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000  Aarhus C, Denmark
Phone:  +45 8942 6090
Home:   +45 8693 7796
Email:  sj@soci.au.dk
__________________________________________

```
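
The steps in the reply can be tried end to end on a handful of codes. What follows is a minimal sketch, not part of the original message: the values of oldvar are made up for illustration, the numeric version of newvar is used, and the two `replace` lines per group are merged with `|`, which is a stylistic assumption rather than the author's method. The final `assert` checks the string ordering described in the reply.

```stata
* Toy data: these codes are invented for illustration only
clear
input str6 oldvar
"0010"
"4109"
"7990"
"8052"
"A009"
"J189"
"V011"
"X599"
""
end

* First three characters, used repeatedly below
generate str3 old3 = substr(oldvar, 1, 3)

* Numeric group variable: 1 or 2, missing if no range matches
generate byte newvar = .
replace newvar = 1 if (old3 >= "001" & old3 < "799") | (old3 >= "A00" & old3 < "U99")
replace newvar = 2 if (old3 >= "800" & old3 < "999") | (old3 >= "V00" & old3 < "Y99")

* Inspect the result, including observations left missing
list oldvar old3 newvar, clean
tabulate newvar, missing

* String comparisons follow the ordering described in the reply
assert " " < "12" & "12" < "2" & "2" < "A" & "A" < "AA" & "AA" < "Z" & "Z" < "a"
```

Because the upper bounds use `<` rather than `<=`, as in the original ranges, a code beginning with "799", "U99", "999" or "Y99" is left missing, as is an empty oldvar. If those boundary codes should be included, the comparisons would need `<=` instead.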