
st: RE: how to eliminate characters from string variables


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: how to eliminate characters from string variables
Date   Thu, 14 Nov 2002 19:42:01 -0000

Radu Ban wrote 
> 
> I have a string variable that has observation of the type
> C1.A
> C1.B
> C1 C
> C1D
> C1e
> 
> For my purposes C1.A, C1 A, C1A, and C1a are the same so I 
> would like to
> eliminate all dots "." and spaces " " and lower cases from 
> my observations.
> I noticed the subinstr() and upper() functions, which look like 
> 'my friends', but
> when I try:
> 
> for var r*fcode b*fcode: replace X = upper(X)\
> 
> replace X = subinstr(X,".","",1)\
> 
> replace X = subinstr(X," ","",1);               (note that 
> delimiter is ;)
> 
> I get the command to run over the first variable but 
> something goes wrong
> with the third command:
> 
> ->  replace r1fcode = upper(r1fcode)
> (3 real changes made)
> 
> ->  replace r1fcode = subinstr(r1fcode,".","",1)
> (61 real changes made)
> 
> ->  replace r1fcode = subinstr(r1fcode," `","' `",1)\ tab r1fcode"'
> Unknown function ()
> r(133);
> 
> For some reason, Stata puts in a ` between the " " in the 
> third command. Can
> anyone point out my mistake here? 

(What's that extra -tab- stuff?) 

I think in essence you've overloaded -for-'s little brain 
and it's confused. 

I am not sure what the problem is, but there are at 
least two possibilities. 

1. Quoted strings with only spaces can be problematic. 
-for- tries to take your quoted strings and handle 
them carefully using compound double quotes `" "' 
but it doesn't always succeed. 

2. The use of the delimiter ; could be problematic. 
My guess is that there is a conflict and that 
-for- is misinterpreting the semi-colon. 
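
As background to point 1, here is a minimal sketch of compound double quotes, which let a string contain ordinary double quotes. It only illustrates the quoting syntax and says nothing about what -for- does internally; the local macro name s is arbitrary.

```stata
* compound double quotes open with `" and close with "'
local s `"a "quoted" word"'
display `"`s'"'

* a string consisting of a single space, delimited both ways
display "[" " " "]"
display "[" `" "' "]"
```
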

Overloading -for- and getting into a mess has been 
a very frequent reason for postings to this list 
over the years. -for- has been a handy Swiss army 
knife, for me too, but it's not robust enough to cope 
with anything but relatively simple problems, and now 
there are better tools. 

The often-repeated advice, which applies to about 90% 
of the -for- problems I see, is to use -foreach- 
and/or -forvalues-. 

I coded up a simpler version of your -for- 
code and, despite (1) tweaking it in various ways,
(2) running traces, and (3) having some knowledge of 
how -for- works, I ran into the same brick 
wall as you did, or similar ones. 

I coded this up with -foreach- and it 
worked first time: 

foreach v of var code { 
	replace `v' = upper(`v')
	replace `v' = subinstr(`v',".","",1)
	replace `v' = subinstr(`v'," ","",1)
}       
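
The same loop can be pointed at the variables in the original question and made to remove every dot and space, not just the first in each value. What follows is a minimal sketch: the toy data are invented for illustration, and it assumes that all variables matching r*fcode and b*fcode are string. Giving missing (.) as the last argument of subinstr() asks for all occurrences to be replaced.

```stata
* toy data, invented for illustration only
clear
input str6 r1fcode str6 b1fcode
"C1.A"  "c1 b"
"C1 C"  "C1..D"
"C1e"   "C 1.E"
end

foreach v of varlist r*fcode b*fcode {
	replace `v' = upper(`v')
	* missing (.) as the fourth argument replaces every occurrence
	replace `v' = subinstr(`v', ".", "", .)
	replace `v' = subinstr(`v', " ", "", .)
}

list
```

With a final argument of 1, as in the loop above, a value such as "C1..D" would keep its second dot.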

P.S. Avoid semi-colons as delimiters, unless 
you are a C programmer or Roger Newson.  

Nick 
n.j.cox@durham.ac.uk 


