Statalist The Stata Listserver



st: RE: < and > operand in recode


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: < and > operand in recode
Date   Sun, 20 Aug 2006 17:51:51 +0100

Terminology appears to be a small problem here. 

I understand = to indicate equality and >, >=, < or <= 
to indicate inequality. Your contradictory usage
is rather surprising. 

That aside, the key point is that -recode- is documented 
as being for recoding categorical variables, meaning 
in practice categorical variables coded as 
integers. 

-recode- does allow many-to-one mappings, but it 
really is not a good idea to use it for re-coding 
a continuous variable. Even though your work-around 
apparently worked for you, it is no more than 
a work-around. Also, there are plenty of possible
values between 0 and 0.0001, etc., and testing 
for equality and inequality with a decimal fraction
is usually problematic. 
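
A minimal sketch of why exact comparisons with a decimal fraction are 
risky, assuming a variable stored as float (the name x is hypothetical 
and used only for illustration): 

```stata
clear
set obs 1
gen float x = 0.1          // 0.1 has no exact binary representation
count if x == 0.1          // float value compared with the double-precision literal
count if x == float(0.1)   // literal rounded to float precision before comparing
```

The first -count- should find no observations and the second should find 
one, because 0.1 held as a float differs slightly from 0.1 held as a double. 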

Now Stata as such doesn't really have any idea
of what a categorical variable is, and thus does 
not declare your use to be an error, although
there are several good arguments for strictness
in such matters (or at least for a -force- option 
which shows that you realise exactly what 
you are doing). 

For your coding a perfectly respectable 
approach is 

gen newvar = 1 if var <= 1
replace newvar = 2 if var <= 2 & missing(newvar) 
replace newvar = 3 if var <= 5 & missing(newvar) 
replace newvar = 4 if var <= 10 & missing(newvar) 
replace newvar = 5 if var < . & missing(newvar) 
replace newvar = . if var == . 

That may look long-winded, but it is perfectly 
explicit and easy to understand.  

Another perfectly respectable approach is to 
make use of -inrange(,)-:

gen newvar = 1 if inrange(var,.,1) 
replace newvar = 2 if inrange(var,1,2) & missing(newvar) 
replace newvar = 3 if inrange(var,2,5) & missing(newvar) 
replace newvar = 4 if inrange(var,5,10) & missing(newvar) 
replace newvar = 5 if inrange(var,10,.) & missing(newvar) 
replace newvar = . if var == . 

although with -inrange()- it is not so transparent 
what happens in the case of equality with either 
argument. See the help for -inrange()-. 
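
As a quick check of that boundary behaviour, the following sketch 
evaluates -inrange()- directly. As I read the help, both ends are 
inclusive and a missing first argument is never in range; the test 
values are chosen only for illustration: 

```stata
display inrange(1, 1, 2)     // 1: equal to the lower argument counts as in range
display inrange(2, 1, 2)     // 1: equal to the upper argument counts as in range
display inrange(., 10, .)    // 0: a missing first argument is never in range
display inrange(0.5, ., 1)   // 1: a missing lower argument means no lower limit
```

So a value of exactly 1 satisfies both inrange(var,.,1) and inrange(var,1,2); 
in the commands above it is the -missing(newvar)- condition that makes the 
first assignment stick. 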

Yet another perfectly respectable approach is to 
make use of -cond()-. 

gen newvar = cond(var <=  1, 1, ///
             cond(var <=  2, 2, ///
             cond(var <=  5, 3, ///
             cond(var <= 10, 4, ///
             cond(var <   ., 5, .)))))

That is all one command; in a do-file the /// at the end of 
each line continues it onto the next. Careful layout and use
of a good text editor to check balanced parentheses 
are recommended. 
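
Whichever approach is used, it is worth checking the result. The sketch 
below does this on a small made-up dataset (the values are assumptions 
chosen to sit on and between the cutpoints) and is meant to be run from 
a do-file: 

```stata
clear
input var
0.5
1
1.5
2
5
7.5
10
12
.
end

gen newvar = cond(var <=  1, 1, ///
             cond(var <=  2, 2, ///
             cond(var <=  5, 3, ///
             cond(var <= 10, 4, ///
             cond(var <   ., 5, .)))))

* each code should cover only the intended range of var
tabstat var, by(newvar) statistics(min max n)

* missing values of var should stay missing in newvar
assert missing(newvar) if missing(var)
```

The -tabstat- table shows the minimum and maximum of var within each 
code, which makes any value falling in the wrong bin easy to spot. 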

Personally, for your example problem, I like -cond()- best. 

For a discursive tutorial see

SJ-5-3  pr0016  . . Depending on conditions: a tutorial on the cond() function
        . . . . . . . . . . . . . . . . . . . . . . .  D. Kantor and N. J. Cox
        Q3/05   SJ 5(3):413--420                                 (no commands)


Nick 
[email protected] 

b. water
 
> Stata 8.2,
> 
> i wanted to recode a variable, which consisted of continuous 
> number, something to the effect of:
> 
> <=1 coded 1 (<= i.e. meaning less than or equal to)
> >1 to <=2 coded 2
> >2 to <= 5 coded 3
> >5 to <=10 coded 4
> >10 coded 5
> 
> when i tried to use the equality operands (i.e. < or > in my 
> recode commands, it gives an error message 'unknown el <2 in 
> rule') so after consulting my manual on [R] recode, i managed 
> by recoding: 
> 
> 0.0001/1 = 1
> 1.0001/2 = 2
> .
> .
> 10/1000 = 5
> etc
> 
> being careful to make sure that the parameters included all 
> the values.
> 
> i would appreciate if someone could confirm that equality 
> sign cannot be used in recode. would appreciate it too if 
> anyone can point out an alternative/better way to accomplish 
> the recode.



