Statalist The Stata Listserver



RE: st: RE: < and > operand in recode


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: < and > operand in recode
Date   Sun, 20 Aug 2006 19:25:00 +0100

This is indeed a further possibility. 

-irecode()- is a well-defined Stata function and 
this gives a concise one-line solution. And 
the definition is there in the help. 

I'll declare prejudices, however. -irecode()- 
is a function I rarely use, so I would 
have to look at the help to check the 
definitions. (The results run 0 up; 
an equally defensible rule is that
results run 1 up, and I would have 
to look up to see which was Stata's
choice.) Also, this is to my mind
less transparent than -cond()-. 
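
A minimal sketch of how one might check the 0-up rule without reading the help, by comparing -irecode()- against the nested -cond()- from earlier in the thread. The toy data and the names check1 and check2 are assumptions for illustration only; they are not from the original posts. The /// continuations assume this is run from a do-file.

```stata
* Sketch only: toy data; check1 and check2 are made-up names
clear
input var
0.5
1
1.5
2
5
7
10
12
.
end

* irecode() numbers the intervals from 0, hence the +1
gen check1 = irecode(var,1,2,5,10,.) + 1

* the same mapping spelled out with nested cond()
gen check2 = cond(var <=  1, 1, ///
             cond(var <=  2, 2, ///
             cond(var <=  5, 3, ///
             cond(var <= 10, 4, ///
             cond(var <   ., 5, .)))))

* stops with an error if the two codings ever disagree
assert check1 == check2
list
```

Values equal to a cut point (1, 2, 5, 10) fall in the lower category under both codings, and a missing var stays missing.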

But these prejudices will not 
be compelling for all readers, 
and are mentioned mostly to 
explain why I didn't think of 
that. 

Nick 
n.j.cox@durham.ac.uk 

Jeph Herrin
 
> What about:
> 
> . gen newvar = irecode(var,1,2,5,10,.)+1
> 
> ?
> 
> Nick Cox wrote:
> > Terminology appears to be a small problem here. 
> > 
> > I understand = to indicate equality and >, >=, < or <= 
> > to indicate inequality. Your contradictory usage
> > is rather surprising. 
> > 
> > That aside, the key point is that -recode- is announced 
> > as for recoding categorical variables, meaning 
> > in practice categorical variables coded as 
> > integers. 
> > 
> > -recode- does allow many-to-one mappings, but it 
> > really is not a good idea to use it for re-coding 
> > a continuous variable. Even though your work-around 
> > apparently worked for you, it is no more than 
> > a work-around. Also, there are plenty of possible
> > values between 0 and 0.0001, etc., and testing 
> > for equality and inequality with a decimal fraction
> > is usually problematic. 
> > 
> > Now Stata as such doesn't really have any idea
> > of what a categorical variable is, and thus does 
> > not declare your use to be an error, although
> > there are several good arguments for strictness
> > in such matters (or at least for a -force- option 
> > which shows that you realise exactly what 
> > you are doing). 
> > 
> > For your coding a perfectly respectable 
> > approach is 
> > 
> > gen newvar = 1 if var <= 1
> > replace newvar = 2 if var <= 2 & missing(newvar) 
> > replace newvar = 3 if var <= 5 & missing(newvar) 
> > replace newvar = 4 if var <= 10 & missing(newvar) 
> > replace newvar = 5 if var < . & missing(newvar) 
> > replace newvar = . if var == . 
> > 
> > That may look long-winded, but it is perfectly 
> > explicit and easy to understand.  
> > 
> > Another perfectly respectable approach is to 
> > make use of -inrange()-:
> > 
> > gen newvar = 1 if inrange(var,.,1) 
> > replace newvar = 2 if inrange(var,1,2) & missing(newvar) 
> > replace newvar = 3 if inrange(var,2,5) & missing(newvar) 
> > replace newvar = 4 if inrange(var,5,10) & missing(newvar) 
> > replace newvar = 5 if inrange(var,10,.) & missing(newvar) 
> > replace newvar = . if var == . 
> > 
> > although with -inrange()- it is not so transparent 
> > what happens in the case of equality with either 
> > argument. See the help for -inrange()-. 
> > 
> > Yet another perfectly respectable approach is to 
> > make use of -cond()-: 
> > 
> > gen newvar = cond(var <=  1, 1, ///
> >              cond(var <=  2, 2, ///
> >              cond(var <=  5, 3, ///
> >              cond(var <= 10, 4, ///
> >              cond(var <   ., 5, .)))))
> > 
> > That is all one command; the /// at the end of each line 
> > continues it onto the next when run from a do-file. Careful 
> > layout and use of a good text editor to check balanced 
> > parentheses are recommended. 
> > 
> > Personally, for your example problem, I like -cond()- best. 
> > 
> > For a discursive tutorial see
> > 
> > SJ-5-3  pr0016  . . Depending on conditions: a tutorial on the cond() function
> >         . . . . . . . . . . . . . . . . . . . . . . .  D. Kantor and N. J. Cox
> >         Q3/05   SJ 5(3):413--420                                 (no commands)
> >         tutorial on the cond() function
> > 
> > 
> > Nick 
> > n.j.cox@durham.ac.uk 
> > 
> > b. water
> >  
> >> Stata 8.2,
> >>
> >> i wanted to recode a variable, which consisted of continuous 
> >> number, something to the effect of:
> >>
> >> <=1 coded 1 (<= i.e. meaning less than or equal to)
> >> >1 to <=2 coded 2
> >> >2 to <=5 coded 3
> >> >5 to <=10 coded 4
> >> >10 coded 5
> >>
> >> when i tried to use the equality operands (i.e. < or >) in my 
> >> recode commands, it gave an error message ('unknown el <2 in 
> >> rule'), so after consulting my manual on [R] recode, i managed 
> >> by recoding: 
> >>
> >> 0.0001/1 = 1
> >> 1.0001/2 = 2
> >> .
> >> .
> >> 10/1000 = 5
> >> etc
> >>
> >> being careful to make sure that the parameters included all 
> >> the values.
> >>
> >> i would appreciate if someone could confirm that equality 
> >> sign cannot be used in recode. would appreciate it too if 
> >> anyone can point out an alternative/better way to accomplish 
> >> the recode.



