Statalist



st: Recode - a cautionary tale


From   "Allan Reese (Cefas)" <allan.reese@cefas.co.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Recode - a cautionary tale
Date   Wed, 16 Sep 2009 10:54:47 +0100

A colleague used the recode function, following the example in
[U]25.1.2.  It reported some missing values, but she knew there were
some missing items.  Unfortunately some actual values also got recoded
as missing.
The command was:
  gen byte xcat = recode(x, 20, 40, 60, 80, 100, 120)
and the missing values should have been 120.

[U]12.2.2 lists the ranges for each numeric type, which for byte is -127
to +100, but does not specify what should happen when an out of range
value is assigned. I've never had this problem because I'm too idle to
save a few bytes by specifying the type. ;-)
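
A minimal sketch that reproduces the effect, assuming six made-up values of x (the data and the variable names are hypothetical, not my colleague's). The top category, 120, is above the byte maximum of 100, so it is stored as missing:

```stata
clear
set obs 6
* Hypothetical data: 15, 35, 55, 75, 95, 115
generate x = 20*_n - 5

* recode() returns the first listed value that is >= x, so x = 115 maps to 120.
* 120 does not fit in a byte (maximum 100), so that observation becomes missing.
generate byte xcat = recode(x, 20, 40, 60, 80, 100, 120)

* Observations that are missing in xcat although x itself is not missing
count if missing(xcat) & !missing(x)
list x xcat
```

The count is a cheap check to run after any -generate byte-: a nonzero result with no missing x means values were lost to the storage type rather than to the data.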

Tech support point out that if you don't force Stata to use a "byte"
then it will gracefully detect the out of range values and automatically
promote to the correct storage type. "But when you specify -generate
byte- you are using the advanced syntax and telling Stata that you
really want it to stay a byte no matter what values you pass it." In my
opinion the advice in [U]25.1.2 is too Delphic, and the comment that "we
(wisely) told Stata to generate the new variable as a byte" can be
deleted.
 
. clear

. set obs 3
obs was 0, now 3

. generate byte x = _n

. replace x = x + 200
x was byte now int
(3 real changes made)

. replace x = x + 40000
x was int now long
(3 real changes made)

. replace x = x + .5
x was long now double
(3 real changes made)
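
The log above shows -replace- promoting the type as needed. For -generate-, one way to keep storage small without risking lost values is to omit the type and run -compress- afterwards. This is a sketch on hypothetical data, not part of tech support's reply:

```stata
clear
set obs 6
* Hypothetical data: 15, 35, 55, 75, 95, 115
generate x = 20*_n - 5

* No storage type given, so the default type (float unless changed) holds 120
generate xcat = recode(x, 20, 40, 60, 80, 100, 120)

* compress picks the smallest type that holds every value without loss;
* 120 exceeds the byte maximum, so xcat should end up as int
compress xcat
describe xcat
list x xcat
```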

In giving advice, I had been thinking of the recode command rather than
the function: the command makes it easier to handle end intervals with
min/max.  Another option is egen with cut(), whose icodes option
substitutes integer codes (0, 1, 2, ...) that the label option labels
with the cutpoint values.  Using icodes makes it less likely that byte
storage will overflow.
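
A sketch of both alternatives on hypothetical data (the variable names xcat_cmd and xcat_cut are made up for illustration). Note that cut() uses intervals closed on the left, so a value exactly on a cutpoint is grouped differently from recode(), and values outside the at() range become missing:

```stata
clear
set obs 6
* Hypothetical data: 15, 35, 55, 75, 95, 115
generate x = 20*_n - 5

* recode command: min and max cover the end intervals. Where rules overlap
* (e.g. x == 20), the first matching rule applies, which mimics recode().
recode x (min/20 = 20) (20/40 = 40) (40/60 = 60) (60/80 = 80) ///
         (80/100 = 100) (100/max = 120), generate(xcat_cmd)

* egen cut(): icodes stores 0, 1, 2, ... instead of the cutpoints, and
* label attaches the left-hand cutpoint of each interval as a value label.
egen xcat_cut = cut(x), at(0 20 40 60 80 100 120) icodes label

list x xcat_cmd xcat_cut
```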

Allan



