The Stata listserver

recode, was: Re: st: Re: multiple )))brackets, is there a more efficient way?


From   "Annelies Vos" <a.vos@erasmusmc.nl>
To   statalist@hsphsun2.harvard.edu
Subject   recode, was: Re: st: Re: multiple )))brackets, is there a more efficient way?
Date   Tue, 25 May 2004 16:28:48 +0200 (MEST)

Recoding might work (you may have noticed that I'm not a Stata expert),
but as Nick Cox mentioned, it's important to me that I can easily look
up what I did months later, and for that I don't think recoding is very
clear.
But what would happen if I used recode and some of the old
country-specific codes are the same as the new origin (country group)
codes? (This is, in fact, the case.)

example (fake):
natio 1 = south africa
natio 2 = somalia
natio 3 = argentina
natio 4 = russia
natio 5 = slovenia
natio 6 = the netherlands
-gen origin = natio
-recode origin 1 2=4 3=3 4 5=2 6=1
now I should have:
origin 1 (netherlands) includes natio 6(netherlands)
origin 2 (eastern europe) includes natio 4, 5 (russia, slovenia)
origin 3 (latin america) includes natio 3 (argentina)
origin 4 (africa) includes natio 1, 2 (south africa, somalia)

but I think I would have:
origin 1 does not exist; it's overwritten because it first became 4, but
then 4 was recoded to 2, so that now origin 2 (eastern europe) includes
not only natio 4 and 5, but also natio 1 and 2 (africa)...
etc.

Of course it's easy to check this in a little fake dataset. But the
point I'd like to make is that, if this is true, recode is not suitable for
data whose codes are not in the same sequence as the groups you want to put
them in (in my example, neighbouring countries do not have neighbouring
codes).
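
The check on a little fake dataset can be sketched as below. It assumes a Stata version that accepts parenthesized recode rules; the six rows are the fake natio codes from the example, nothing more. As documented, recode compares each rule with the value an observation had before the command ran (and uses the first rule that matches), so a value that has just become 4 is not passed on to the 4 5=2 rule.

```stata
* Fake data: the six natio codes from the example above
clear
input byte natio
1
2
3
4
5
6
end

generate byte origin = natio

* Each rule is matched against the value origin had BEFORE the command,
* so natio 1 and 2 become 4 and are not recoded a second time to 2.
recode origin (1 2 = 4) (3 = 3) (4 5 = 2) (6 = 1)

* Stop with an error if any group differs from the intended mapping
assert origin == 4 if inlist(natio, 1, 2)
assert origin == 3 if natio == 3
assert origin == 2 if inlist(natio, 4, 5)
assert origin == 1 if natio == 6

list natio origin, clean noobs
```

If the assert lines pass silently, the overlap between old and new codes did no harm, whatever order the codes are in.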

anyhow, thanks everybody for all those replies, some of them may be useful
for future problems.
Annelies
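
The lookup-table approach in Michael Blasnik's reply quoted below might look roughly like this. It is only a sketch: it assumes Stata 11 or later for the -merge m:1- syntax (the quoted reply uses the older form), and the temporary file standing in for nationcodes, its contents, and the fake main dataset are all made up for illustration.

```stata
* Run as one do-file so the tempfile macro stays defined.

* Hypothetical lookup table: one row per natio code with its origin group
clear
input int natio byte origin
1 4
2 4
3 3
4 2
5 2
6 1
end
tempfile nationcodes
save `nationcodes'

* Fake main dataset: several persons may share a natio code
clear
input int natio
6
1
4
3
2
5
1
end

* Many main-data rows per code, exactly one lookup row per code
merge m:1 natio using `nationcodes', keep(master match)

* _merge == 1 would flag natio codes missing from the lookup table
tabulate _merge
list natio origin, clean noobs
```

Listing the lookup file then documents the whole mapping in one place, which addresses the worry about reconstructing months later what was done.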

> OR just use a recode or is this missing the point?
>
> gen origin = natio
> recode origin 3=7 8 12 69 139 141=10 14=8 ... etc
>
> cheers
> Ade
>
>
> "Michael Blasnik" <michael.blasnik@verizon.net>
> Sent by: owner-statalist@hsphsun2.harvard.edu
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Re: multiple )))brackets, is there a more efficient way?
> 25-May-2004 14:45
> Please respond to statalist@hsphsun2.harvard.edu
>
> The suggestion to use inlist may be a step in the right direction, but if
> you have a coding for every country in the world, or any list with more
> than a few dozen codings, I would think that you should create a dataset
> with the codings (often this coding information is available in a way that
> makes the dataset creation fairly straightforward) and then use -merge- to
> bring them in.
>
> sort nation
> merge nation using nationcodes
>
> You can always list the nationcodes file to document what the mapping is.
>
> Michael Blasnik
> michael.blasnik@verizon.net
>
> ----- Original Message -----
> From: "Annelies Vos" <a.vos@erasmusmc.nl>
> To: <statalist@hsphsun2.harvard.edu>
> Sent: Tuesday, May 25, 2004 2:35 AM
> Subject: st: multiple )))brackets, is there a more efficient way?
>
>
>> Dear all,
>> in the FAQs I found the following very useful recommendation:
>> instead of:
>>            . generate byte a = 1 if y <= 20
>>            . replace a = 2 if y > 20 &  y <= 30
>>            . replace a = 3 if y > 30 & y <= 40
>>            . replace a = 4 if y > 40 & y <.
>>
>> do the following:
>>
>>            . #delim ;
>>            . generate byte a =
>>              cond(y<=20, 1,
>>              cond(y<=30, 2,
>>              cond(y<=40, 3,
>>              cond(y<., 4,
>>                . ))));
>>
>> However, the variable I want to use it for (nationality) has many
>> values (every country in the world), which should be recoded into
>> countrygroups. I don't really like the idea of having to count the
>> number of "opening brackets": "(" , to know with how many "closing
>> brackets": ")" I should end. Is there any easier solution for this?
>>
>> to explain a piece of my syntax:
>>
>> > #delim;
>> > generate byte origin =
>> > cond(natio==3, 7,
>> > cond(natio==8, 10,
>> > cond(natio==12, 10,
>> > cond(natio==14, 8,
>> > cond(natio==28, -9,
>> > cond(natio==54, 6,
>> > cond(natio==69, 10,
>> > cond(natio==82, 8,
>> > cond(natio==139, 10,
>> > cond(natio==141, 10,
>> ...etcetera
>> ...which I would like to end in another way than:
>> > . ))))))))))
>>
>> Thanks for any suggestions,
>>
>> Annelies Vos
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP