
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Generating a unique ID
Date: Wed, 21 Nov 2007 20:39:47 -0500

Mansour Farahani <mfarahan@hsph.harvard.edu> :

Note that Friedrich Huebler addressed the problem you posed, namely an id "so that I have a UNIQUE ID for every row in the data", not the one you now seem to desire, namely an id for every row that might be in the data.

There are a few ways to do the latter. One is to -fillin- and then use Friedrich's -egen- approach, then (optionally) -drop- the artificial observations. Another approach is to concatenate, as Friedrich suggested and you requested. Did you try his second suggestion?

Here are both approaches illustrated:

    clear
    input state year age sex residence
    1 1970 1 1 2
    1 1970 1 2 1
    1 1970 1 2 2
    1 1970 2 2 1
    1 1970 2 2 2
    1 1970 3 1 1
    1 1970 3 2 1
    1 1970 3 2 2
    end
    * approach 2: string id built by concatenating the five variables
    egen id2=concat(state year age sex residence)
    * approach 1: add every absent combination as an artificial observation
    fillin state year age sex residence
    li if _fillin, noo
    * number all combinations, so absent rows leave gaps in id1
    egen id1=group(state year age sex residence)
    * optionally drop the artificial observations again
    drop if _fillin
    drop _fillin
    li, noo

On Nov 21, 2007 7:15 PM, Mansour Farahani <mfarahan@hsph.harvard.edu> wrote:
> Friedrich,
> Thank you very much for your help. The point is that in my dataset the missing values are not coded as missing; instead, the whole row is not there.
> state year age sex residence
> 1 1970 1 1 2
> 1 1970 1 2 1
> 1 1970 1 2 2
> 1 1970 2 2 1
> 1 1970 2 2 2
> 1 1970 3 1 1
> 1 1970 3 2 1
> 1 1970 3 2 2
>
> As you can see above, the row with age 1, sex 1, and residence 1 is missing, but it is not explicitly coded in the dataset. As a result, when I use egen group( ) I get a continuous set of numbers for the ID (with no jump when a record is missing).
> thanks again,
> Mansour
>
> >>> "Friedrich Huebler" <fhuebler@gmail.com> 11/21/07 6:49 PM >>>
>
> Mansour,
>
> You can use -egen group- if you only want a unique ID that does not
> necessarily contain information on the underlying data.
>
> . egen id = group(state year age sex residence)
>
> Missing data in any of the five variables will result in a missing
> value for the ID. Another option, closer to what you suggested, is to
> concatenate the identifying variables.
>
> . gen id = string(state) + " " + string(year) + " " + string(age) + " " + string(sex) + " " + string(residence)
>
> Friedrich
>
> On Nov 21, 2007 6:22 PM, Mansour Farahani <mfarahan@hsph.harvard.edu> wrote:
> > Dear Statalisters:
> > I have an unbalanced dataset where variables of interest, categorized by age and place of residence (urban, rural) in 15 age groups, are observed in 15 states over 33 years. For example, the mortality rate for rural boys age 10-15 in state i at time t. Some of the categories in certain years and states are (randomly?) missing.
> >
> > I want to create a unique ID number based on state (1 to 15), year (1 to 33), age (1 to 15), sex (1,2), and place of residence (1,2), so that I have a UNIQUE ID for every row in the data. One possible format is that the ID has a maximum of 8 digits, xx xx xx x x, such that the first 2 digits are the state (00 to 15), then the year, and so on.
> >
> > I appreciate any idea on how I can do it in Stata.
> >
> > Many thanks,
> >
> > Mansour

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
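
For the fixed-width numeric format sketched in the original question (xx xx xx x x), one further possibility is to build the ID arithmetically. This is only a minimal sketch and was not posted in the thread. It assumes the five variables are coded as described in the question (state 1 to 15, year 1 to 33, age 1 to 15, sex 1 or 2, residence 1 or 2); the example data above use calendar years such as 1970, which would first need recoding to 1 to 33. The variable name id3 is hypothetical.

```stata
* Sketch only: assumes state 1-15, year 1-33, age 1-15, sex 1-2, residence 1-2.
* Calendar years such as 1970 do not fit in two digits, so the asserts
* stop with an error before any ID is built if the coding differs.
assert inrange(state, 1, 15) & inrange(year, 1, 33) & inrange(age, 1, 15)
assert inlist(sex, 1, 2) & inlist(residence, 1, 2)

* id3 is a hypothetical name; -long- stores 8-digit integers exactly
gen long id3 = state*1000000 + year*10000 + age*100 + sex*10 + residence

* display leading zeros so every ID shows all 8 digits
format id3 %08.0f

* exit with an error if id3 does not uniquely identify the observations
isid id3
```

Because each variable occupies its own digit positions, a combination that is absent from the data simply leaves a gap in the sequence of IDs, and no -fillin- step is needed.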

**References**: **Re: st: Generating a unique ID** *From:* "Mansour Farahani" <mfarahan@hsph.harvard.edu>
