
From: Nick Winter <nwinter@virginia.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: How to create a random number identifier number
Date: Thu, 12 Nov 2009 17:52:32 -0500

    by hhid: replace sortvar=sortvar[1]

- Nick Winter

Anna Reimondos wrote:

I successfully implemented the proposed solution and checked that the new values are in fact unique identifiers. However, I then ran into another problem when trying to do a similar thing for households.

Each of the 11,000 people lives in a household (around 5,000 households in total), and a unique 5-digit household identifier (hhid) shows which people live in the same household. In other words, several persons (identified by personid) live in the same household (hhid). In the same way as I did for "personid", I would like to create a new household identifier that has five digits and is unique.

Example:

    person   hhid    newhhid
    1        25643   13584
    2        25643   13584
    3        68534   34257

I tried modifying the code for the person id and applying it to the household id, but this does not work: randomly sorting on 'sortvar' loses the natural ordering in which members of the same household sit on consecutive lines.

My current solution works, I think, but it means I keep only one line per household, save a new dataset, randomly sort it, create the new identifier and then merge it back in. Would there be a way to do it while still "staying" in the original dataset?
    *-----------------------------------------------------------------------------------------------
    *Save dataset
    capture drop sortvar
    *As before: random number for random sorting
    gen sortvar=1 + int(12759*uniform())
    replace sortvar=sortvar+10000 if sortvar<10000
    *How many people live in the household
    bysort hhid: gen numbers=_N
    keep hhid numbers sortvar
    *Identify the first observation in each household
    bysort hhid: gen first=_n if _n==1
    *Keep only one observation (the first) per household
    keep if first==1
    *Randomly sort the data
    sort sortvar
    *New household id, shifted by 10000 so that it has five digits
    gen newhhid=_n
    replace newhhid=newhhid+10000 if newhhid<=10000
    *Expand so each household has as many rows as people in the household
    expand numbers
    sort hhid
    *Merge back this dataset using hhid, into the original dataset.
    *-----------------------------------------------------------------------------------------------

My original problem has been solved, and my current solution kind of works, but I would be interested to hear if anyone has a more elegant way of doing this.

Thanks very much,
Anna

On Fri, Nov 13, 2009 at 6:10 AM, Michael McCulloch <mm@pinest.org> wrote:

Thanks Martin. I imagine there's also a simpler (i.e. more elegant) way to create the 5-digit new id than this:

    replace newpersonid=newpersonid+50000 if newpersonid<11000

On Nov 12, 2009, at 12:11 AM, Martin Weiss wrote:

<>

The -destring- line could easily be omitted without loss of functionality...

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Michael McCulloch
Sent: Thursday, 12 November 2009 04:16
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: How to create a random number identifier number

Anna,

This simulated example is a better approach that is faithful to your need for the newpersonid to have 5 digits.
Michael

    ********* begin example
    clear
    set obs 11000
    gen personid=_n
    * Shift every id by 10000 so all simulated ids are unique and have 5 digits
    replace personid=personid+10000
    gen sortvar=1 + int(11000*uniform())
    replace sortvar=sortvar+10000 if sortvar<10000
    sort sortvar
    gen str5 newpersonid=string(_n)
    destring newpersonid, replace
    replace newpersonid=newpersonid+50000 if newpersonid<11000
    list personid newpersonid in 10050/11000
    codebook
    ********* end example

Dear Anna,

if you sort on some variable other than personid, or perform a random sort, you could then:

    gen new_personid = _n

This creates a variable whose value equals the sequence number of each record, which is why you have to create some sort order other than personid.

Michael

On Nov 11, 2009, at 6:37 PM, Anna Reimondos wrote:

Hello,

I am experiencing problems creating a unique set of numbers for my dataset. I have a dataset with around 11,000 subjects or persons, each of whom has a unique identifier that is 5 digits long (personid). I need to create a concordance file which lists the original 5-digit "personid" and matches it to a new, randomly created identifier for each person. This new identifier (new_personid) also has to be 5 digits long.

Example:

    personid   new_personid
    10526      35624
    18594      21893
    54632      12489

I have tried playing around with the gen x = uniform() function, but to no avail: I am unable to create exactly 11,000 unique numbers with 5 digits. I also tried just using the egen x=seq() command, but then the ids are sequential and not random, and I am afraid that someone could then figure out how to match the personid and the new_personid.
Any help would be much appreciated,

Thanks
Anna

(Using STATA 10.1, Windows Vista)

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Michael McCulloch
Pine Street Foundation
124 Pine Street
San Anselmo, CA 94960-2674
tel: 415-407-1357
fax: 206-338-2391
mm@pinestreetfoundation.org

--
--------------------------------------------------------------
Nicholas Winter                      434.924.6994 t
Assistant Professor                  434.924.3359 f
Department of Politics               nwinter@virginia.edu e
University of Virginia               faculty.virginia.edu/nwinter w
PO Box 400787, 100 Cabell Hall
Charlottesville, VA 22904
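
Nick Winter's one-line suggestion at the top of this message gives every member of a household the same random sort key, so households can be shuffled without leaving the original dataset. The following is a minimal sketch of how that line might fit into a complete sequence, run on simulated data. The simulated hhid values, the seed, the 10000 offset, the running-sum step and the use of runiform() (the thread uses the older name uniform()) are all assumptions made for illustration and are not taken from the original posts.

```stata
* Simulated data (assumed layout): 11,000 people in roughly 5,000 households
clear
set seed 20091112
set obs 11000
gen long personid = 10000 + _n
gen long hhid     = 20000 + ceil(_n/2.2)   // hypothetical household ids

* One random draw per person ...
gen double sortvar = runiform()

* ... then copy the first member's draw to everyone in the household
* (this is the step suggested in the reply; bysort also does the sorting)
bysort hhid (personid): replace sortvar = sortvar[1]

* Put households in random order; hhid as a second key keeps members
* on consecutive lines even in the unlikely event of a tied draw
sort sortvar hhid personid

* Count households in that order and shift into the five-digit range
gen long newhhid = 10000 + sum(hhid != hhid[_n-1])

drop sortvar
list personid hhid newhhid in 1/10, sepby(hhid)
```

With a 10000 offset the new ids stay within five digits only while there are fewer than 90,000 households. The new ids are consecutive in the shuffled order, so the concordance between hhid and newhhid is hidden only as long as the sort key and seed are not kept with the released file.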

**References**:

- **Re: AW: st: How to create a random number identifier number** *From:* Michael McCulloch <mm@pinest.org>
- **Re: AW: st: How to create a random number identifier number** *From:* Anna Reimondos <areimondos@gmail.com>


© Copyright 1996–2014 StataCorp LP