Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Barry Quinn <b.quinn@qub.ac.uk> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: Unique identifier from a string name |
Date | Thu, 24 Nov 2011 16:13:16 +0000 |
Thanks Nick/Maarten, that helps a lot.

Barry Quinn

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Maarten Buis
Sent: 24 November 2011 16:03
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Unique identifier from a string name

On Thu, Nov 24, 2011 at 4:10 PM, Barry Quinn wrote:
> The context of the problem is to build a panel from yearly data using firm names as the unique id with the -merge- command.

One solution is to first create a file with all firms, create the unique identifier, and merge these identifiers onto all subsequent files. Say you have three years stored in files called year1 year2 year3, and the firm name is stored in variable firm:

*---------- begin example ----------
// stack all files
use year1
forvalues i = 2/3 {
    append using year`i'
}

// keep only the firm names
keep firm

// we only need one observation per firm
bys firm : keep if _n == 1

// create the unique id
gen firmid = _n

// save this key in a file
save idkey, replace

// add the id to each dataset
forvalues i = 1/3 {
    use year`i'
    merge 1:1 firm using idkey

    // every firm in year`i' got an id
    assert _merge != 1

    // not all firms have to appear in year `i'
    drop if _merge == 2

    // _merge is no longer necessary
    drop _merge

    // I never overwrite the original data
    // hence a new filename
    save year`i'_id, replace
}
*------------ end example ------------

The only problem with this approach is that it assumes that the variable firm contains no typos, that there are no legitimate (or illegitimate) alternative spellings and/or abbreviations, and that the firm names remained constant. In practice that is highly unlikely, so I would carefully check the idkey file before merging.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
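
The advice above to check the idkey file before merging could be started along the following lines. This is only a rough sketch, not part of Maarten's example: the variables firm_clean and ndup are hypothetical helpers introduced here, and the cleaning rules (upper case, no periods or commas, single blanks) are assumptions about what kinds of spelling variation occur in the data.

```stata
// load the key file created by the example above
use idkey, clear

// firm_clean is a hypothetical helper variable:
// start from an upper-case copy of the firm name
gen firm_clean = upper(firm)

// drop periods and commas, so "Acme Ltd." and "ACME LTD" coincide
replace firm_clean = subinstr(firm_clean, ".", "", .)
replace firm_clean = subinstr(firm_clean, ",", "", .)

// remove leading/trailing blanks and collapse repeated internal blanks
replace firm_clean = trim(itrim(firm_clean))

// ndup counts how many original spellings share one cleaned name
bysort firm_clean : gen ndup = _N

// list the suspicious groups for inspection by eye
list firmid firm firm_clean if ndup > 1, sepby(firm_clean)
```

A check like this only catches differences in case, punctuation, and spacing. Abbreviations, genuine typos, and firms that changed their name would still have to be found by reading through the list of names.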