Re: st: Unique identifier from a string name


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Unique identifier from a string name
Date   Thu, 24 Nov 2011 17:03:05 +0100

On Thu, Nov 24, 2011 at 4:10 PM, Barry Quinn wrote:
> The context of the problem is to build a panel from yearly data using firm names as the unique id with the -merge- command.

One solution is to first create a file with all firms, create the
unique identifier in that file, and then merge this identifier onto
each of the yearly files.

Say you have three years stored in files called year1, year2, and
year3, and the firm name is stored in the variable firm:

*---------- begin example ----------
// stack all files
use year1, clear
forvalues i = 2/3 {
    append using year`i'
}

// keep only the firm names
keep firm

// we only need one observation per firm
bys firm : keep if _n == 1

// create the unique id
gen firmid = _n

// save this key in a file
save idkey, replace

// add the id to each dataset
forvalues i = 1/3 {
    use year`i', clear
    merge 1:1 firm using idkey

    // every firm in year`i' got an id
    assert _merge != 1

    // not all firms have to appear in year `i'
    drop if _merge == 2

    // _merge is no longer necessary
    drop _merge

    // I never overwrite the original data
    // hence a new filename
    save year`i'_id, replace
}
*------------ end example ------------
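
Since the aim is a panel, the year*_id files could afterwards be
stacked into one long file. This is only a sketch: it assumes that the
yearly files do not already contain a variable called year, and the
file name panel is made up for illustration.

```stata
*---------- begin example ----------
// start with the first year and record which year it is
// (assumes there is no variable called year yet)
use year1_id, clear
gen year = 1

// stack the remaining years; newly appended observations
// have a missing year, so fill that in
forvalues i = 2/3 {
    append using year`i'_id
    replace year = `i' if missing(year)
}

// each firm should appear at most once per year
isid firmid year

// declare the panel structure
xtset firmid year

// "panel" is a hypothetical file name
save panel, replace
*------------ end example ------------
```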

The only problem with this approach is that it assumes that the
variable firm contains no typos, that there are no legitimate (or
illegitimate) alternative spellings and/or abbreviations, and that the
firm names remained constant. In practice that is highly unlikely, so
I would carefully check the idkey file before merging.
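
A rough way to start that check is to look for names that differ only
in case, punctuation, or spacing. The sketch below is not a complete
solution: the variable names clean and nspell are made up, the
cleaning rules are just a guess at common differences, and it will
not catch real typos or name changes.

```stata
*---------- begin example ----------
use idkey, clear

// normalized copy of the name: lower case,
// some punctuation replaced, extra blanks removed
gen clean = lower(firm)
replace clean = subinstr(clean, ".", " ", .)
replace clean = subinstr(clean, ",", " ", .)
replace clean = subinstr(clean, "&", " and ", .)
replace clean = itrim(trim(clean))

// idkey has one observation per spelling, so this counts
// how many spellings share the same normalized name
bys clean : gen nspell = _N

// inspect the suspicious groups by hand
list firmid firm clean if nspell > 1, sepby(clean)
*------------ end example ------------
```

Names listed together here are candidates for being the same firm;
whether they really are is something only you can decide.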

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------