



Re: st: Create common ID # from several changing ID #s


From   Robert Picard <picard@netbox.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Create common ID # from several changing ID #s
Date   Fri, 3 Dec 2010 16:34:40 -0500

Andrea,

Here's a way to do this using the -group_id- program available from SSC. Type:

ssc d group_id

for more information. Hope this helps,

Robert

*--------------------------- begin example -----------------------
version 11
clear
input long (current_id other_id1 other_id2)
4001 . .
4002 406434127 .
406434166 406434127 .
406434127 406434166 406414087
406414087 406384157 .
406384157 . .
4071 . .
4072 4071 .
4073 . .
end

* Assign a new identifier and save the master list of providers
sort current_id
gen newid = _n
list, noobs sep(0)
tempfile fmain
save "`fmain'"

* Duplicate records when there are alternate identifiers
keep newid other_id1
rename other_id1 current_id
keep if !missing(current_id)
tempfile alt1
save "`alt1'"

use "`fmain'", clear
keep newid other_id2
rename other_id2 current_id
keep if !missing(current_id)
append using "`alt1'"
append using "`fmain'"
keep newid current_id
sort newid current_id
list, sepby(newid)

* Group newid when current_id matches
group_id newid, matchby(current_id)
sort newid
list, sepby(newid)

* Remove duplicates and update the initial data with new codes
sort current_id newid
by current_id newid: keep if _n == 1
merge 1:1 current_id using "`fmain'", assert(match) nogen

list, noobs sep(0)

*--------------------------- end example -----------------------
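
For readers who cannot install -group_id-, the grouping step in the example above can be approximated with built-in commands. What follows is only a sketch under stated assumptions, not a description of how -group_id- works internally. It reshapes the identifiers to one row per (record, identifier) pair, then repeatedly hands every record the smallest group label found among records that share any identifier, stopping when no label changes. The names orig, src, id, grp, g1 and g2 are made up for this illustration, and the input is a shortened copy of the data from the example.

```stata
version 11
clear
input long (current_id other_id1 other_id2)
406434166 406434127 .
406434127 406434166 406414087
406414087 406384157 .
406384157 . .
4071 . .
4072 4071 .
end

* One row per (record, identifier) pair; src==0 marks the current_id
gen long orig = _n
rename current_id id0
rename other_id1 id1
rename other_id2 id2
reshape long id, i(orig) j(src)
drop if missing(id)

* Start with each record in its own group, then propagate the
* smallest label across shared identifiers until nothing changes
gen long grp = orig
local changed 1
while `changed' {
    * smallest label among all rows carrying the same identifier
    bysort id: egen long g1 = min(grp)
    * smallest label among all identifiers of the same record
    bysort orig: egen long g2 = min(g1)
    quietly count if g2 != grp
    local changed = r(N)
    quietly replace grp = g2
    drop g1 g2
}

* Back to one row per original record
keep if src == 0
rename id current_id
keep orig current_id grp
sort orig
list, noobs sepby(grp)
```

With this input the first four records end up sharing one value of grp and the last two share another. The loop needs one pass per link in the longest chain of identifiers, so it can be slow on large files with long chains; -group_id- is the more convenient choice where it is available.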


On Fri, Dec 3, 2010 at 1:24 PM, Andrea Drechsler
<andreamm@wharton.upenn.edu> wrote:
> Hello,
>
> I have a dataset of medical providers at the provider-year level.
> Each provider has an ID #.  If the provider moves to a new location,
> they are assigned a new ID #.  I have two other variables that
> sometimes contain providers' previous or future ID #s.  I'm trying to
> create a common ID # for each provider so I can track them over time.
>
> For example, I'd like to assign all of the following observations a common ID #:
>
> current_id              other_id1               other_id2
> 406434166               406434127
> 406434127               406434166               406414087
> 406414087               406384157
> 406384157
>
> Thanks very much in advance!
>
> ~ Andrea Drechsler
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


