Statalist The Stata Listserver



st: Re: interweave?


From   "Michael Blasnik" <michael.blasnik@verizon.net>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: interweave?
Date   Sat, 29 Apr 2006 10:22:26 -0400

If I understand your problem, you want to essentially append the using dataset's multiple records of information about each pid to each pid/fid combination in the master dataset that matches on pid? If so, I think -joinby- may be helpful:

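* strip the all-missing fid from the using data so only master's fid is carried
use usingdata, clear
drop fid
save usingdata2, replace

* reduce the master data to one record per pid/fid combination
use masterdata, clear
keep pid fid
bysort pid fid: keep if _n==1
* pair every using record with each fid that shares its pid;
* unmatched(master) also keeps pid/fid pairs with no using records (mrgpidevents==1)
joinby pid using usingdata2, unmatched(master) _merge(mrgpidevents)
* now you have multiple copies of the pid event data,
* one set for each fid associated with that pid; now append onto master
append using masterdata
sort pid fid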
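
To see what the steps above produce, here is a minimal self-contained sketch on toy data modeled on the excerpts in the original message. Several things in it are assumptions for illustration only and are not part of the original posting: the tempfile names, the from_using flag, the use of default -joinby- behavior (unmatched pairs dropped), and storing both date variables under the single name event_date so that the combined history can be sorted chronologically. It also assumes a Stata version in which date() accepts the "DMY" mask.

```stata
* toy master data: event history per pid/fid
clear
input long pid long fid str9 sdate
800000056 56     "20sep1972"
800000056 56     "03aug1999"
800000056 215891 "25oct1999"
end
generate event_date = date(sdate, "DMY")
format event_date %td
drop sdate
tempfile master
save `master'

* toy using data: event history per pid only (no fid)
clear
input long pid str9 sdate
800000056 "01oct1990"
800000056 "28jan2003"
end
generate event_date = date(sdate, "DMY")
format event_date %td
drop sdate
generate byte from_using = 1    // hypothetical flag marking the source
tempfile events
save `events'

* one record per pid/fid, then all pairwise combinations within pid
use `master', clear
keep pid fid
duplicates drop
joinby pid using `events'

* add the original master records back and interleave by date
append using `master'
replace from_using = 0 if missing(from_using)
sort pid fid event_date
list, sepby(pid fid)
```

Each fid ends up with its own copy of the pid-level events, so no missing fid values need to be filled in afterwards. If the two date variables are kept under different names, as in the real data, the final sort would have to be adapted accordingly.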

Michael Blasnik
michael.blasnik@verizon.net

----- Original Message ----- From: <clinton.thompson@summitllc.us>
To: <statalist@hsphsun2.harvard.edu>
Sent: Friday, April 28, 2006 3:53 PM
Subject: st: interweave?



hello all,
i am using SE v. 9.1 for macintosh.
i have two datasets where each contains an identifying (not necessarily
unique) variable (PID).  in the "master" dataset, this PID may contain
multiple sub-identifiers (FID), whereas in the "using" dataset only PID
exists.  each dataset contains what may be characterized as an "event
history" in -long- format w/ the event history in the "master" dataset
being FID-specific whereas in the "using" dataset it is PID-specific.  i
need to interweave, so to speak, the event history from the "using"
dataset into the event history of each FID whose PID exists in both
datasets.  briefly, a portion of the data from the "master" dataset:

 +--------------------------------+
 |       pid      fid  event_date |
 |--------------------------------|
 | 800000056       56   20sep1972 |
 | 800000056       56   03aug1999 |
 | 800000056       56   25oct1999 |
 | 800000056       56   28oct1999 |
 | 800000056       56   28mar2000 |
 | 800000056       56   05apr2001 |
 | 800000056       56   29apr2002 |
 | 800000056       56   30mar2003 |
 | 800000056       56   17nov2004 |
 |--------------------------------|
 | 800000056   215891   25oct1999 |
 | 800000056   215891   28oct1999 |
 | 800000056   215891   29sep2003 |
 | 800000056   215891   30mar2004 |
 | 800000056   215891   17nov2004 |
 | 800000056   215891   23mar2005 |
 |--------------------------------|

And from the using dataset:

 +--------------------------------+
 |       pid      fid effect_date |
 |--------------------------------|
 | 800000056        .   01 Oct 90 |
 | 800000056        .   01 Oct 94 |
 | 800000056        .   01 Oct 95 |
 | 800000056        .   28 Jan 03 |
 | 800000056        .   01 Nov 03 |
 | 800000056        .   03 Feb 06 |
 | 800000056        .   16 Feb 06 |
 +--------------------------------+


as it stands now, the "using" data is appended to the "master" data and i
can successfully expand the data to accommodate the additional number of
using records that must be incorporated into the event histories of the
"master" FIDs.  the problem i'm having, however, is in figuring out a way
to replace the missing FID values from the "using" dataset so that the
"using" event histories are incorporated into the "master" FID event
histories.

i've searched the online- and print-documentation but to little avail (the
standard -merge-, -append-, and -joinby- commands don't seem to accomplish
precisely what i need here).  i suspect this may involve some use of _n or
_N, but alas, a solution eludes me.  any suggestions??

many thanks,
clint


