Re: st: RE: st: Merging 2 Tricky Panel Datasets


From   Joerg Luedicke <[email protected]>
To   [email protected]
Subject   Re: st: RE: st: Merging 2 Tricky Panel Datasets
Date   Mon, 14 Mar 2011 21:13:58 -0400

On Mon, Mar 14, 2011 at 5:29 PM, Clifton Chow
<[email protected]> wrote:

>
> A. Interview date - This is matched identically on both datasets, but the format for dataset 1 = mo/day/year and for  dataset 2 = month, day and year are broken out into separate variables.
>
> dataset 1                                  dataset 2
>
> obs 1  04 12 09                      obs 1   04/12/2009
> obs 2  12 14 10                      obs 2   12/14/2010

> B.  Interview sequence:  This is the tricky part.  Dataset 1 has a variable denoting interview sequence from 1- 9, but dataset 2 has interview sequence variable from 1 - 10, with 10 being the final interview conducted before discharge that can map on to the final interview recorded in dataset 1.
>
> Dataset 1                              Dataset 2
>
> ID        Seq             ID        Seq
> obs 1     1               obs 1      1
> obs 1     2               obs 1      2
> obs 2     1               obs 2      1
> obs 2     2               obs 2      2
> OBS 2    3               OBS2     10
>
> This means for individuals from dataset 2 without a sequence number 10, everything lines up perfectly between the two datasets (1-9).  But for those with a sequence number 10, it can map on to any possible datapoint in dataset 1, depending on which is the individual's final interview as recorded in dataset 1.
>
> Does anyone have a program (either forloop or if statement) that can handle datapoint 10 from dataset 2 so I can still successfully merge both datasets without losing significant data from individuals who were discharged (those with datapoint 10)?


RE A, type -help date- for how Stata deals with dates and times and
how you can convert from numeric into dates and vice versa. For
instance you could change the date from your dataset2 into 3 variables
as in dataset 1 and then merge accordingly.
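An alternative to splitting the date is to build a single Stata daily date in both files, so the interview date can serve as one merge key. The sketch below is only an illustration: the file names (d_string, d_parts) and variable names (intdate_str, mo, dy, yr) are hypothetical, and it assumes that all two-digit years fall in the 2000s.

```stata
* Hypothetical file: d_string.dta holds the date as a string variable
* intdate_str, e.g. "04/12/2009"
use d_string, clear
* date() converts the string to a Stata daily date
gen intdate = date(intdate_str, "MDY")
format intdate %td
save d_string_dated, replace

* Hypothetical file: d_parts.dta holds numeric month, day, year in mo, dy, yr
use d_parts, clear
* Assumption: two-digit years (09, 10) all belong to the 2000s
gen intdate = mdy(mo, dy, 2000 + yr)
format intdate %td
save d_parts_dated, replace
```

After this, intdate holds the same numeric value in both files for the same calendar day. The functions month(), day() and year() go the other way if three separate variables are preferred.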

RE B, this is probably easier, if I understand your problem correctly.
In dataset 1, you can simply replace the last observation in each
individual's sequence with 10, or in dataset 2 replace the 10 with the
previous number in the sequence plus 1. For the first, you could write
something like:

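* run in dataset 1; variable names are case sensitive, so use Seq throughout
* note: this recodes the last interview of every ID, including IDs with no 10 in dataset 2
gen seq2 = Seq
bys ID (Seq): replace seq2 = 10 if _n == _N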

For the second option it could be:

* run in dataset 2; only the discharge interview (Seq == 10) is renumbered
gen seq3 = Seq
bys ID (Seq): replace seq3 = seq3[_n-1] + 1 if Seq == 10 & _n > 1
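To show how the second option could feed into the merge itself, here is a minimal sketch. The file names (dataset1, dataset2) and the new variable names (seqm, Seq2) are hypothetical. It assumes Stata 11 or later for the -merge 1:1- syntax, that ID and the recoded sequence uniquely identify an interview in each file, and that the discharge interview in dataset 1 carries the next sequence number after the last regular interview.

```stata
* Recode the discharge interview (Seq == 10) in dataset 2
use dataset2, clear
gen seqm = Seq
bysort ID (Seq): replace seqm = seqm[_n-1] + 1 if Seq == 10 & _n > 1
* Keep the original numbering under a different name so it survives the merge
rename Seq Seq2
save dataset2_recoded, replace

* Merge on ID and the harmonized sequence number
use dataset1, clear
gen seqm = Seq
merge 1:1 ID seqm using dataset2_recoded

* Inspect how many interviews matched and how many did not
tab _merge
```

Individuals whose only interview in dataset 2 is the discharge interview keep seqm == 10 here and will show up as unmatched in _merge, so they would need a separate look.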

hth,

J.
