Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: RE: RE: Merging data sets based on a range of dates


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   st: RE: RE: Merging data sets based on a range of dates
Date   Tue, 6 Aug 2013 19:14:36 +0000

Sarah,

Thanks very much; I think that might do the trick.  I had considered -joinby- but assumed that "all possible combinations" would generate an unwieldy number of cases.  However, it wasn't so bad.  I have used -joinby- in the past, but it never seemed to do what I expected it to do.  I think the use of "unmatched" might be the key to success.

Cheers,
Joe

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Sarah Edgington
Sent: Tuesday, August 06, 2013 2:41 PM
To: [email protected]
Subject: st: RE: Merging data sets based on a range of dates

Joe,
What strategy you use for this depends somewhat on what you want to retain in the final dataset.
If you're in a situation where you only care about the cases that match between the two datasets it's pretty straightforward with -joinby- followed by a date check.  

Try something like this:

use surgery.dta

*use joinby to create every pairwise combination between the datasets.  
joinby PatientID using hospital.dta , unmatched(master)
*unmatched(master) keeps all the original surgery data so you can see if you have patient ids that don't match any hospitalization, which would probably indicate a data problem

*now keep cases where the surgery date is within the hospitalization dates
keep if SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

At the end of that you'll want to check that you still have as many surgeries as you expect since anything that doesn't match a hospitalization date range will have been dropped.
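
One way to make that check concrete is sketched below. It is only an illustration: the toy values, the plain-number dates, the tempfile standing in for hospital.dta, and the SurgeryID key are all assumptions, not part of the real data.

```stata
* Toy stand-in for hospital.dta (hypothetical values; dates shown as plain numbers)
clear
input PatientID AdmissionDate DischargeDate
1 100 110
1 200 205
2 150 160
end
tempfile hospital
save `hospital'

* Toy stand-in for surgery.dta; SurgeryID is an assumed unique surgery identifier
clear
input SurgeryID PatientID SurgeryDate
1 1 105
2 1 300
3 2 155
4 3 120
end

* remember how many surgeries we started with
count
local n_before = r(N)

* same steps as above: all pairwise combinations, then the date check
joinby PatientID using `hospital', unmatched(master)
keep if SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

* each surviving surgery should appear exactly once
isid SurgeryID

* compare the number of surgeries before and after
count
display "surgeries before: `n_before'; matched after: " r(N)
```

If -isid- fails, some surgery fell inside more than one hospitalization, which would be worth looking at before going further.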
The matching gets more complicated if you want to keep surgeries or hospitalizations that don't match, but if your goal is just to keep the cases that match across the two sets, then this should work.
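
If you do need to keep surgeries that fall outside every hospitalization, one possible approach (a sketch only, not the only way) is to flag the in-range pairs and keep a single blanked-out row for each surgery without one. The toy data, the SurgeryID key, and the helper variables hit and anyhit are assumptions made up for the illustration.

```stata
* Toy stand-in for hospital.dta (hypothetical values; dates shown as plain numbers)
clear
input PatientID AdmissionDate DischargeDate
1 100 110
1 200 205
2 150 160
end
tempfile hospital
save `hospital'

* Toy stand-in for surgery.dta; SurgeryID is an assumed unique surgery identifier
clear
input SurgeryID PatientID SurgeryDate
1 1 105
2 1 300
3 2 155
4 3 120
end

joinby PatientID using `hospital', unmatched(master)

* hit = 1 when the surgery date falls inside this hospitalization
gen byte hit = SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

* anyhit = 1 when the surgery falls inside at least one hospitalization
bysort SurgeryID: egen anyhit = max(hit)

* drop the non-matching pairs only for surgeries that do have a match
keep if hit | !anyhit

* for surgeries with no match, blank the hospital dates and keep one row
replace AdmissionDate = . if !hit
replace DischargeDate = . if !hit
bysort SurgeryID: keep if hit | _n == 1

* every surgery should now appear exactly once
isid SurgeryID
count if hit
```

Here every surgery survives, and hit marks the ones that were matched to a hospitalization. Keeping unmatched hospitalizations as well would need unmatched(both) and some extra handling that is not shown here.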

-Sarah


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Joe Canner
Sent: Tuesday, August 06, 2013 11:15 AM
To: [email protected]
Subject: st: Merging data sets based on a range of dates

Dear Stata experts,

