



st: RE: Merging data sets based on a range of dates


From   "Sarah Edgington" <[email protected]>
To   <[email protected]>
Subject   st: RE: Merging data sets based on a range of dates
Date   Tue, 6 Aug 2013 11:41:21 -0700

Joe,
Which strategy you use for this depends somewhat on what you want to retain
in the final dataset.
If you only care about the cases that match between the two datasets, it's
pretty straightforward: use -joinby-, then do a date check.

Try something like this:

use surgery.dta

* Use joinby to create every pairwise combination of surgery and
* hospitalization records within each PatientID.
joinby PatientID using hospital.dta, unmatched(master)
* unmatched(master) keeps all the original surgery data, so you can see
* whether you have patient IDs that don't match any hospitalization
* (_merge==1), which would probably indicate a data problem.

* Now keep cases where the surgery date is within the hospitalization dates.
keep if SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

At the end of that, you'll want to check that you still have as many
surgeries as you expect, since anything that doesn't fall within a
hospitalization date range will have been dropped.
The matching gets more complicated if you want to keep surgeries or
hospitalizations that don't match, but if your goal is just to keep the
cases that match across the two datasets, then this should work.
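
To make the checks above concrete, here is a minimal sketch on made-up toy
data. The records, the plain numeric "dates", and the tempfile name are
assumptions for illustration only; your real files will have proper Stata
dates. It looks at _merge before the date check and compares the number of
surgeries before and after the -keep-.

```stata
* Toy hospitalization data (values are invented for illustration)
clear
input PatientID AdmissionDate DischargeDate
1 100 110
1 200 210
2 300 305
end
tempfile hospital
save `hospital'

* Toy surgery data: patient 3 has no hospitalization record at all,
* and the surgery on day 150 falls between patient 1's two stays
clear
input PatientID SurgeryDate
1 105
1 150
2 301
3 400
end

* Remember how many surgeries we start with
count
local n_before = r(N)

joinby PatientID using `hospital', unmatched(master)

* _merge==1 marks surgeries whose PatientID matched no hospitalization
tabulate _merge
list PatientID SurgeryDate if _merge==1

keep if SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

* If stays overlap, one surgery can match more than one stay,
* so check for repeated surgeries as well as the row count
duplicates report PatientID SurgeryDate
count
display "surgeries before: `n_before'   rows after: " r(N)
```

Run as a single do-file (the tempfile and local macro don't survive across
separate runs). If the count after is lower than the count before, the
-list- output plus a comparison against the original surgery file shows
which surgeries fell outside every hospitalization.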

-Sarah


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Joe Canner
Sent: Tuesday, August 06, 2013 11:15 AM
To: [email protected]
Subject: st: Merging data sets based on a range of dates

Dear Stata experts,


