From: "Sarah Edgington" <sedging@ucla.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Merging data sets based on a range of dates
Date: Tue, 6 Aug 2013 11:41:21 -0700

Joe,

What strategy you use for this depends somewhat on what you want to retain in the final dataset. If you're in a situation where you only care about the cases that match between the two datasets, it's pretty straightforward with -joinby- followed by a date check. Try something like this:

use surgery.dta
* use joinby to create every pairwise combination between the datasets within each PatientID.
joinby PatientID using hospital.dta, unmatched(master)
* unmatched(master) keeps all the original surgery data so you can see if you
* have patient ids that don't match any hospitalization, which would probably
* indicate a data problem
* now keep cases where the surgery date is within the hospitalization dates
keep if SurgeryDate>=AdmissionDate & SurgeryDate<=DischargeDate

At the end of that you'll want to check that you still have as many surgeries as you expect, since anything that doesn't match a hospitalization date range will have been dropped. The matching gets more complicated if you want to keep surgeries or hospitalizations that don't match, but if your goal is just to keep the cases that match across the two sets, then this should work.

-Sarah

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Joe Canner
Sent: Tuesday, August 06, 2013 11:15 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Merging data sets based on a range of dates

Dear Stata experts,