Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1)


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1)
Date   Tue, 20 Oct 2009 21:34:57 +0100

There are various answers to this question. One is to think in terms of
-append- rather than -merge-. 

Consider a strategy starting with appended files: 

. l

     +----------------------------+
     | id   dateofv~t   dateoff~l |
     |----------------------------|
  1. |  1    1/1/2009             |
  2. |  1   1/31/2009             |
  3. |  1   2/18/2009             |
  4. |  1   3/14/2009             |
  5. |  1   5/30/2009             |
     |----------------------------|
  6. |  1   7/12/2009             |
  7. |  2   3/10/2009             |
  8. |  2   4/25/2009             |
  9. |  2   5/20/2009             |
 10. |  3   2/09/2009             |
     |----------------------------|
 11. |  3   6/08/2009             |
 12. |  3   7/10/2009             |
 13. |  1               2/22/2009 |
 14. |  2               4/15/2009 |
 15. |  3               7/15/2009 |
     +----------------------------+

We first convert our date variables to numeric dates: 

. gen visit = date(dateofvisit, "MDY")
(3 missing values generated)

. gen fill = date(dateoffill, "MDY")
(12 missing values generated)

Then combine our dates rowwise: 

. gen event = max(visit, fill)

-- and get data on next and previous visits 

. sort id event

. gen lastvisit = visit
(3 missing values generated)

. by id : replace lastvisit = lastvisit[_n-1] if missing(lastvisit)
(3 real changes made)

. gen nextvisit = visit
(3 missing values generated)

. by id : replace nextvisit = nextvisit[_n+1] if missing(nextvisit)
(2 real changes made)

-- and so on and so forth. So for each visit you can find the nearest
fill, and vice versa -- and keep track of the separation involved. 

Nick 
n.j.cox@durham.ac.uk 

Ariel Linden

I have a dataset in long format which has dates when an individual
visited a
doctor. I have another dataset with dates when that patient started
taking a
medication. Naturally, the dates will not align exactly between the two
data
sets. 

Is there a way of merging the files so that the date when the patient
started taking the medication is aligned (put on the same line) to the
closest date (but after) of a given doctor visit?

Example

File 1 - office visits

ID	date of visit
1	1/1/2009
1	1/31/2009
1	2/18/2009
1	3/14/2009
1	5/30/2009
1	7/12/2009
2	3/10/2009
2	4/25/2009
2	5/20/2009
3	2/09/2009
3	6/08/2009
3	7/10/2009

File 2 - medication fill date

ID 	date of first fill
1	2/22/2009
2	4/15/2009
3	7/15/2009

If I would like the medication date to be added to the office visit file
to
be matched to the last closest office visit, then ID 1 medication date
should align with 1/31/2009, ID 2 should align with 3/10/2009 and ID 3
should align with 7/10/2009. 


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index