st: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1)

 From "Nick Cox" To Subject st: RE: merging values of one variable to to the nearest value in another dataset (stata v10.1) Date Tue, 20 Oct 2009 21:34:57 +0100

```There are various answers to this question. One is to think in terms of
-append- rather than -merge-.

Consider a strategy starting with appended files:

. l

+----------------------------+
| id   dateofv~t   dateoff~l |
|----------------------------|
1. |  1    1/1/2009             |
2. |  1   1/31/2009             |
3. |  1   2/18/2009             |
4. |  1   3/14/2009             |
5. |  1   5/30/2009             |
|----------------------------|
6. |  1   7/12/2009             |
7. |  2   3/10/2009             |
8. |  2   4/25/2009             |
9. |  2   5/20/2009             |
10. |  3   2/09/2009             |
|----------------------------|
11. |  3   6/08/2009             |
12. |  3   7/10/2009             |
13. |  1               2/22/2009 |
14. |  2               4/15/2009 |
15. |  3               7/15/2009 |
+----------------------------+

We first convert our date variables to numeric dates:

. gen visit = date(dateofvisit, "MDY")
(3 missing values generated)

. gen fill = date(dateoffill, "MDY")
(12 missing values generated)

Then combine our dates rowwise:

. gen event = max(visit, fill)

-- and get data on next and previous visits

. sort id event

. gen lastvisit = visit
(3 missing values generated)

. by id : replace lastvisit = lastvisit[_n-1] if missing(lastvisit)
(3 real changes made)

. gen nextvisit = visit
(3 missing values generated)

. by id : replace nextvisit = nextvisit[_n+1] if missing(nextvisit)
(2 real changes made)

-- and so on and so forth. So for each visit you can find the nearest
fill, and vice versa -- and keep track of the separation involved.

Nick
n.j.cox@durham.ac.uk

Ariel Linden

I have a dataset in long format which has dates when an individual
visited a
doctor. I have another dataset with dates when that patient started
taking a
medication. Naturally, the dates will not align exactly between the two
data
sets.

Is there a way of merging the files so that the date when the patient
started taking the medication is aligned (put on the same line) to the
closest date (but after) of a given doctor visit?

Example

File 1 - office visits

ID	date of visit
1	1/1/2009
1	1/31/2009
1	2/18/2009
1	3/14/2009
1	5/30/2009
1	7/12/2009
2	3/10/2009
2	4/25/2009
2	5/20/2009
3	2/09/2009
3	6/08/2009
3	7/10/2009

File 2 - medication fill date

ID 	date of first fill
1	2/22/2009
2	4/15/2009
3	7/15/2009

If I would like the medication date to be added to the office visit file
to
be matched to the last closest office visit, then ID 1 medication date
should align with 1/31/2009, ID 2 should align with 3/10/2009 and ID 3
should align with 7/10/2009.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```