Statalist The Stata Listserver

[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: st: merging spell data to episode data

From   "Steve Stillman" <>
To   <>
Subject   RE: st: merging spell data to episode data
Date   Thu, 13 Jul 2006 15:58:27 +1200

Another option depending on the size of your datasets is to expand your
spell data so you have one observation per day.  You can then merge this
directly into your episode data, keeping only the observations that
match.  Below is sample code that should accomplish this.  


gen start = date(startd, "md19y")
gen end = date(endd, "md19y")
gen length = end - start + 1

expand length
gen date2 = start
bys start: replace date2 = date2[_n-1] + 1 if _n~=1

sort date2
save newspell, replace

use episode 
gen date2 = date(date, "md19y")
sort date2
merge date2 using newspell, nokeep

-----Original Message-----
[]On Behalf Of
Sent: Thursday, July 13, 2006 6:16 AM
Subject: Re: st: merging spell data to episode data

How about using -joinby- to locate the observations where date is 
between start and end and then -merge- to pick up the unmatched spells.

For example:

input id  str10 startd  str10 endd             othervars
1 "jan1,2006"  "mar15,2006"         5
1 "mar16,2006"  "jun1,2006"         13
1 "jun2,2006"  "mar1,2007"         7
2 "feb7,2006"  "mar3,2006"        6
2 "jun4,2006"  "sep5,2006"       6

gen start = date(startd, "md19y")
gen end = date(endd, "md19y")
format start %d
format end %d

drop startd endd
sort id start
tempfile temp1
save `temp1'

input id str10 date         otherstuff
1 "feb1,2006" 7
1 "feb2,2007" 8
2 "mar3,2006" 3
gen date2 = date(date, "md19y")
format date2 %d

joinby id using `temp1'
gen tag = start if date2 <= end & date2 >=start
format tag %d
keep if tag != .
keep id date2 otherstuff tag
rename tag start
sort id start

merge id start using `temp1', 
sort id start


----- Original Message -----
From: Amanda Kowalski <akowalsk@MIT.EDU>
Date: Wednesday, July 12, 2006 9:39 am
Subject: st: merging spell data to episode data

> Statalist,
> I am trying to merge spell data to episode data.  For example, my 
> spell data
> looks something like this (with dates formatted as dates in Stata):
> id 	startd 	endd                	othervars
> 1	jan1,2006	mar15,2006        	5
> 1	mar16,2006	jun1,2006        	13
> 1	jun2,2006	mar1,2007        	7
> 2	feb7,2006	mar3,2006        	6
> 2	jun4,2006	sep5,2006        	6
> (None of the spells overlap, but there are gaps).
> And my episode data looks somthing like this:
> id	date        	otherstuff
> 1	feb1,2006	7
> 1	feb2,2007	8
> 2	mar3,2006	3
> For each episode, I'd like to merge on the data from the spell 
> that covers
> it.  Ideally, there would be a merging rule that would merge by id 
> if date
> is in the range specified by startd and endd, but I can't find any 
> commandsthat do this.
> I have searched the stata help files, the internet, and the statlist
> archives.  Do you have any suggestions?  My inelegant ideas include:
> Turn the spell data into date data by making one observation for 
> every date,
> retaining startd and endd with every observation, and then merge.  
> Thisisn't a good plan because my dataset is very large.
> Use "joinby" to merge every pairwise combination of observations 
> by id.
> Then, if date isn't between startd and endd, discard the observation.
> However, I'd prefer to retain unmatched spells and episodes, and this
> strategy would make it harder to locate them.
> Any suggestions?
> Thanks,
> Amanda

*   For searches and help try:

*   For searches and help try:

© Copyright 1996–2021 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index