[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"Austin Nichols" <austinnichols@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Data management problem |

Date |
Tue, 6 May 2008 16:29:55 -0400 |

David W. Harless: Here is another approach (cut and paste to Command window): clear tempfile p1 p2 input beg1 end1 p1 15522 15674 1 15887 16617 1 end ren beg1 date1 ren end1 date2 g id=1 g i=_n reshape long date, i(i) j(t1) la def t 1 "begin" 2 "end" la val t1 t format date %d save `p1' clear input beg2 end2 p2 15522 15674 1 15979 16436 1 16557 16587 1 end ren beg2 date1 ren end2 date2 g id=1 g i=_n reshape long date, i(i) j(t2) la def t 1 "begin" 2 "end" la val t2 t format date %d save `p2' joinby id date using `p1', unm(both) update drop i _m sort id date by id: g inp1=(sum(t1==1)>0)-(t1==2)>0 by id: replace inp1=0 if inp1[_n-1]==0&mi(p1) by id: g inp2=(sum(t2==1)>0)-(t2==2)>0 by id: replace inp2=0 if inp2[_n-1]==0&mi(p2) bys id (date): g enddate=date[_n+1] format enddate %d order id date enddate t1 t2 p1 p2 inp1 inp2 ren date startdate drop t1 t2 p1 p2 drop if mi(end) g david=max(inp1,2*inp2) li, noo clean > > David W. Harless > > > > > > > > I have two data sets containing dates of participation in two related > > programs, program 1 > > and program 2. But the complication is that the program 1 data set lists > > dates if the > > participant is enrolled in *either* program 1 or program 2. Dates in the > > program 2 data > > set indicate definite participation in program 2. > > > > The best explanation is an example. Here is program participation dates > > from the program > > 1 data set. (Date variables have display format %dD_m_Y and I added the > > program variable > > to make this explanation clearer): > > > > beg1 end1 program > > 01 Jul 02 30 Nov 02 1 > > 01 Jul 03 30 Jun 05 1 > > > > And the same individual for the program 2 data set: > > > > beg2 end2 program > > 01 Jul 02 30 Nov 02 2 > > 01 Oct 03 31 Dec 04 2 > > 01 May 05 31 May 05 2 > > > > > > I want to combine these records to obtain a data set that looks like: > > > > beg end program > > 01 Jul 02 30 Nov 02 2 > > 01 Dec 02 30 Jun 03 0 > > 01 Jul 03 30 Sep 03 1 > > 01 Oct 03 31 Dec 04 2 > > 01 Jan 05 30 Apr 05 1 > > 01 May 05 31 May 05 2 > > 01 Jun 05 30 Jun 05 1 > > > > (where the 0 indicates the individual did not participate in either program > > during that > > period). > > > > There are, of course, many individuals with varying dates of participation > > in one or both > > programs. Any suggestions as to how one might solve this problem? * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: Data management problem***From:*"David W. Harless" <dwharles@vcu.edu>

**References**:**Re: st: Data management problem***From:*n j cox <n.j.cox@durham.ac.uk>

**Re: st: Data management problem***From:*"Austin Nichols" <austinnichols@gmail.com>

- Prev by Date:
**st: n in probability functions** - Next by Date:
**st: RE: n in probability functions** - Previous by thread:
**Re: st: Data management problem** - Next by thread:
**Re: st: Data management problem** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |