[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"David W. Harless" <dwharles@vcu.edu> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Data management problem |

Date |
Wed, 07 May 2008 12:55:50 -0400 |

Austin,

Thanks so much for your solution -- it suits the problem perfectly. The idea of (somehow) using -joinby- had occurred to me, but I made no headway as I was stuck thinking of my data in the original structure (without the -reshape long-).

Again, my great thanks.

Dave Harless

Austin Nichols wrote:

David W. Harless: Here is another approach (cut and paste to Command window): clear tempfile p1 p2 input beg1 end1 p1 15522 15674 1 15887 16617 1 end ren beg1 date1 ren end1 date2 g id=1 g i=_n reshape long date, i(i) j(t1) la def t 1 "begin" 2 "end" la val t1 t format date %d save `p1' clear input beg2 end2 p2 15522 15674 1 15979 16436 1 16557 16587 1 end ren beg2 date1 ren end2 date2 g id=1 g i=_n reshape long date, i(i) j(t2) la def t 1 "begin" 2 "end" la val t2 t format date %d save `p2' joinby id date using `p1', unm(both) update drop i _m sort id date by id: g inp1=(sum(t1==1)>0)-(t1==2)>0 by id: replace inp1=0 if inp1[_n-1]==0&mi(p1) by id: g inp2=(sum(t2==1)>0)-(t2==2)>0 by id: replace inp2=0 if inp2[_n-1]==0&mi(p2) bys id (date): g enddate=date[_n+1] format enddate %d order id date enddate t1 t2 p1 p2 inp1 inp2 ren date startdate drop t1 t2 p1 p2 drop if mi(end) g david=max(inp1,2*inp2) li, noo cleanDavid W. Harless I have two data sets containing dates of participation in two related programs, program 1 and program 2. But the complication is that the program 1 data set lists dates if the participant is enrolled in *either* program 1 or program 2. Dates in the program 2 data set indicate definite participation in program 2. The best explanation is an example. Here is program participation dates from the program 1 data set. (Date variables have display format %dD_m_Y and I added the program variable to make this explanation clearer): beg1 end1 program 01 Jul 02 30 Nov 02 1 01 Jul 03 30 Jun 05 1 And the same individual for the program 2 data set: beg2 end2 program 01 Jul 02 30 Nov 02 2 01 Oct 03 31 Dec 04 2 01 May 05 31 May 05 2 I want to combine these records to obtain a data set that looks like: beg end program 01 Jul 02 30 Nov 02 2 01 Dec 02 30 Jun 03 0 01 Jul 03 30 Sep 03 1 01 Oct 03 31 Dec 04 2 01 Jan 05 30 Apr 05 1 01 May 05 31 May 05 2 01 Jun 05 30 Jun 05 1 (where the 0 indicates the individual did not participate in either program during that period). There are, of course, many individuals with varying dates of participation in one or both programs. Any suggestions as to how one might solve this problem?* * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

* * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**References**:**Re: st: Data management problem***From:*n j cox <n.j.cox@durham.ac.uk>

**Re: st: Data management problem***From:*"Austin Nichols" <austinnichols@gmail.com>

**Re: st: Data management problem***From:*"Austin Nichols" <austinnichols@gmail.com>

- Prev by Date:
**st: Stochastic Production Frontier and Serial Correlation** - Next by Date:
**st: RE: Stochastic Production Frontier and Serial Correlation** - Previous by thread:
**Re: st: Data management problem** - Next by thread:
**st: Renaming File Names** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |