Statalist



Re: st: Data management problem


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Data management problem
Date   Tue, 6 May 2008 15:37:41 -0400

David W. Harless:
Assuming you have an identifier variable, you can just -append- the
files, sort by id and begin date, and create new records for any
gaps, unless I am misunderstanding the problem.  If you have
overlapping periods in program 1 and 2, I suggest you make new
indicator variables p1 and p2 and then make new records for periods
where p1==1 and p2==1.  You might also prefer

         beg         end   program
   01 Jul 02   30 Nov 02         2
   30 Nov 02   01 Jul 03         0
   01 Jul 03   30 Sep 03         1

to

         beg         end   program
   01 Jul 02   30 Nov 02         2
   01 Dec 02   30 Jun 03         0
   01 Jul 03   30 Sep 03         1

depending on what you plan to do with the file...
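
A minimal sketch of the steps above, under stated assumptions: the file
names prog1.dta and prog2.dta and the identifier variable id are
hypothetical, and the date variables are beg1/end1 and beg2/end2 as in
the example below.  To keep the overlap logic short it expands every
spell to daily records and then collapses back to spells, which is a
simplification chosen here, not necessarily the most memory-efficient
route.

```stata
* Stack the two files and flag the source of each spell
use prog1, clear
rename beg1 beg
rename end1 end
gen byte p1 = 1
append using prog2
replace beg = beg2 if missing(beg)
replace end = end2 if missing(end)
gen byte p2 = missing(p1)          // 1 for records that came from prog2
replace p1 = 0 if missing(p1)
drop beg2 end2

* One record per day of each spell
gen long spell = _n
gen long ndays = end - beg + 1
expand ndays
bysort spell: gen long day = beg + _n - 1

* One record per person-day; program 2 wins wherever p2==1
collapse (max) p1 p2, by(id day)
gen byte program = cond(p2 == 1, 2, 1)

* Collapse runs of consecutive days in the same program back to spells
bysort id (day): gen long run = ///
    sum(_n == 1 | program != program[_n-1] | day != day[_n-1] + 1)
collapse (min) beg = day (max) end = day, by(id run program)

* Create new records (program 0) for any gaps between spells
bysort id (beg): gen long prevend = end[_n-1]
expand 2 if prevend < beg - 1, generate(filler)
replace end = beg - 1 if filler
replace beg = prevend + 1 if filler
replace program = 0 if filler

format beg end %dD_m_Y
sort id beg
drop run prevend filler
list, sepby(id)
```

This produces gap records in the second layout shown above (the gap
starts the day after one spell ends and stops the day before the next
begins).
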
On Tue, May 6, 2008 at 3:19 PM, n j cox <n.j.cox@durham.ac.uk> wrote:
> No suggestions from me, just a question.
>
> How do you know that the examples refer to the same individual?
> Are there identifier variable(s) too?
>
> Nick
> n.j.cox@durham.ac.uk
>
> David W. Harless wrote:
>
> I have two data sets containing dates of participation in two related
> programs, program 1 and program 2.  But the complication is that the
> program 1 data set lists dates if the participant is enrolled in
> *either* program 1 or program 2.  Dates in the program 2 data set
> indicate definite participation in program 2.
>
> The best explanation is an example.  Here are program participation
> dates from the program 1 data set.  (Date variables have display format
> %dD_m_Y and I added the program variable to make this explanation
> clearer):
>
>          beg1        end1   program
>     01 Jul 02   30 Nov 02         1
>     01 Jul 03   30 Jun 05         1
>
> And here is the same individual in the program 2 data set:
>
>          beg2        end2   program
>     01 Jul 02   30 Nov 02         2
>     01 Oct 03   31 Dec 04         2
>     01 May 05   31 May 05         2
>
>
> I want to combine these records to obtain a data set that looks like:
>
>           beg         end   program
>     01 Jul 02   30 Nov 02         2
>     01 Dec 02   30 Jun 03         0
>     01 Jul 03   30 Sep 03         1
>     01 Oct 03   31 Dec 04         2
>     01 Jan 05   30 Apr 05         1
>     01 May 05   31 May 05         2
>     01 Jun 05   30 Jun 05         1
>
> (where the 0 indicates the individual did not participate in either
> program during that period).
>
> There are, of course, many individuals with varying dates of
> participation in one or both programs.  Any suggestions as to how one
> might solve this problem?
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



© Copyright 1996–2014 StataCorp LP