Statalist



Re: st: Generating time-varying covariates in multiple spell data


From   wgould@stata.com (William Gould, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Generating time-varying covariates in multiple spell data
Date   Wed, 09 Apr 2008 09:53:50 -0500

Marjo Pyy-Martikainen <Marjo.Pyy-Martikainen@stat.fi> writes,

> thanks for your [...] solution -it works perfectly.

> A small remark. Shouldn't the lines (3) and (4) be replaced by
> the following lines:
> 
>        . stsplit bot, at(11 12 23 24 35 36 47 48) (3)
>        . gen dummy=(mod(_t,12)==0)                (4)
> 
> because isn't it the interval (11,12] that contains
> December and not the interval (12,13]?

That is, 

         . stsplit bot, at(12 13  24 25  36 37  48 49)  <- I wrote
         . stsplit bot, at(11 12  23 24  35 36  47 48)  <- Marjo suggests

The answer is that it depends on what 11, 12, and 13 mean, which is to say, 
how Marjo defined them, or how they were defined in the original data.

For Marjo to be right, think of continuous time, and define 11, 12, and 13 
as being the instant the month ends

                    |December|January |
                    |        |        | 
         -----------|--------|--------|-----------------> time
                   11       12       13

For me to be right, think of continuous time, and define 11, 12, and 13
to be the instant the month begins:


                    |November|December|
                    |        |        | 
         -----------|--------|--------|-----------------> time
                   11       12       13


When I wrote my answer, I just assumed that months in the data recorded 
the start of the month.

Most likely, 11, 12, and 13 are neither the beginnings nor the ends 
of months, but entire months.  The original data were intended to be read,
"The value of x was 5 in December", etc.  That probably corresponds to 
months being recorded as starts of months. 

But Marjo may be correct.  There is no single right answer; Marjo needs to look
back at how the month numbers were defined in the original data, and check
for any subsequent processing that might have changed that.

Marjo wants a dummy for December.  Thus, the dummy must be set to 1 
at the span that begins at the start of December and set back to 0 at the 
span that begins at the start of January.  Well, I'm being sloppy about 
how I just said that because in Stata, time spans are (t0,t1], but you 
know what I mean.  The pictures say it all; don't get hung up on the 
open/closed end points, think instead in terms of the span.

So the answer is either 


                    |December|January |
                    |        |        | 
         -----------|--------|--------|-----------------> time
                   11       12       13
                    (--------]--------------------
                        |        |
                  set dummy=1    |
                on this record   |
                                 |
                            set dummy=0
                           on this record

or



                    |November|December|January |
                    |        |        |        |
         -----------|--------|--------|--------|--------> time
                   11       12       13       14
                             (--------]--------------------
                                 |        |
                           set dummy=1    |
                         on this record   |
                                          |
                                     set dummy=0
                                    on this record
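
In Stata terms, the two pictures might be sketched as follows.  This is only
a sketch: it assumes the data are already stset with an id() variable and
with analysis time in months, and it relies on stsplit storing in bot the
lower bound of the interval into which each record falls.  Building the dummy
with inlist() on bot is one way to do it, not necessarily how lines (3) and
(4) were written.

```stata
* Assumption: data are already -stset- with id(), analysis time in months.

* First picture: 12 is the instant December ENDS, so December is (11,12].
* After the split, bot holds the lower bound of each record's interval,
* so the December records are those with bot equal to 11, 23, 35, or 47.
stsplit bot, at(11 12  23 24  35 36  47 48)
generate byte dummy = inlist(bot, 11, 23, 35, 47)

* Second picture: 12 is the instant December BEGINS, so December is (12,13].
* Use these two lines INSTEAD of the two above, not in addition to them.
* stsplit bot, at(12 13  24 25  36 37  48 49)
* generate byte dummy = inlist(bot, 12, 24, 36, 48)
```

Either way, the dummy is 1 on the record that spans December and 0 on the
records on either side of it; only the cut points move.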

-- Bill
   wgould@stata.com