Statalist



Re(2): st: Generating time-varying covariates in multiple spell data


From   "Pyy-Martikainen Marjo" <Marjo.Pyy-Martikainen@stat.fi>
To   statalist@hsphsun2.harvard.edu
Subject   Re(2): st: Generating time-varying covariates in multiple spell data
Date   9 Apr 2008 14:49:00 +0300

Hi Bill,

Thanks for your beautiful solution; it works perfectly.

A small remark. Shouldn't the lines (3) and (4) be replaced by
the following lines:

. stsplit bot, at(11 12 23 24 35 36 47 48) (3)
. gen dummy=(mod(_t,12)==0)                (4)

since it is the interval (11,12], and not the interval (12,13],
that contains December?
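
A minimal sketch of how this variant could be checked on the three example spells. It is an illustration under assumptions, not a tested replacement: the -input- block retypes the example data, and the calendar-time -stset- leaves out enter() on the assumption that time0(begin) alone supplies each record's entry time.

```stata
* Sketch only: the three example spells, split so that (11,12] is December.
clear
input person begin end event
1  0  1 1
1  4 13 1
2 15 25 0
end

* Calendar-time setup; enter() is omitted here (assumption: time0(begin)
* alone gives the entry time of each record).
stset end, id(person) failure(event) exit(time .) time0(begin)

* Cut at both edges of each December interval: (11,12], (23,24], ...
stsplit bot, at(11 12 23 24 35 36 47 48)

* A record counts as December when it ends at a multiple of 12.
* Caution: an unsplit record ending at month 60 would also be flagged.
generate byte dummy = (mod(_t, 12) == 0)

list person dummy _t0 _t _d, sepby(person)
```

With these cut points the spell (4,13] becomes (4,11], (11,12], (12,13], and only the middle piece gets dummy equal to 1.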

-Marjo




William Gould,StataCorp LP  (8.4.2008  19:58):
>Marjo Pyy-Martikainen <Marjo.Pyy-Martikainen@stat.fi> writes,
>
>> I have data containing multiple spells per person. The spells are measured
>> in months.  The data is in the following form:
>>
>>       PERSON    BEGIN   END     EVENT     DUR
>>            1   (  0      1 ]        1       1
>>            1   (  4     13 ]        1       9
>>            2   ( 15     25 ]        0      10
>>
>> where variables BEGIN and END are measured in calendar time (1 refers to Jan
>> 1995, 2 to Feb 1995 and so on until 60, Dec 1999).
>>
>> I stset the data in the following way:
>>
>>   . stset end, failure(event) time0(begin) exit(time .) origin(time begin)
>>
>> which means I want to "set the clock to zero" at the start of each spell.
>> Now I would like to include a dummy for December months 12, 24, 36 and 48.
>> It is thus a time-varying variable getting value 1 for the December months
>> and 0 for other months. A spell may include zero, one or many December
>> months.  I suppose I should use stsplit and do some kind of episode
>> splitting, but could someone help me and give me advice how I should do it
>> with my data?
>
>I have the solution.  Before starting, let's look and see what Marjo has
>already done.  At first I thought Marjo had made a mistake, but I was wrong.
>The -stset- command is just complicated enough that theoretical examination
>does not work well; you can check that you have the intended result by listing
>the _t0, _t, and _d variables that -stset- creates.  So I entered the data
>and typed the -stset- command.  Then I typed
>
>        . list _t0 _t _d
>
>             +---------------+
>             | _t0   _t   _d |
>             |---------------|
>          1. |   0    1    1 |
>          2. |   0    9    1 |
>          3. |   0   10    0 |
>             +---------------+
>
>Analysis time ranges over (0,1] and again over (0,9] for the first person.
>That's what I thought would happen, and it looks like an error, but notice
>that Marjo said, "which means I want to set the clock to zero at the start of
>each spell".  Okay, the command works exactly as Marjo said it would.
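
The check described above can be reproduced with a short sketch. The -input- block simply retypes the three example records, and the two -assert- lines are an addition that states the expected result in code: every record starts at analysis time 0 and ends at its own duration.

```stata
* Sketch only: enter the example data and inspect what -stset- creates.
clear
input person begin end event dur
1  0  1 1  1
1  4 13 1  9
2 15 25 0 10
end

* The original command: no id(), so each record is its own subject and
* origin(time begin) restarts the clock at the start of each spell.
stset end, failure(event) time0(begin) exit(time .) origin(time begin)

* Guess and list: look at the variables the analysis will actually use.
list person begin end _t0 _t _d

* Added checks (not in the original post).
assert _t0 == 0
assert _t == dur
```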
>
>Marjo now wants to add a dummy variable equal to 1 every December.
>Without explanation (that's coming), here's the solution:
>
>        . gen recid = _n                                            (1)
>
>        . stset end, id(person) failure(event) enter(begin) ///     (2)
>                     exit(time .) time0(begin)
>        . stsplit bot, at(12 13  24 25  36 37  48 49)               (3)
>        . gen dummy = ( mod(bot,12)==0 & bot!=0 )                   (4)
>
>        . stset end, id(recid) failure(event) time0(begin) ///      (5)
>                     exit(time .) origin(time begin)
>
>I admit that the entire solution did not occur to me at the outset.  In fact, I
>went back and added the first line at the end, and modified the fifth.  Here is
>what did occur to me:  We will have to use -stsplit-.  -stsplit- wants to split
>on analysis time, so we will first have to -stset- our data based on calendar
>time, then -stsplit- the data, and finally we can -stset- our data the way we
>really want it.  The preliminary -stset- would allow us to generate the
>dummy variable for December.
>
>So let me explain.
>Ignore line (1); remember, it didn't even occur to me until later.
>
>Line (2) was the first line I wrote.  It seemed the right way to -stset-
>the data based on calendar time.  I didn't get the command right the
>first time, but after typing (2), I listed the data, saw what was wrong,
>and eventually got (2) to work just as I wanted it.  (What was wrong is that I
>forgot exit(time .), because this data, it turned out, had to be treated as
>multiple-failure data at this step.)  When I say I listed the data, what I do is
>list _t0, _t, and _d, so I can see the time variables and outcome that will be
>used in analysis.  Here's what the data looked like after (2):
>
>        . list person _t0 _t _d
>
>             +------------------------+
>             | person   _t0   _t   _d |
>             |------------------------|
>          1. |      1     0    1    1 |
>          2. |      1     4   13    1 |
>          3. |      2    15   25    0 |
>             +------------------------+
>
>Perfect; _t0 and _t correspond to the original month variables.
>Now we can -stsplit-.  We need to set the dummy to 1 for months 12, 24, 36,
>and 48, which means we need to set it back to 0 for months 13, 25, 37, and
>49.  So I -stsplit- the data as 12, 13, 24, 25, 36, 37, 48, and 49 and
>created the dummy variable.  I checked results after executing commands (3)
>and (4):
>
>        . list person dummy _t0 _t _d
>
>             +--------------------------------+
>             | person   dummy   _t0   _t   _d |
>             |--------------------------------|
>          1. |      1       0     0    1    1 |
>          2. |      1       0     4   12    0 |
>          3. |      1       1    12   13    1 |
>          4. |      2       0    15   24    0 |
>          5. |      2       1    24   25    0 |
>             +--------------------------------+
>
>Actually, I checked results after command (3), and I created the dummy
>more inefficiently (using two commands) on my first take, but that's
>irrelevant.  We have what we want in terms of how the data are split.
>Now we need to reset analysis time to be as we really want it.  So first,
>I just typed the original -stset- command Marjo supplied,
>
>>   . stset end, failure(event) time0(begin) exit(time .) origin(time begin)
>
>I listed the data and saw that this didn't work.  What I found was that
>the original second record, with calendar time (4,13] and desired analysis
>time (0,9], was now itself split into two parts, and analysis time got reset
>on the second part.  Well, of course.  Marjo was treating this data as
>single-record survival data, but after the -stsplit-, what was single-record
>data was no longer.  So I went back and added command (1), and then
>I could identify what were (but are no longer) single records by specifying
>id(recid).  That worked.  Here was the final result:
>
>        . list person dummy _t0 _t _d
>
>             +--------------------------------+
>             | person   dummy   _t0   _t   _d |
>             |--------------------------------|
>          1. |      1       0     0    1    1 |
>          2. |      1       0     0    8    0 |
>          3. |      1       1     8    9    1 |
>          4. |      2       0     0    9    0 |
>          5. |      2       1     9   10    0 |
>             +--------------------------------+
>
>I think that's what Marjo wants.
>
>I admit that this was a conceptually difficult problem, so let me emphasize
>two things:  First, to achieve a desired result, you can -stset- the data one
>way, and then later -stset- the data differently for analysis.  That was the
>insight that had not occurred to Marjo.  It is a trick worth remembering
>whenever working with data where you want some variables defined on one
>time scale (say months) and others on another (say analysis time).
>-stset- based on calendar months, create what you want, and then -stset-
>the data the way you really want it.
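
The two-step idea can be put together as one sketch on the three example spells. This is an illustration under assumptions rather than the exact commands above: the -input- block retypes the example data, and the calendar-time -stset- leaves out enter() on the assumption that time0(begin) alone supplies each record's entry time.

```stata
* Sketch only: -stset- on calendar time, split, then -stset- for analysis.
clear
input person begin end event
1  0  1 1
1  4 13 1
2 15 25 0
end

* One id per original spell, created before any splitting.
generate long recid = _n

* Step 1: calendar time, so that -stsplit- cuts at calendar months.
stset end, id(person) failure(event) exit(time .) time0(begin)
stsplit bot, at(12 13  24 25  36 37  48 49)
generate byte dummy = (mod(bot, 12) == 0 & bot != 0)

* Step 2: analysis time, with the clock restarting at each original spell.
stset end, id(recid) failure(event) time0(begin) exit(time .) origin(time begin)

* Guess and list: confirm the time variables look as intended.
list person recid dummy _t0 _t _d, sepby(recid)
```

Creating recid before the split is what lets the second -stset- treat the pieces of one spell as a single subject.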
>
>The rest was just work.  I admit that I seldom get an -stset- command
>right the first time.  My technique is to guess and list.  Looking at
>the result, I go back and improve my guess, and eventually I get it
>right.
>
>-- Bill
>wgould@stata.com
>*
>*   For searches and help try:
>*   http://www.stata.com/support/faqs/res/findit.html
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/





