
From: "Pyy-Martikainen Marjo" <Marjo.Pyy-Martikainen@stat.fi>

To: statalist@hsphsun2.harvard.edu

Subject: Re(2): st: Generating time-varying covariates in multiple spell data

Date: 9 Apr 2008 14:49:00 +0300

Hi Bill,

thanks for your beautiful solution, it works perfectly. A small remark.
Shouldn't the lines (3) and (4) be replaced by the following lines:

    . stsplit bot, at(11 12 23 24 35 36 47 48)                      (3)
    . gen dummy=(mod(_t,12)==0)                                     (4)

because isn't it the interval (11,12] that contains December and not the
interval (12,13]?

-Marjo

William Gould,StataCorp LP (8.4.2008 19:58):
>Marjo Pyy-Martikainen <Marjo.Pyy-Martikainen@stat.fi> writes,
>
>> I have data containing multiple spells per person. The spells are measured
>> in months. The data is in the following form:
>>
>>    PERSON   BEGIN   END     EVENT   DUR
>>    1        (  0     1 ]    1        1
>>    1        (  4    13 ]    1        9
>>    2        ( 15    25 ]    0       10
>>
>> where variables BEGIN and END are measured in calendar time (1 refers to Jan
>> 1995, 2 to Feb 1995 and so on until 60, Dec 1999).
>>
>> I stset the data in the following way:
>>
>> . stset end, failure(event) time0(begin) exit(time .) origin(time begin)
>>
>> which means I want to "set the clock to zero" at the start of each spell.
>> Now I would like to include a dummy for December months 12, 24, 36 and 48.
>> It is thus a time-varying variable getting value 1 for the December months
>> and 0 for other months. A spell may include zero, one or many December
>> months. I suppose I should use stsplit and do some kind of episode
>> splitting, but could someone help me and give me advice how I should do it
>> with my data?
>
>I have the solution. Before starting, let's look and see what Marjo has
>already done. At first I thought Marjo had made a mistake, but I was wrong.
>The -stset- command is just complicated enough that theoretical examination
>does not work well; you can check that you have the intended result by listing
>the _t0, _t, and _d variables that -stset- creates. So I entered the data
>and typed the -stset- command. Then I typed
>
>     . list _t0 _t _d
>
>          +---------------+
>          | _t0   _t   _d |
>          |---------------|
>       1. |   0    1    1 |
>       2. |   0    9    1 |
>       3. |   0   10    0 |
>          +---------------+
>
>Analysis time ranges over (0,1] and again over (0,9] for the first person.
>That's what I thought would happen, and it looks like an error, but notice
>that Marjo said, "which means I want to set the clock to zero at the start of
>each spell". Okay, the command works exactly as Marjo said it would.
>
>Marjo now wants to add a dummy variable equal to 1 every December.
>Without explanation (that's coming), here's the solution:
>
>     . gen recid = _n                                               (1)
>
>     . stset end, id(person) failure(event) enter(begin) ///        (2)
>             exit(time .) time0(begin)
>     . stsplit bot, at(12 13 24 25 36 37 48 49)                     (3)
>     . gen dummy = ( mod(bot,12)==0 & bot!=0 )                      (4)
>
>     . stset end, id(recid) failure(event) time0(begin) ///         (5)
>             exit(time .) origin(time begin)
>
>I admit that the entire solution did not occur to me at the outset. In fact, I
>went back and added the first line at the end, and modified the fifth. Here is
>what did occur to me: We will have to use -stsplit-. -stsplit- wants to split
>on analysis time, so we will have first to -stset- our data based on calendar
>time, then -stsplit- the data, and finally we can -stset- our data the way we
>really want it. The preliminary -stset- would allow us to generate the
>dummy variable for December.
>
>So let me explain.
>Ignore line (1); remember, it didn't even occur to me until later.
>
>Line (2) was the first line I wrote. It seemed the right way to -stset-
>the data based on calendar time. I didn't get the command right the
>first time, but after typing (2), I listed the data, saw what was wrong,
>and eventually got (2) to work just as I wanted it. (What was wrong is that I
>forgot exit(time .) because this data, it turned out, had to be treated as
>multiple-failure data at this step.) When I say listed the data, what I do is
>list _t0, _t, and _d, so I can see the time variables and outcome that will be
>used in analysis. Here's what the data looked like after (2):
>
>     . list person _t0 _t _d
>
>          +------------------------+
>          | person   _t0   _t   _d |
>          |------------------------|
>       1. |      1     0    1    1 |
>       2. |      1     4   13    1 |
>       3. |      2    15   25    0 |
>          +------------------------+
>
>Perfect; _t0 and _t correspond to the original month variables.
>Now we can -stsplit-. We need to set the dummy to 1 for months 12, 24, 36,
>and 48, which means we need to set it back to 0 for months 13, 25, 37, and
>49. So I -stsplit- the data at 12, 13, 24, 25, 36, 37, 48, and 49 and
>created the dummy variable. I checked results after executing commands (3)
>and (4):
>
>     . list person dummy _t0 _t _d
>
>          +--------------------------------+
>          | person   dummy   _t0   _t   _d |
>          |--------------------------------|
>       1. |      1       0     0    1    1 |
>       2. |      1       0     4   12    0 |
>       3. |      1       1    12   13    1 |
>       4. |      2       0    15   24    0 |
>       5. |      2       1    24   25    0 |
>          +--------------------------------+
>
>Actually, I checked results after command (3), and I created the dummy
>more inefficiently (using two commands) on my first take, but that's
>irrelevant. We have what we want in terms of how the data are split.
>Now we need to reset analysis time to be as we really want it. So first,
>I just typed the original -stset- command Marjo supplied,
>
>> . stset end, failure(event) time0(begin) exit(time .) origin(time begin)
>
>I listed the data, but that didn't work. What I found was that
>the original second record, calendar time (4,13] and desired analysis time
>(0,9], was now itself split into two parts, and analysis time got reset
>on the second part. Well, of course. Marjo was treating this data as
>single-record survival data, but after the -stsplit-, what was single-record
>data was no longer. So I went back and added command (1), and then
>I could set what were (but are no longer) single records by specifying
>id(recid). That worked. Here was the final result:
>
>     . list person dummy _t0 _t _d
>
>          +--------------------------------+
>          | person   dummy   _t0   _t   _d |
>          |--------------------------------|
>       1. |      1       0     0    1    1 |
>       2. |      1       0     4   12    0 |
>       3. |      1       1    12   13    1 |
>       4. |      2       0    15   24    0 |
>       5. |      2       1    24   25    0 |
>          +--------------------------------+
>
>I think that's what Marjo wants.
>
>I admit that this was a conceptually difficult problem, so let me emphasize
>two things: First, to achieve a desired result, you can -stset- the data one
>way, and then later -stset- the data differently for analysis. That was the
>insight that had not occurred to Marjo. It is a trick worth remembering
>whenever working with data where you want some variables defined on one
>time scale (say months) and others on another (say analysis time).
>-stset- based on calendar months, create what you want, and then -stset-
>the data the way you really want it.
>
>The rest was just work. I admit that I seldom get an -stset- command
>right the first time. My technique is to guess and list. Looking at
>the result, I go back and improve my guess, and eventually I get it
>right.
>
>-- Bill
>wgould@stata.com
>*
>*   For searches and help try:
>*   http://www.stata.com/support/faqs/res/findit.html
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/
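For readers who want to replay the guess-and-list workflow described in the thread, the commands can be collected into one do-file. This is only a sketch under stated assumptions: the three records are re-entered by hand from Marjo's example (with person 2's spell taken as (15,25], as in Bill's listing), the variable names `person`, `begin`, `end`, and `event` are lower-case stand-ins for her columns, and `enter(time begin)` uses the documented `enter(time exp)` form where the thread wrote `enter(begin)`. No output is shown because it has not been rerun here.

```stata
* Sketch: replay the stset / stsplit / stset sequence from the thread.
* Assumption: data re-entered by hand from the three example spells.
clear
input person begin end event
1  0  1 1
1  4 13 1
2 15 25 0
end

* (1) Remember the original spell each record belongs to.
gen recid = _n

* (2) First stset: calendar time, one subject per person.
*     Assumption: enter(time begin) stands in for the thread's enter(begin).
stset end, id(person) failure(event) enter(time begin) ///
        exit(time .) time0(begin)
list person _t0 _t _d

* (3)-(4) Split at the cutpoints Bill used and flag the December episodes.
stsplit bot, at(12 13 24 25 36 37 48 49)
gen dummy = ( mod(bot,12)==0 & bot!=0 )
list person dummy _t0 _t _d

* (5) Second stset: analysis time restarts at the beginning of each
*     original spell, identified by recid.
stset end, id(recid) failure(event) time0(begin) ///
        exit(time .) origin(time begin)
list person recid dummy _t0 _t _d
```

To compare against the variant Marjo proposes, swap lines (3) and (4) for her `stsplit bot, at(11 12 23 24 35 36 47 48)` and `gen dummy=(mod(_t,12)==0)`, run the second of these before the final `stset` (so `_t` is still calendar time), and compare the listings.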

**References**: **Re: st: Generating time-varying covariates in a multiple spell data** *From:* wgould@stata.com (William Gould, StataCorp LP)
