
Re: st: grouped duration-discrete time survival analysis-WAS stset...


From   "Stephen P. Jenkins" <stephenj@essex.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: grouped duration-discrete time survival analysis-WAS stset...
Date   Mon, 2 Dec 2002 12:45:00 +0000 (GMT Standard Time)

On Mon, 2 Dec 2002 04:24:09 -0800 (PST) Enrica Croda 
<croda@nicco.sscnet.ucla.edu> wrote:

> On Mon, 2 Dec 2002, Stephen P. Jenkins wrote:
> 
> > On Sun, 1 Dec 2002 03:07:48 -0800 (PST) Enrica Croda
> > <croda@nicco.sscnet.ucla.edu> wrote:
> >
> > <snip>
> >
> > > So, to recap, I now believe my data are grouped duration data...
> > > I understand that in this case I need to organize my data in the so-called
> > > "person-period" form.
> > > I would appreciate getting feedback on the following:
> > > My data are already organized by ID and year in "long" panel data
> > > form (iis ID, tis year) with year = 1984, 1985,...1998.
> > > A. Do I need to -expand- the data set?
> > > I am thinking I just need to generate the analysis time
> > > variable, with something like:
> > > (A1)	by ID: generate TIME = _n;
> > > please see also question B, below.
> > > B. How do I deal with delayed entry?
> > > Assuming people first become at risk of not living independently at age 65,
> > > which may not be the age at which they are first observed in my data,
> > > how do I incorporate this information in my analysis?
> >
> 
> > Suppose first that there is no delayed entry -- in which case you would
> > need a row in the data set corresponding to each year that each person
> > was /at risk of experiencing the event of interest/. If you were to
> > assume the first year at risk corresponds to age 65, you need rows for
> > each person for each year corresponding to age 65+. As the first survey
> > year (1984 in GSOEP) is after age 65 for most persons, then you
> > would need to create new rows in the data corresponding to those ages
> > before the beginning of the survey. The TIME variable starts with 1 for
> > age 65, then 2 for age 66, and so on. [You would also need to 'spread'
> > values for explanatory variables back onto these new person-year obs.]
> > -expand- could probably be used to create the required data structure,
> > making use of the -if- qualifier to ensure that the correct number of
> > new person-year observations gets generated for each person. (As the
> > respondents were of different ages in 1984, the number of new data rows
> > will differ from person to person.)
> >
> 
> Ideally, I would like to use some time-varying variables (e.g. income)
> in the analysis. What would be the appropriate thing to do for these
> variables when I 'spread' them?
 
You would have to create the appropriate values. Of course the fact 
that those new person-year observations are before the start of the 
panel may constrain what you are able to create.  But in fact if you 
make the delayed-entry 'correction' as discussed then the TVCs for 
pre-panel years are not needed.
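
To make the general principle concrete, here is a minimal sketch on toy data. The variable names (ID, year, age, npre, ncopies, back) and the data values are assumptions for illustration, not taken from your data set. It builds the full at-risk structure with -expand-, creates the duration counter, and then discards the pre-panel rows again. Any covariate values copied onto the pre-panel rows disappear with those rows, which is why the pre-panel TVCs are not needed.

```stata
* Sketch only: toy person-year data with hypothetical names and values
clear
input ID year age
1 1984 68
1 1985 69
1 1986 70
2 1984 65
2 1985 66
end

* Person-years missing before entry: age at first observation minus 65
bysort ID (year): generate npre = age - 65 if _n == 1

* One copy per missing year for each person's first row; other rows untouched
generate ncopies = cond(missing(npre), 1, npre + 1)
expand ncopies

* Turn the copies into pre-panel years, counting back from the entry year
bysort ID year: generate back = _n - 1
replace year = year - back
replace age  = age - back

* Duration counter: 1 at age 65, 2 at age 66, and so on
generate TIME = age - 64

* Delayed-entry version: throw away the pre-panel person-years again
drop if back > 0
assert TIME >= 1
sort ID year
list ID year age TIME, sepby(ID)
```

In this toy example person 1 enters at age 68, so the retained rows for that person start at TIME = 4 rather than 1.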
 
> > Now, to control for the delayed entry aspect and get the likelihood
> > correct, all you need do is create the data structure as just stated,
> > but throw away the person-years corresponding to pre-1984 (first survey
> > year). (Note that the duration counter TIME does not start from 1 in
> > most cases in the delayed-entry version of the data set.)
> 
> I am afraid I am still missing something. Please forgive me if this is a
> silly question. If I understand correctly, the only variable I really
> need is the appropriate 'analysis time' counter. I will throw away all the
> records generated through -expand-. Correct?

I was attempting to discuss general principles rather than special 
cases, hoping to help understanding. It appears (from a brief glance) 
that, given that you already have person-year data for the period 
covered by the panel, you will not have to -expand-, and your code 
achieves what is required to generate the correct duration counter.
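
For reference, a minimal sketch of the shortcut when the person-year rows for the panel period already exist. It assumes a hypothetical variable age holding age in the survey year, that everyone in the data is aged 65 or over, and the convention used above (TIME = 1 at age 65). Under this convention the counter equals a within-person row count only for people first observed at age 65.

```stata
* Sketch only: toy person-year data; ID, year and age are hypothetical names
clear
input ID year age
1 1984 68
1 1985 69
1 1986 70
2 1984 65
2 1985 66
end

* Duration counter measured from age 65, with no -expand- step
generate TIME = age - 64

* Every row should lie in the at-risk window (age 65 or over)
assert TIME >= 1

sort ID year
list ID year age TIME, sepby(ID)
```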
 
Stephen
----------------------
Professor Stephen P. Jenkins <stephenj@essex.ac.uk>
Institute for Social and Economic Research (ISER)
University of Essex, Colchester, CO4 3SQ, UK
Tel: +44 (0)1206 873374. Fax: +44 (0)1206 873151.
http://www.iser.essex.ac.uk
