Re: st: grouped duration-discrete time survival analysis-WAS stset...

Mon, 2 Dec 2002 12:45:00 +0000 (GMT Standard Time)

On Mon, 2 Dec 2002 04:24:09 -0800 (PST) Enrica Croda <croda@nicco.sscnet.ucla.edu> wrote:

> On Mon, 2 Dec 2002, Stephen P. Jenkins wrote:
>
> > On Sun, 1 Dec 2002 03:07:48 -0800 (PST) Enrica Croda
> > <croda@nicco.sscnet.ucla.edu> wrote:
> >
> > <snip>
> >
> > > So, to recap, I now believe my data are grouped duration data...
> > > I understand that in this case I need to organize my data in the so-called
> > > "person-period" form.
> > > I would appreciate getting feedback on the following:
> > > My data are already organized by ID and year in "long" panel data
> > > form (iis ID, tis year) with year = 1984, 1985,...1998.
> > > A. Do I need to -expand- the data set?
> > > I am thinking I just need to generate the analysis time
> > > variable, with something like:
> > > (A1) by ID: generate TIME = _n;
> > > please see also question B, below.
> > > B. How do I deal with delayed entry?
> > > Assuming people first become at risk of not living independently at age 65,
> > > which may not be the age at which they are first observed in my data,
> > > how do I incorporate this information in my analysis?
> > >
> > Suppose first that there is no delayed entry -- in which case you would
> > need a row in the data set corresponding to each year that each person
> > was /at risk of experiencing the event of interest/. If you were to
> > assume the first year at risk corresponds to age 65, you need rows for
> > each person for each year corresponding to age 65+. As the first survey
> > year (1984 in GSOEP) is after age 65 for most persons, then you
> > would need to create new rows in the data corresponding to those ages
> > before the beginning of the survey. The TIME variable starts with 1 for
> > age 65, then 2 for age 66, and so on. [You would also need to 'spread'
> > values for explanatory variables back onto these new person-year obs.]
> > -expand- could probably be used to create the required data structure,
> > making use of the -if- qualifier to ensure that the correct number of
> > new person-year observations gets generated for each person. (As the
> > respondents were of different ages in 1984, the number of new data rows
> > will differ from person to person.)
> >
>
> Ideally, I would like to use some time-varying variables (e.g. income)
> in the analysis. What would be the appropriate thing to do for these
> variables when I 'spread' them?

You would have to create the appropriate values. Of course, the fact
that those new person-year observations are before the start of the
panel may constrain what you are able to create. But in fact, if you
make the delayed-entry 'correction' as discussed, then the time-varying
covariates (TVCs) for pre-panel years are not needed.

> > Now, to control for the delayed entry aspect and get the likelihood
> > correct, all you need do is create the data structure as just stated,
> > but throw away the person-years corresponding to pre-1984 (first survey
> > year). (Note that the duration counter TIME does not start from 1 in
> > most cases in the delayed-entry version of the data set.)
>
> I am afraid I am still missing something. Please forgive me if this is a
> silly question. If I understand correctly, the only variable I really
> need is the appropriate 'analysis time' counter. I will throw away all the
> records generated through -expand-. Correct?

I was attempting to discuss general principles rather than special
cases, hoping to help understanding. It appears (from a brief glance)
that, given that you already have person-year data for the period
covered by the panel, you will not have to -expand-, and your code
achieves what is required to generate the correct duration counter.

Stephen
----------------------
Professor Stephen P. Jenkins <stephenj@essex.ac.uk>
Institute for Social and Economic Research (ISER)
University of Essex, Colchester, CO4 3SQ, UK
Tel: +44 (0)1206 873374.
Fax: +44 (0)1206 873151.
http://www.iser.essex.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
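
For readers following the thread, here is a minimal sketch of the delayed-entry set-up described in the message above. It is an illustration under stated assumptions, not code from either poster: the variable birthyear and the toy data are hypothetical, and the start of risk is assumed to be age 65, as in the discussion. The duration counter is computed from age, so it need not start at 1 for a person first observed after age 65.

```stata
* Illustrative data only: ID and year follow the thread;
* birthyear is a hypothetical variable assumed to be available.
clear
input ID year birthyear
1 1984 1915
1 1985 1915
1 1986 1915
2 1984 1921
2 1985 1921
2 1986 1921
2 1987 1921
end

* Age in each survey year
generate age = year - birthyear

* Keep only person-years at risk (assumed here: age 65 or older).
* Pre-panel person-years are never created, which mirrors creating
* them with -expand- and then throwing them away.
keep if age >= 65

* Duration counter measured from the assumed start of risk:
* 1 at age 65, 2 at age 66, and so on
generate TIME = age - 64

sort ID year
list ID year age TIME, sepby(ID)
```

In this toy example person 1 is already 69 in 1984, so that person's first retained row has TIME = 5 rather than 1. Person 2 reaches 65 only in 1986, so the 1984 and 1985 rows are dropped and TIME starts at 1.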

