From | "Stephen P. Jenkins" <stephenj@essex.ac.uk> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: Discrete time hazard model - Interval width |
Date | Thu, 20 May 2010 10:05:45 +0100 |
==============================
Date: Wed, 19 May 2010 21:15:01 +0200
From: gideluca@unical.it
Subject: Re: st: Discrete time hazard model - Interval width

Dear Steve,

Thank you for your help. The stock-sample includes all 2004 graduates at risk of finding a job (more precisely searching for a job and not pursuing other studies) who have been interviewed one year (2005), three years (2007) and five years (2009) after their graduation. So, we initially select only those who in the baseline interview 2004 are looking for a job. These individuals can remain in their unemployment state, find a permanent job or be lost to follow-up.

We do not know exactly the date of the first real job they find, we just know if at the time of each subsequent interview they have a stable job or not. In particular, the relevant questions to identify failures are: "Are you working at this moment" and "Is your job stable?". Graduates finding a stable job, say, in the first interview are our failures. So, they are not at risk anymore and they are discarded from the analysis. In addition, for people who had never gotten a "real job", the last observed interview serves to define our censoring points.

What's wrong with ignoring in a discrete time hazard model the fact that interviews are not administered regularly over time?
========================

The commonly-used ways of fitting discrete time hazard regression models are based on the assumption of equal-width intervals. In this case, one can show that the model likelihood is the same as the likelihood for a binary dependent variable model applied to expanded data in which there is one record (data row) for each interval that each person is at risk. The same approach applies when there is left-truncation (stock sampling): see Jenkins, Oxford Bulletin of Econ & Stats 1995. This correspondence, and hence the "easy estimation" method, breaks down when the intervals are not of equal width.
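To illustrate why interval width matters, here is a minimal Python sketch (not Stata, and not the method of the 1995 paper). It assumes, purely for illustration, a constant underlying continuous-time hazard `theta`, no covariates and no left-truncation. The interval bounds follow the interview years mentioned in the thread; the function names and the `(last_interval, got_job)` data layout are hypothetical.

```python
import math

# Interview schedule from the thread: 2004 graduates re-interviewed in
# 2005, 2007 and 2009, i.e. intervals of width 1, 2 and 2 years.
INTERVAL_BOUNDS = [(0.0, 1.0), (1.0, 3.0), (3.0, 5.0)]


def interval_hazard(theta, start, end):
    """Pr(event in (start, end] | no event by start), assuming a constant hazard theta."""
    return 1.0 - math.exp(-theta * (end - start))


def expand_person_period(last_interval, got_job):
    """One record per interval at risk; y = 1 only in the interval where the job is found."""
    rows = []
    for j in range(1, last_interval + 1):
        y = 1 if (got_job and j == last_interval) else 0
        rows.append({"interval": j, "y": y})
    return rows


def log_likelihood(theta, people):
    """Binary-outcome log-likelihood over the expanded records.

    The hazard for each record depends on the width of its interval,
    which is what an equal-width model would ignore.
    """
    ll = 0.0
    for last_interval, got_job in people:
        for row in expand_person_period(last_interval, got_job):
            start, end = INTERVAL_BOUNDS[row["interval"] - 1]
            h = interval_hazard(theta, start, end)
            ll += math.log(h) if row["y"] == 1 else math.log(1.0 - h)
    return ll
```

With `theta = 0.3`, the hazard for the one-year interval is about 0.26 and for each two-year interval about 0.45, even though the underlying process is identical throughout. A model that gives every interval the same baseline treatment, or that reads the interval dummies as describing duration dependence, confounds interval width with the shape of the hazard.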
With unequal-width intervals, one needs to be more careful about the different lengths of time that each person is at risk of experiencing the event over the intervals of different width. -intcens- on SSC allows you to do this, at the cost of not allowing time-varying covariates.

More generally, think of your data as "interval censored" rather than "discrete". Very few social science survival analysis processes, including yours, have survival times that are intrinsically discrete. Most refer to some underlying process in continuous time; the problem is that the times are recorded in grouped (banded) form: they are "interval censored". (Models for "interval censoring" and "discrete" survival time data correspond exactly when there are intervals of equal width because you can then count "time" consistently using a sequence of positive integers.)

I note that your intervals are really rather wide (in addition to being of unequal width). Off the top of my head, I wonder whether another way to proceed might be to consider simply modelling the binary sequence for your sample. For your sample of 2004 graduates looking for a job, model jointly ...

Pr(got job by 2009 | no job by 2007)
Pr(got job by 2007 | no job by 2005)
Pr(got job by 2005)

This could be modelled as a trivariate probit with 2 selections using the methods set out by Cappellari & Jenkins (Stata Journal, 6(2) 2006, downloadable from SJ website).

Stephen
-------------------------------------
Professor Stephen P. Jenkins <stephenj@essex.ac.uk>
Institute for Social and Economic Research (ISER)
University of Essex, Colchester CO4 3SQ, UK
Tel: +44(0)1206 873374. Fax: +44(0)1206 873151
http://www.iser.essex.ac.uk
Survival Analysis using Stata: http://www.iser.essex.ac.uk/survival-analysis
Downloadable papers and software: http://ideas.repec.org/e/pje7.html