Re: st: Discrete time hazard model - Interval width

From	"Stephen P. Jenkins" <[email protected]>
To	<[email protected]>
Subject	Re: st: Discrete time hazard model - Interval width
Date	Thu, 20 May 2010 10:05:45 +0100

==============================
Date: Wed, 19 May 2010 21:15:01 +0200
From: [email protected]
Subject: Re: st: Discrete time hazard model - Interval width

Dear Steve,
Thank you for your help.
The stock-sample includes all 2004 graduates at risk of finding a job
(more precisely searching for a job and not pursuing other studies)
who have been interviewed one year (2005), three years (2007) and
five years (2009) after their graduation. So, we initially select
only those who in the baseline interview 2004 are looking for a job.
These individuals can remain in their unemployment state, find a
permanent job or be lost to follow-up.

We do not know exactly the date of the first real job they find, we
just know if at the time of each subsequent interview they have a
stable job or not. In particular, the relevant questions to identify
failures are: "Are you working at this moment" and "Is your job
stable?". Graduates finding a stable job, say, in the first interview
are our failures. So, they are not at risk anymore and they are
discarded from the analysis. In addition, for people who had never
gotten a "real job", the last observed interview serves to define
our censoring points.

What's wrong with ignoring in a discrete time hazard model the fact
that interviews are not administered regularly over time?
========================

The commonly-used ways of fitting discrete time hazard regression
models are based on the assumption of equal-width intervals. In
this case, one can show that the model likelihood is the same as
the likelihood for a binary dependent variable model applied to
expanded data in which there is one record (data row) for each
interval that each person is at risk. The same approach applies
when there is left truncation (stock sampling): see Jenkins,
Oxford Bulletin of Econ & Stats 1995.
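
To make the "expanded data" idea concrete, here is a minimal sketch in
Python (standard library only) of the person-period expansion for the
equal-width case. The function name, the tuple layout and the example
values are assumptions made for illustration, and the sketch ignores
delayed entry. In Stata the same step is usually done with -expand-,
followed by a binary regression of the outcome on interval indicators
and covariates.

```python
def expand_person_period(spells):
    """Turn one record per person into one record per interval at risk.

    `spells` is a list of (person_id, last_interval, event) tuples
    (a hypothetical layout): `last_interval` is the last interval in
    which the person was observed (1, 2, ...), and `event` is 1 if the
    event occurred in that interval, 0 if the spell was censored there.
    """
    rows = []
    for person_id, last_interval, event in spells:
        for j in range(1, last_interval + 1):
            # The binary outcome is 1 only in the final interval of a
            # completed spell; every other at-risk interval gets 0.
            y = 1 if (event == 1 and j == last_interval) else 0
            rows.append({"id": person_id, "interval": j, "y": y})
    return rows


# Hypothetical data: person 1 has the event in interval 2,
# person 2 is censored after interval 3.
example = expand_person_period([(1, 2, 1), (2, 3, 0)])
for row in example:
    print(row)
```

Each row of the result is one "interval at risk"; the binary dependent
variable model is then fitted to y on these rows.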

This correspondence, and hence the "easy estimation" method,
breaks down when the intervals are not of equal width. In this
case, one needs to be more careful about the different lengths of
time that each person is at risk of experiencing the event over
the intervals of different width. -intcens- on SSC allows you to
do this, at the cost of not allowing time-varying covariates.
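
A small numerical sketch of why width matters, assuming (purely for
illustration) a constant continuous-time hazard theta per year. The
probability of the event within an interval, given survival to its
start, is then 1 - exp(-theta * width), so a two-year interval implies
a larger interval hazard than a one-year interval even though the
underlying process is unchanged. The function name and theta = 0.3
are hypothetical choices, not estimates.

```python
import math


def interval_hazards(theta, cutpoints):
    """Interval hazards implied by a constant hazard `theta` per year.

    `cutpoints` are the interval end points in years since the start
    of the spell; each returned value is Pr(event in the interval |
    no event before the interval starts).
    """
    hazards = []
    start = 0.0
    for end in cutpoints:
        width = end - start
        hazards.append(1.0 - math.exp(-theta * width))
        start = end
    return hazards


# Interview times from the thread: 1, 3 and 5 years after graduation.
# theta = 0.3 is an arbitrary illustrative value.
h = interval_hazards(0.3, [1, 3, 5])
print([round(x, 4) for x in h])
```

The first interval (width 1) has a smaller implied hazard than the
second and third (width 2 each), which is the difference that a model
treating the three interviews as equally spaced would ignore.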

More generally, think of your data as "interval censored" rather
than "discrete". Very few social science survival analysis
processes, including yours, have survival times that are
intrinsically discrete. Most refer to some underlying process in
continuous time; the problem is that the times are recorded in
grouped (banded) form -- they are "interval censored".  (Models
for "interval censoring" and "discrete" survival time data
correspond exactly when the intervals are of equal width
because you can then count "time" consistently using a sequence
of positive integers.)

I note that your intervals are really rather wide (in addition to
being of unequal width).  Off the top of my head, I wonder
whether another way to proceed might be to consider simply
modelling the sequence of binary outcomes for your sample.

For your sample of 2004 graduates looking for a job, model
jointly:
	Pr(got job by 2009 | no job by 2007)
	Pr(got job by 2007 | no job by 2005)
	Pr(got job by 2005)

This could be modelled as a trivariate probit with 2 selections
using the methods set out by Cappellari & Jenkins (Stata Journal,
6(2) 2006, downloadable from SJ website).
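
The three conditional probabilities chain together into the
probability of having found a job by each interview. The sketch below
shows only that arithmetic; it is not the trivariate probit estimator,
and the function name and input values are hypothetical.

```python
def cumulative_job_probabilities(p_2005, p_2007_given_none, p_2009_given_none):
    """Chain conditional probabilities into Pr(got job by each wave).

    p_2005            : Pr(got job by 2005)
    p_2007_given_none : Pr(got job by 2007 | no job by 2005)
    p_2009_given_none : Pr(got job by 2009 | no job by 2007)
    """
    # Probability of still having no job at each wave.
    none_2005 = 1.0 - p_2005
    none_2007 = none_2005 * (1.0 - p_2007_given_none)
    none_2009 = none_2007 * (1.0 - p_2009_given_none)
    return {2005: 1.0 - none_2005, 2007: 1.0 - none_2007, 2009: 1.0 - none_2009}


# Hypothetical values chosen only for illustration.
by_wave = cumulative_job_probabilities(0.5, 0.5, 0.5)
print(by_wave)
```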

Stephen
-------------------------------------
Professor Stephen P. Jenkins <[email protected]>
Institute for Social and Economic Research (ISER)
University of Essex, Colchester CO4 3SQ, UK
Tel: +44(0)1206 873374. Fax: +44(0)1206 873151
http://www.iser.essex.ac.uk 
Survival Analysis using Stata:
http://www.iser.essex.ac.uk/survival-analysis  
Downloadable papers and software:
http://ideas.repec.org/e/pje7.html 
