st: RE: Survival Analysis Issue
> -----Original Message-----
> From: firstname.lastname@example.org
> [mailto:email@example.com] On Behalf Of
> Yaseen Ghulam
> Sent: 05 November 2003 11:24
> To: firstname.lastname@example.org
> Subject: st: Survival Analysis Issue
> Dear Stata users,
> Currently we are working on a study which deals with workers
> in terms of leaving the organisation prematurely before their
> expires. In particular, the idea is to find who is likely to quit
> and when by
> using the past data. We will appreciate if someone can
> provide some help.
> The data we have is a typical organisational data. Let me
> briefly explain
> what data set we have.
> In our administrative data set we have person-month data
> with monthly
> observations starting from April 1996 till July 2002 (75
> monthly spells -
> time) for approx. 73 thousand workers (3.39m cases) implying
> that these
> workers came to observation from April 1996 and stayed under
> observation till July 2002. Out of these 73 thousand workers
> during the
> observation period roughly 20 thousand quit the organisation
> prematurely (20 thousand fail cases). Remaining are right censored.
> In the dataset we also have individuals who joined before 1996
> (observation window). However, we do not have information on those
> who joined before 1996 and left before 1996 (left censoring).
> Those who joined after 1996 and either stayed or left
> (delayed entry) before the end of
> observation period (July 2002) we have a complete data set
> about them.
... snip ...
> Our questions are:
> 1. Can STATA deal with both cases of left and right censoring
> and left truncation (delayed entry) simultaneously?
> 2. Should we be only using those workers who joined after Apr
> 1996 and
> throw away those cases who joined before 1996 (due to left
You have interval-censored (banded) survival time data, a.k.a. discrete
time data, for which it is no problem at all to handle left-truncated
data combined with right censoring.
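
To illustrate why delayed entry causes no difficulty in the discrete
time setup, here is a minimal sketch in Python (not Stata, and not taken
from the lecture notes). It assumes each observed worker-month carries
the worker's tenure in months, so the joining date must be known; the
function name and the three toy workers are hypothetical. A worker who
joined before the observation window simply contributes no records for
the tenure months that were not observed.

```python
from collections import defaultdict

def discrete_hazard(records):
    """Life-table estimate of the discrete-time hazard h(t).

    records: iterable of (tenure_month, quit_flag), one per observed
    worker-month. Delayed entry needs no special handling: unobserved
    early tenure months are just absent, so the worker joins the risk
    set late. Right-censored workers never have quit_flag == 1.
    """
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for tenure_month, quit_flag in records:
        at_risk[tenure_month] += 1
        events[tenure_month] += quit_flag
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}

# Hypothetical toy data (not from the study):
records = [
    (1, 0), (2, 0), (3, 1),  # A: joined in window, quits in month 3
    (2, 0), (3, 0), (4, 0),  # B: delayed entry at month 2, right censored
    (1, 0), (2, 1),          # C: joined in window, quits in month 2
]
hazard = discrete_hazard(records)
```

Worker B is in the risk set for months 2 to 4 only, which is all that
left truncation requires.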
[Have a look at the lecture notes and Stata lessons at
Left-censored data is more problematic. It's straightforward to handle if
you are prepared to assume that the hazard rate does not vary with
survival time. That's a strong, probably unacceptable, assumption -- but
you might want to see what happens.
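
The reason the constant-hazard case is easy: if the hazard does not
depend on elapsed duration, the (unknown) time already spent in the
organisation drops out of the likelihood, and the estimate is just
quits divided by person-months at risk. A rough sketch in Python (not
Stata), with a hypothetical function name, using the approximate totals
quoted in the original post:

```python
def constant_monthly_hazard(n_quits, n_person_months):
    """Maximum likelihood estimate of a duration-independent monthly
    quit hazard: observed quits / observed person-months at risk.
    No information on time since joining is needed."""
    if n_person_months <= 0:
        raise ValueError("need at least one person-month at risk")
    return n_quits / n_person_months

# Approximate figures from the post: about 20 thousand quits over
# about 3.39 million person-months.
rate = constant_monthly_hazard(20_000, 3_390_000)
print(round(rate, 4))  # 0.0059
```

That is, roughly 0.6% of workers at risk quit in any given month, if
the constant-hazard assumption is taken at face value.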
Otherwise the standard way of handling the left-censoring is to drop
> 3. We would like to predict which worker is likely to leave
> and when. It
> means calculating probability of failure and expected time of
> failure for
> next few years for right censored workers on the basis of
> period data (April 1996 to July 2002). If right censored
> cases are many, does it affect
> the quality of predictions? I suppose these predictions should
> be limited to only next 6 years as our observation span is
> only for 6
> Has anybody written any macros or programmes in Stata to
> carry out these predictions
> by considering the above mentioned issues and type of data we
> have using survival
> analysis framework?
If you look at the lessons on discrete time models cited above, you'll
see examples of Stata code showing how to do within-sample and
out-of-sample predictions of the sort that you are asking about.
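
The arithmetic behind such a prediction can be sketched as follows, in
Python rather than Stata and not taken from the lessons. Given
estimated monthly hazards h(t) by tenure month, the probability that a
worker still employed at tenure t0 quits within the next k months is
1 minus the product of (1 - h(t)) over t = t0+1, ..., t0+k. The
function name and the hazard values below are hypothetical.

```python
def prob_quit_within(hazard, current_tenure, horizon):
    """Probability of quitting within `horizon` months, given survival
    to `current_tenure`, from a dict of discrete-time hazards."""
    survive = 1.0
    for t in range(current_tenure + 1, current_tenure + horizon + 1):
        if t not in hazard:
            # Refuse to extrapolate beyond the tenure range that was
            # actually estimated.
            raise ValueError(f"no hazard estimate for tenure month {t}")
        survive *= 1.0 - hazard[t]
    return 1.0 - survive

# Hypothetical hazards by tenure month (illustrative only):
hazard = {1: 0.10, 2: 0.20, 3: 0.50}
p = prob_quit_within(hazard, current_tenure=1, horizon=2)
```

Here p is 1 - (0.8 * 0.5) = 0.6. The check on missing tenure months
reflects the point raised in the question: predictions are only
supported over durations for which a hazard has been estimated.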
Professor Stephen P. Jenkins <email@example.com>
Institute for Social and Economic Research
University of Essex, Colchester CO4 3SQ, U.K.
Tel: +44 1206 873374. Fax: +44 1206 873151.