Re: st: Survival Analysis Issue
Thank you very much for your help. Your notes are very helpful.
As we understand from your reply, there are two options for dealing with
left censoring in our case.
1. Assume that the hazard does not vary with time and drop the time
variable and see what happens.
2. Drop those workers who joined before April 1996.
> -----Original Message-----
> From: firstname.lastname@example.org
> [<mailto:email@example.com>] On Behalf Of
> Yaseen Ghulam
> Sent: 05 November 2003 11:24
> To: firstname.lastname@example.org
> Subject: st: Survival Analysis Issue
> Dear Stata users,
> Currently we are working on a study of workers
> leaving the organisation prematurely, before their
> expires. In particular, the idea is to find who is likely to quit
> and when, by
> using the past data. We would appreciate it if someone could
> provide some help.
> The data we have are typical organisational data. Let me
> briefly explain
> what data set we have.
> In our administrative data set we have person-month data
> with monthly
> observations starting from April 1996 till July 2002 (75
> monthly spells -
> time) for approx. 73 thousand workers (3.39m cases) implying
> that these
> workers came to observation from April 1996 and stayed under
> observation till July 2002. Out of these 73 thousand workers
> during the
> observation period, roughly 20 thousand quit the organisation
> prematurely (20 thousand fail cases). The remainder are right censored.
> In the dataset we also have individuals who joined before 1996
> (observation window). However, we do not have information on those
> who joined before 1996 and left before 1996 (left censoring).
> Those who joined after 1996 and either stayed or left
> (delayed entry) before the end of
> observation period (July 2002) we have a complete data set
> about them.
... snip ...
> Our questions are:
> 1. Can Stata deal with both cases of left and right censoring
> and left truncation (delayed entry) simultaneously?
> 2. Should we be only using those workers who joined after Apr
> 1996 and
> throw away those cases who joined before 1996 (due to left
You have interval-censored (banded) survival time data, a.k.a. discrete-time data,
for which it is no problem at all to handle left-truncated data combined
with right censoring.
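To illustrate why delayed entry causes no difficulty with person-month data, here is a minimal sketch in Python. It is not Stata code and is not taken from the lecture notes. It assumes that each worker's elapsed tenure at the start of observation is known; the function name `discrete_hazard`, the tuple layout, and the four example workers are all hypothetical. Each worker simply contributes person-months only for the tenure months in which he or she was actually observed.

```python
from collections import Counter

def discrete_hazard(workers):
    """Life-table style hazard by tenure month, allowing delayed entry.

    workers: list of (entry_tenure, exit_tenure, did_quit) tuples (hypothetical layout)
      entry_tenure: months of tenure already completed when observation began
                    (0 for someone who joined inside the observation window)
      exit_tenure:  last tenure month in which the worker was observed
      did_quit:     True if the worker quit in that month, False if right censored
    """
    at_risk = Counter()
    quits = Counter()
    for entry_tenure, exit_tenure, did_quit in workers:
        # Left truncation: no person-months are counted before entry_tenure + 1.
        for t in range(entry_tenure + 1, exit_tenure + 1):
            at_risk[t] += 1
        # Right censoring: a censored worker adds to the risk set but not to the quits.
        if did_quit:
            quits[exit_tenure] += 1
    return {t: quits[t] / at_risk[t] for t in sorted(at_risk)}

# Made-up example: two workers observed from joining, two with delayed entry.
workers = [(0, 3, True), (0, 2, False), (2, 3, False), (1, 2, True)]
hazard = discrete_hazard(workers)
```

In this made-up example the risk sets at tenure months 1, 2 and 3 contain 2, 3 and 2 workers respectively, because the delayed entrants join the risk set only from the month after their entry tenure.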
[Have a look at the lecture notes and Stata lessons at
Left-censored data is more problematic. It's straightforward to handle if
you are prepared to assume that the hazard rate does not vary with
survival time. That's a strong, probably unacceptable, assumption --
you might want to see what happens.
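The reason the constant-hazard assumption makes things easy can be sketched as follows, in Python rather than Stata. If the monthly hazard is the same at every survival time, each observed person-month is an identical trial, so the estimate is total quits divided by total person-months observed, and the (unknown) tenure before the window never enters the calculation. The function name and the example spells are hypothetical.

```python
def constant_monthly_hazard(spells):
    """Estimate a time-invariant monthly quit hazard.

    spells: list of (months_observed, did_quit) tuples (hypothetical layout).
    months_observed counts only months inside the observation window, so a
    worker's unknown earlier tenure is not needed under this assumption.
    """
    total_months = sum(months for months, _ in spells)
    total_quits = sum(1 for _, did_quit in spells if did_quit)
    return total_quits / total_months

# Made-up example: 100 person-months observed in total, 2 quits.
spells = [(10, True), (75, False), (15, True)]
h_const = constant_monthly_hazard(spells)
```

This is only as good as the assumption itself: if the true hazard varies with tenure, the single number averages over that variation.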
Otherwise the standard way of handling the left-censoring is to drop
> 3. We would like to predict which worker is likely to leave
> and when. It
> means calculating probability of failure and expected time of
> failure for
> next few years for right censored workers on the basis of
> period data (April 1996 to July 2002). If right censored
> cases are many, does it effect
> the quality of predictions. I suppose these predictions should
> be limited to only next 6 years as our observation span is
> only for 6
> Has anybody written any macros or programmes in Stata to
> carry out these predictions
> by considering the above mentioned issues and type of data we
> have using survival
> analysis framework?
If you look at the lessons on discrete time models cited above, you'll
see examples of Stata code showing how to do within-sample and
out-of-sample predictions of the sort that you are asking about.
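As a rough illustration of the arithmetic behind such predictions (a Python sketch, not the Stata code from the lessons): once a discrete-time model gives a predicted hazard for each future month, the probability of still being employed after k months is the running product of (1 - hazard), and the probability of having quit by then is one minus that. The function name and the hazard values below are hypothetical and purely illustrative.

```python
def survival_curve(hazards):
    """Turn predicted monthly hazards into survival probabilities.

    hazards: predicted quit hazards for consecutive future months, in order.
    Returns S(1), S(2), ..., where S(k) is the product of (1 - h_j) for j <= k.
    """
    surv = []
    s = 1.0
    for h in hazards:
        s *= 1.0 - h
        surv.append(s)
    return surv

# Made-up hazards for the next three months of one right-censored worker.
predicted_hazards = [0.1, 0.2, 0.5]
surv = survival_curve(predicted_hazards)

# Probability that this worker quits at some point within the three months.
prob_quit_within_horizon = 1.0 - surv[-1]
```

Predictions of this kind are conditional on the worker having survived to the start of the forecast period, and extending them beyond the tenure range seen in the data relies entirely on the assumed functional form of the hazard.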
Professor Stephen P. Jenkins <email@example.com>
Institute for Social and Economic Research
University of Essex, Colchester CO4 3SQ, U.K.
Tel: +44 1206 873374. Fax: +44 1206 873151.
Lecturer Banking and Finance
University of Portsmouth
Southsea PO4 8JF
Ph: +44 23 9284 4127