
From: "Yaseen Ghulam" <Yaseen.Ghulam@port.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Survival Analysis Issue
Date: Tue, 11 Nov 2003 10:17:20 -0000

Dear Stephen and other Stata specialists,

First of all, sorry for double posting, and thank you very much for your help. Your notes on survival analysis are great and very helpful.

As we understand from your reply, there are two options for dealing with left censoring in our case. Do you agree?

1. Assume that the hazard does not vary with time, drop the time variable, and see what happens.
2. Drop those workers who joined before April 1996.

Further, once the model has been estimated by the discrete-time method, how can one check whether its in-sample predictions at the individual level are correct, using the predicted survival or hazard?

Shabbar Jaffry
Yaseen Ghulam

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [<mailto:owner-statalist@hsphsun2.harvard.edu>] On Behalf Of Yaseen Ghulam
> Sent: 05 November 2003 11:24
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Survival Analysis Issue
>
> Dear Stata users,
>
> We are currently working on a study of workers who leave the organisation prematurely, before their contracts expire. In particular, the idea is to use past data to find out who is likely to quit and when. We would appreciate it if someone could provide some help.
>
> The data we have are typical organisational data. Let me briefly explain what the data set contains.
>
> Our administrative data set holds person-month data, with monthly observations from April 1996 to July 2002 (75 monthly spells - time) for approximately 73 thousand workers (3.39m cases), implying that these workers came under observation from April 1996 and stayed under observation until July 2002. Of these 73 thousand workers, roughly 20 thousand quit the organisation prematurely during the observation period (20 thousand failures). The remainder are right censored.
>
> In the data set we also have individuals who joined before 1996 (that is, before the observation window). However, we have no information on those who joined before 1996 and left before 1996 (left censoring).
>
> For those who joined after 1996 and either stayed or left (delayed entry) before the end of the observation period (July 2002), we have complete data.
> ... snip ...
>
> Our questions are:
>
> 1. Can Stata deal with left and right censoring and left truncation (delayed entry) simultaneously?
> 2. Should we use only those workers who joined after April 1996 and discard those who joined before 1996 (because of left censoring)?

You have interval-censored (banded) survival time data, a.k.a. discrete time data, for which it is no problem at all to handle left-truncated data combined with right censoring. [Have a look at the lecture notes and Stata lessons at http://www.iser.essex.ac.uk/teaching/stephenj/ec968/index.php]

Left-censored data are more problematic. They are straightforward to handle if you are prepared to assume that the hazard rate does not vary with survival time. That's a strong, probably unacceptable, assumption -- but you might want to see what happens. Otherwise the standard way of handling the left-censoring is to drop those spells.

> 3. We would like to predict which workers are likely to leave and when. This means calculating the probability of failure and the expected time of failure over the next few years for right-censored workers, on the basis of the observation-period data (April 1996 to July 2002). If there are many right-censored cases, does that affect the quality of the predictions? I suppose these predictions should be limited to the next 6 years only, as our observation span is only 6 years.
> Has anybody written any macros or programs in Stata to carry out these predictions, taking account of the issues mentioned above and the type of data we have, within a survival analysis framework?

If you look at the lessons on discrete time models cited above, you'll see examples of Stata code showing how to do within-sample and out-of-sample predictions of the sort that you are asking about.

Stephen
-------------------------------------------------------------
Professor Stephen P. Jenkins <stephenj@essex.ac.uk>
Institute for Social and Economic Research
University of Essex, Colchester CO4 3SQ, U.K.
Tel: +44 1206 873374. Fax: +44 1206 873151.
<http://www.iser.essex.ac.uk>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
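
On the question in this thread of how to check in-sample predictions from a discrete-time model, the arithmetic can be sketched roughly as follows. This is only an illustration in Python, not the Stata code from the lessons cited above. It assumes a predicted hazard is already available for each person-month (for example from a fitted logit or cloglog model), and all function names here are hypothetical. The survivor function is the running product of (1 - hazard), and one simple check compares the mean predicted hazard with the observed exit proportion in each interval.

```python
from itertools import accumulate
from operator import mul


def survivor_from_hazards(hazards):
    """Predicted survivor function S(1), ..., S(T) for one person.

    hazards: predicted discrete-time hazards h_1, ..., h_T, in spell order.
    S(t) is the running product of (1 - h_j) for j = 1, ..., t.
    """
    return list(accumulate((1.0 - h for h in hazards), mul))


def median_duration(hazards):
    """First interval t at which S(t) <= 0.5, or None if that never happens."""
    for t, s in enumerate(survivor_from_hazards(hazards), start=1):
        if s <= 0.5:
            return t
    return None


def calibration_by_interval(rows):
    """Compare predicted and observed exits interval by interval.

    rows: iterable of (t, predicted_hazard, left) person-month records,
          where left is 1 if the worker quit in interval t and 0 otherwise.
    Returns {t: (mean predicted hazard, observed exit proportion)}.
    """
    totals = {}
    for t, h, left in rows:
        n, sum_h, sum_left = totals.get(t, (0, 0.0, 0))
        totals[t] = (n + 1, sum_h + h, sum_left + left)
    return {
        t: (sum_h / n, sum_left / n)
        for t, (n, sum_h, sum_left) in sorted(totals.items())
    }


if __name__ == "__main__":
    # Made-up hazards for one worker over three months (illustrative only).
    print(survivor_from_hazards([0.02, 0.03, 0.05]))
    # Made-up person-month records: (interval, predicted hazard, left).
    print(calibration_by_interval([(1, 0.25, 0), (1, 0.75, 1), (2, 0.5, 1)]))
```

Large gaps between the two numbers in any interval would suggest that the baseline hazard specification fits poorly there. A check of this kind speaks to average fit within intervals, not to whether any single worker's prediction is "correct".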
