st: Survival Analysis Issue

From   "Yaseen Ghulam" <>
Subject   st: Survival Analysis Issue
Date   Wed, 05 Nov 2003 11:23:40 -0000

Dear Stata users,   

We are currently working on a study of workers who leave the
organisation prematurely, before their contract expires. In particular,
the idea is to use past data to find out who is likely to quit and when.
We would appreciate any help.

Our data are typical organisational data. Let me briefly describe the
data set.

In our administrative data set we have person-month data with monthly
observations from April 1996 to July 2002 (75 monthly spells) for
approximately 73 thousand workers (3.39m cases). That is, these workers
came under observation from April 1996 and were followed until July
2002. Of these 73 thousand workers, roughly 20 thousand quit the
organisation prematurely during the observation period (20 thousand
failures). The remainder are right censored.

The data set also includes individuals who joined before 1996 (that is,
before the observation window). However, we have no information on
those who both joined and left before 1996 (left censoring).

For those who joined after 1996 (delayed entry) and either stayed or
left before the end of the observation period (July 2002), we have
complete data.
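
To make the timing of the groups described above concrete, here is a minimal sketch of how we think of each worker's record on the tenure time scale (months since joining). It is written in Python rather than Stata purely as an illustration. The function name, the field names and the two example workers are hypothetical and are not taken from our data set.

```python
from dataclasses import dataclass


def month_index(year, month):
    """Map a calendar month to a running integer (hypothetical helper)."""
    return year * 12 + (month - 1)


WINDOW_START = month_index(1996, 4)  # April 1996, first observed month
WINDOW_END = month_index(2002, 7)    # July 2002, last observed month


@dataclass
class Spell:
    entry: int        # tenure in months when the worker comes under observation
    exit: int         # tenure in months when the worker quits or is censored
    left_early: bool  # True = observed quit, False = right censored


def to_spell(join, last_seen, left_early):
    """Summarise one worker's person-month records as a single spell.

    join       -- month index in which the worker joined
    last_seen  -- month index of the worker's last record in the window
    left_early -- whether the worker quit in that last month
    """
    first_seen = max(join, WINDOW_START)
    # entry > 0: the worker was already employed when the window opened,
    # so the months before April 1996 are not observed (left truncation).
    entry = first_seen - join
    # Exit is measured at the end of the last observed month.
    exit_ = last_seen + 1 - join
    return Spell(entry, exit_, left_early)


# Two made-up workers, for illustration only.
early_joiner = to_spell(month_index(1994, 1), month_index(1999, 12), True)
late_joiner = to_spell(month_index(1998, 6), WINDOW_END, False)
print(early_joiner)
print(late_joiner)
```

In this sketch a worker who joined before the window opens is observed only from a positive tenure onwards, while a worker who joined inside the window is observed from tenure zero and is right censored if still employed in July 2002.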

Our data set has the usual job-related variables (e.g. the job
performed) and demographic variables (e.g. gender, marital status). We
have also added external factors (e.g. number of vacancies, claimant
counts, manufacturing productivity index, inflation rate, manufacturing
sector earnings index). These time-varying covariates have been merged
into the data set by calendar month.

Our questions are:  

1. Can Stata deal with left censoring, right censoring and left
truncation (delayed entry) simultaneously?

2. Should we use only those workers who joined after April 1996 and
discard those who joined before 1996 (because of left censoring)?

3. We would like to predict which workers are likely to leave and when.
This means calculating, for the right-censored workers, the probability
of failure and the expected time to failure over the next few years, on
the basis of the observation-period data (April 1996 to July 2002). If
there are many right-censored cases, does this affect the quality of
the predictions? I suppose these predictions should be limited to the
next 6 years only, as our observation span is only 6 years.

Has anybody written any macros or programs in Stata to carry out such
predictions within a survival analysis framework, taking into account
the issues mentioned above and the type of data we have?

We would greatly appreciate any help.
Shabbar Jaffry  
Yaseen Ghulam  
University of Portsmouth 
