Re: st: stset survival analysis with right censoring and left truncation for a bankruptcy dataset

From   Steve Samuels
Subject   Re: st: stset survival analysis with right censoring and left truncation for a bankruptcy dataset
Date   Sun, 29 Dec 2013 11:25:16 -0500

Your analysis is attempting to predict firm bankruptcy from a data set
of workers. This won't work. For a bankrupt firm with 500 workers, for
example, the -stset- statement would tell Stata that there were 500
failures. You must create a data set with one observation per firm per
year and create variables that summarize worker characteristics in the
prior year; one variable might be proportion female. I say "prior
year", because an observed association in current year data might
arise because a bankruptcy led to changes in the worker mix.
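
A minimal sketch of that restructuring, assuming the worker data have
first been put in long form with one row per worker per firm per year.
The toy values and the names year, female, fail, prop_female_lag, and
n_workers_lag are hypothetical; company, pers_id, and bankruptcy_year
follow the original post. Run it from a do-file, since it uses ///
line continuation.

```stata
* Toy worker-year data: one row per worker per firm per year (made-up values)
clear
input company year pers_id female bankruptcy_year
1 2009 11 1 2011
1 2009 12 0 2011
1 2010 11 1 2011
1 2010 12 0 2011
1 2010 13 0 2011
1 2011 12 0 2011
1 2011 13 0 2011
2 2009 21 1 .
2 2010 21 1 .
2 2010 22 1 .
2 2011 22 1 .
end

* Collapse to one observation per firm per year
collapse (mean) prop_female = female (count) n_workers = pers_id ///
         (first) bankruptcy_year, by(company year)

* Failure indicator: 1 only in the year the firm goes bankrupt
generate byte fail = (year == bankruptcy_year)

* Lag the worker summaries by one year within firm
xtset company year
generate prop_female_lag = L.prop_female
generate n_workers_lag   = L.n_workers

list, sepby(company)
```

Each firm now contributes at most one failure, and the lagged variables
are missing in a firm's first observed year, so that year drops out of
any model that uses them.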

Also: Stata's -st- commands require failures in a time dimension with
enough values that it can be considered continuous. With only yearly
data, you must use a discrete analysis. I recommend either a logit or
complementary log-log model (-cloglog-). For more details, see the
Lesson 6 link to discrete data analysis on Stephen Jenkins's fine web
page "Survival analysis with Stata".
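
As a rough illustration of the discrete-time approach, the sketch below
simulates firm-year data and fits both models. Everything in it is an
assumption made for the example: the simulated hazard, the seed, and the
names t, h, fail, prior_fail, and prop_female_lag. With real data you
would start from your own firm-year file instead of the simulation.

```stata
* Simulate 500 firms followed for up to 5 years (illustrative only)
clear
set seed 12345
set obs 500
generate company = _n
expand 5
bysort company: generate t = _n

* Hypothetical lagged covariate and a yearly failure probability
generate prop_female_lag = runiform()
generate h = 1 - exp(-exp(-2.5 + prop_female_lag))
generate byte fail = runiform() < h

* Keep each firm only up to and including its first failure year
bysort company (t): generate prior_fail = sum(fail) - fail
drop if prior_fail > 0

* Discrete-time hazard models; i.t lets the baseline hazard differ by year
cloglog fail prop_female_lag i.t
logit   fail prop_female_lag i.t
```

The year dummies play the role of the baseline hazard. Firms that never
fail simply contribute a row with fail equal to 0 for every year they are
observed, which is how right censoring enters the model.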

Steve Samuels

18 Cantine's Island
Saugerties NY 12477 USA

> On Dec 28, 2013, at 9:02 AM, Andri wrote:
> Dear all,
> I have a panel dataset over 5 years with the following variables: pers_id, employee, company, entry, exit, firm's bankruptcy_year, status, outcome. Where:
> pers_id=identification number of employee i in company j.
> employee=identification number of employee i.
> company=identification number of company j.
> entry/exit: start and end of the labor contract for employee i.
> firm's bankruptcy_year: no explanation needed
> status: 0 when the company survived, 1 when it became insolvent
> outcome: 1 if the observation failed, 0 if it was not observed to fail.
> Several observations are right censored and/or left truncated.
> There are many more variables in the dataset, e.g., age, sex, function.
> I want to run a survival analysis to assess the employees' influence on a bankruptcy. Some people are still employed at the time of the bankruptcy, but some have left the company before the bankruptcy. I think that the former group is handled correctly by the command:
> stset end, id(pers_id) time0(start) origin(time start) failure(outcome==1)
> but with that command the latter group is not handled correctly, even though they may have influenced the bankruptcy too.
> I hope that someone can advise me on how to include the latter group correctly in my stset.
> Regards,
> Andri
