Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: survival analysis with unknown censoring

From: Steve Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: survival analysis with unknown censoring
Date: Sat, 11 Sep 2010 18:08:23 -0400

```--

Stefan:

I don't believe there is much that you can infer. Suppose you have two
groups to compare. Consider the following two scenarios:

Scenario *
In group 1, all B events and censoring take place long before the
first A event. In group 2, all B events and censoring take place after
the last A event. In this scenario, with complete data, the estimated
A hazard rates would be higher in group 1 than in group 2.

Scenario **
The censoring and B event patterns in the two groups are reversed.
With complete data, the estimated A hazard rates in group 2 would be
higher than those in group 1.

The two scenarios are extreme and unrealistic, but your _data_ cannot
distinguish between them.

Steve

Steven J. Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783

On Thu, Sep 9, 2010 at 6:55 AM, Wagner, Stefan <swagner@bwl.lmu.de> wrote:
> I am analyzing survival times with no time-varying covariates. At the moment, I am using a Cox proportional hazards model based on Stata's stcox.
>
> The data is characterized as follows:
>
> For all observations in the sample it is known when an individual joined the risk pool, i.e., starting dates are known for all observations. Basically, spells can be terminated by two different outcomes A and B. Unfortunately, I only observe one of those two outcomes, A. For those cases, I also know when A happened and I can compute the duration of spells ending in A as (date of A minus entry date).
>
> For the remaining observations it is impossible to determine whether the spell was already terminated by event B or whether the observation is still at risk.
>
> Due to this data structure it seems unreasonable to treat observations that didn't end in A as censored observations as I cannot know whether they are still in the risk pool (here duration would be date today minus entry date) or whether they left the risk pool to destination B (then duration would be date of B minus entry date).
>
> Currently, I am estimating the Cox model only for observations that ended in A, excluding all other observations from the estimation. As a robustness check, I also estimate a Heckman selection model where the selection is defined over (spell ended in A yes/no) and duration is the dependent variable in stage 2. Results of both exercises are comparable.
>
> Is anyone aware of how to deal with this problem in a better way? Or some literature looking at potential biases from excluding observations with unknown spell-endings? Thanks for your support!
>
> Stefan
>
>
> **************************************************************************************
> Stefan Wagner
>
> INNO-tec
> Institut für Innovationsforschung, Technologiemanagement und Entrepreneurship
> Ludwig-Maximilians-Universität München
> Kaulbachstr. 45/III
> 80539 München
> Tel.: ++49/89/2180-2877
> Fax: ++49/89/2180-6284
> swagner@bwl.lmu.de
> http://www.inno-tec.de/personen/mitarbeiter/wagner/index.html
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
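The two scenarios in the reply above can be sketched numerically. This is only an illustration, written in Python rather than Stata so that it is self-contained. All numbers are invented, and the occurrence/exposure rate (A events divided by total time at risk) is used as a stand-in for the hazard estimate under an assumed constant hazard; it is not what stcox computes. The observed A durations are identical in both groups, and only the unobserved exit times of the remaining spells differ.

```python
def a_event_rate(a_times, other_exit_times):
    """Occurrence/exposure rate for A: number of A events / total time at risk.

    This is the constant-hazard estimate; it is a simplification used here
    only to show how the unobserved exit times drive the result.
    """
    person_time = sum(a_times) + sum(other_exit_times)
    return len(a_times) / person_time


# Hypothetical observed data: durations of spells that ended in A.
# The same values are used for both groups on purpose.
a_times = [2.0, 3.0, 4.0, 5.0]

# Hypothetical number of spells per group that did not end in A.
# Their true exit times (B event or censoring) are NOT observed.
n_unknown = 6

# Two assumed patterns for the unobserved exit times.
early_exits = [0.5] * n_unknown   # all exits long before the first A event
late_exits = [10.0] * n_unknown   # all exits after the last A event

# Scenario *: group 1 exits early, group 2 exits late.
scenario_star = {
    "group1": a_event_rate(a_times, early_exits),
    "group2": a_event_rate(a_times, late_exits),
}

# Scenario **: the patterns are reversed.
scenario_star_star = {
    "group1": a_event_rate(a_times, late_exits),
    "group2": a_event_rate(a_times, early_exits),
}

print("Scenario *: ", scenario_star)
print("Scenario **:", scenario_star_star)
```

With these made-up numbers the rate is 4/17 in the group with early exits and 4/74 in the group with late exits, so the ranking of the two groups flips between the scenarios while the observed A durations stay exactly the same.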