Statalist The Stata Listserver


st: Survival Analysis: Censoring at end of study period

From   "Steichen, Thomas J."
Subject   st: Survival Analysis: Censoring at end of study period
Date   Wed, 14 Mar 2007 16:58:19 -0400


I'm exploring some survival data.

In this study, all subjects enter at time zero and are
potentially followed for 30 weeks, being examined once per week.

Some subjects drop out and are censored. For example, a subject
may appear and be examined in the 8th week but not appear
for examination in the 9th (and subsequent) weeks. I would mark
this subject as censored in the 9th week, as this is when we have
no knowledge of the subject's "state". In Stata -stset- 
terminology, the record for this subject reads:

     _t0 = 0, _t = 9, _d = 0, _st = 1

That is, the subject entered the study at time _t0 = 0 and exited 
the study at time _t = 9 with status _d = 0 (indicating censored).
(The last -stset- value, _st = 1, means the record is to be used.)

If the subject experiences the event of interest (a "failure" in
survival terminology), _t is set to the week the failure was observed
and _d is set to 1, indicating a failure; the subject is no longer
followed. So if the subject is observed as a failure in week 11, 
the -stset- record reads:

    _t0 = 0, _t = 11, _d = 1, _st = 1

where _d = 1 indicates failure.

The last possible record type is a special case of the first one...

If a subject appears for all 30 weeks and is not observed to fail,
the subject must be censored; my question is whether this censoring
should be marked in the 30th week or in the (unobserved) 31st week?

Is the record:

    _t0 = 0, _t = 30, _d = 0, _st = 1

or

    _t0 = 0, _t = 31, _d = 0, _st = 1

This choice affects the calculated statistics, so it matters.

[Clearly, if I ask this question, I also must ask whether I'm marking
censoring during the study correctly. Was the first example (above)
censored at week 9 or week 8?]
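
To make the two candidate conventions concrete, here is a minimal sketch in
Python (not Stata). The helper make_record and its censor_offset argument are
hypothetical names introduced only for illustration; the sketch assumes a
failure is always recorded at the week it was observed, as described above,
and it does not say which convention is correct.

```python
def make_record(week, failed, censor_offset):
    """Build an stset-style record for one subject.

    week          -- week of the observed failure, or last week the subject was examined
    failed        -- True if the event was observed at `week`
    censor_offset -- 0: censor at the last examined week; 1: censor at the following week
    """
    # Failures are recorded at the week observed; only censoring times shift.
    t = week if failed else week + censor_offset
    return {"_t0": 0, "_t": t, "_d": int(failed), "_st": 1}


STUDY_WEEKS = 30  # length of follow-up stated in the post

for offset in (0, 1):
    dropout = make_record(8, False, offset)               # last examined in week 8
    completer = make_record(STUDY_WEEKS, False, offset)   # examined in all 30 weeks
    print(offset, dropout["_t"], completer["_t"])
# prints "0 8 30" then "1 9 31"
```

Under this framing the two questions are linked: a rule that censors the
week-8 dropout at 9 puts the 30-week completer at 31, and a rule that censors
the dropout at 8 puts the completer at 30.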

These questions arose, in part, because I used -stci- to compute
the (restricted) mean survival time (via -stci , rmean-) for a 
treatment group in which no failures occurred. I had set
_t = 31 for subjects censored at the end of the study and observed 
that the (restricted) mean survival time was 31 weeks... I had 
expected 30 weeks.

Changing to _t = 30 gets a mean survival time of 30 weeks. So, which
is correct?
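
The 31-versus-30 result can be reproduced with a small Python sketch (not
Stata). It assumes the restricted mean is the area under the Kaplan-Meier
step function from time 0 to the largest observed time, which is how a
restricted mean is usually defined; the function names are hypothetical and
this is not a reimplementation of -stci-.

```python
def km_curve(records):
    """Kaplan-Meier steps from (t, d) pairs: a list of (failure time, survival)."""
    surv = 1.0
    steps = []
    for t in sorted({t for t, d in records if d == 1}):
        at_risk = sum(1 for u, _ in records if u >= t)
        failed = sum(1 for u, d in records if u == t and d == 1)
        surv *= 1.0 - failed / at_risk
        steps.append((t, surv))
    return steps


def restricted_mean(records):
    """Area under the KM step function from 0 to the largest observed time."""
    t_max = max(t for t, _ in records)
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_curve(records):
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (t_max - prev_t)  # final piece up to the last observed time
    return area


# Ten subjects, no failures, all censored at the end of the study.
print(restricted_mean([(31, 0)] * 10))  # 31.0
print(restricted_mean([(30, 0)] * 10))  # 30.0
```

With no failures the estimated survival stays at 1 throughout, so the area is
simply the largest value of _t. Under that assumption the reported mean
reflects the censoring convention directly rather than anything about the
failure process.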

As a related note, the median survival time (via -stci-) was
undefined. My initial thought is that since all subjects survived
the full 30 weeks, the median should have been 30. I suspect the
undefined result occurs because the Kaplan-Meier product-limit
estimate of the survival function (the underlying source for these 
percentiles) is undefined... But I'm not convinced it should be. 
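
One possible reading of the undefined median, sketched in Python (not Stata):
if the median is taken as the first time the Kaplan-Meier estimate falls to
0.5 or below, then with no failures the estimate never leaves 1 and no such
time exists. The function km_median is a hypothetical name, and the
definition of the median used here is an assumption, not a statement of what
-stci- does internally.

```python
def km_median(records):
    """Smallest failure time at which the KM survival estimate is <= 0.5.

    records -- list of (t, d) pairs, d = 1 for failure and 0 for censored.
    Returns None when the estimate never reaches 0.5.
    """
    surv = 1.0
    for t in sorted({t for t, d in records if d == 1}):
        at_risk = sum(1 for u, _ in records if u >= t)
        failed = sum(1 for u, d in records if u == t and d == 1)
        surv *= 1.0 - failed / at_risk
        if surv <= 0.5:
            return t
    return None


print(km_median([(30, 0)] * 10))               # None: no failures, survival stays at 1
print(km_median([(5, 1), (8, 1), (30, 0)]))    # 8
```

On this reading, the data show only that the median exceeds the follow-up
period; they do not show that it equals 30.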

Any and all opinions on these topics will be appreciated.


Thomas J. Steichen
