# st: Retaining survivors in survival analysis?

From: "James Harris"
To:
Subject: st: Retaining survivors in survival analysis?
Date: Sun, 22 Aug 2004 13:03:26 +1000

Dear statalisters

I have a problem that probably stems from a basic lack of understanding of exactly how -stset- expects data to be structured. I hope one of you is able to help...

I am looking at survival after coronary artery bypass surgery (CABG).

The input data have one record for each surgery event (CABG) and one for each death. People are uniquely identified by "mast_enc". In this example, rows 15 and 16 are for a woman who had a CABG in April 1990, when she was aged 68. She died in August 2000, aged 78.

. list mast_enc sex age Date seq ageBand IncidentCABG event in 10/20

+-----------------------------------------------------------------+
| mast_enc   sex   age        Date     ageBand   Incide~G   event |
|-----------------------------------------------------------------|
15. | ........     F    68   03apr1990       65-74          1    CABG |
16. | ........     F    78   19aug2000       75-84          0   Death |
+-----------------------------------------------------------------+

The dataset is complete between 1 January 1988 and 31 December 2000. IncidentCABG marks the first surgery event (people can have multiple surgery events). I therefore declare this as survival time data with

. stset Date, id(mast_enc) failure(event==2) enter(time d(1jan1988)) origin(IncidentCABG==1)  exit(time d(1jan2001)) scale(365.25)

id:  mast_enc
failure event:  event == 2
obs. time interval:  (Date[_n-1], Date]
enter on or after:  time d(1jan1988)
exit on or before:  time d(1jan2001)
t for analysis:  (time-origin)/365.25
origin:  IncidentCABG==1

------------------------------------------------------------------------------
355412  total obs.
330154  ignored because never entered
7  obs. end on or before enter()
17971  obs. end on or before origin()
3916  obs. begin on or after exit
------------------------------------------------------------------------------
3364  obs. remaining, representing
3325  subjects
3081  failures in single failure-per-subject data
12891.04  total analysis time at risk, at risk from t =         0
earliest observed entry t =         0
last observed exit t =  13.02396

Only records after the first CABG are marked with _st==1, i.e. as being included in the survival time analysis. Record 16 shows a span of 10.37 years from the date of surgery to the date of death.

. list mast_enc age Date ageBand IncidentCABG event  _st _d _origin _t _t0 in 10/20

+-------------------------------------------------------------------------------------+
| mast_enc age      Date ageBand Incide~G  event   _st   _d   _origin         _t  _t0 |
|-------------------------------------------------------------------------------------|
15. | ........  68 03apr1990   65-74       1    CABG     0    .     11050          .    . |
16. | ........  78 19aug2000   75-84       0   Death     1    1     11050   10.37974    0 |
+-------------------------------------------------------------------------------------+

The problem is that any survival analysis then excludes people who had one or more CABG surgery events but survived beyond the exit date.
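To make the exclusion concrete, here is a minimal toy reproduction on made-up data (not the real file). The variable names `id`, `evdate`, `event` and `incident` are stand-ins, and the coding 1 = CABG, 2 = Death is an assumption based on the failure(event==2) option above. It is written for a do-file, since it uses `///` line continuation.

```stata
* Toy reproduction with made-up values; assumed coding: 1 = CABG, 2 = Death
clear
input id event incident str9 datestr
1 1 1 "03apr1990"
1 2 0 "19aug2000"
2 1 1 "15jun1995"
end
generate evdate = date(datestr, "DMY")
format evdate %td

* Same options as the -stset- call in the post, applied to the toy variables
stset evdate, id(id) failure(event==2) origin(incident==1) ///
    enter(time d(1jan1988)) exit(time d(1jan2001)) scale(365.25)

* Subject 2 has only a CABG record, so no record follows the origin
list id evdate event _st _d _t0 _t, sepby(id)
```

In this sketch subject 2 should end up with no record flagged _st==1, which is the same pattern as the survivors in the real data.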

I presume the problem has something to do with my using event data rather than true time-span data. However, while I have tried using -snapspan- to convert this to time-span data, I have clearly missed some important point -- any advice gratefully received!
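One possible direction, offered only as a hedged sketch and not as a confirmed fix: since -stset- treats each record as the end of the interval (Date[_n-1], Date], a survivor may need one extra record at the end of follow-up so that an interval exists after the CABG. The example below uses made-up data; the coding 1 = CABG, 2 = Death is assumed, and code 3 ("censored") and the helper variable `died` are hypothetical.

```stata
* Sketch on made-up data; assumed coding: 1 = CABG, 2 = Death,
* plus a hypothetical new code 3 = censored at end of follow-up
clear
input mast_enc event IncidentCABG str9 datestr
1 1 1 "03apr1990"
1 2 0 "19aug2000"
2 1 1 "15jun1995"
end
generate Date = date(datestr, "DMY")
format Date %td
drop datestr

* Flag subjects who have a death record
bysort mast_enc: egen byte died = max(event == 2)

* Build one extra row per survivor, dated at the end of complete follow-up
preserve
keep if !died
bysort mast_enc (Date): keep if _n == _N
replace Date = d(31dec2000)
replace event = 3
replace IncidentCABG = 0
tempfile censor
save `censor'
restore
append using `censor'
drop died

* Re-declare the data with the same options as in the post
stset Date, id(mast_enc) failure(event==2) origin(IncidentCABG==1) ///
    enter(time d(1jan1988)) exit(time d(1jan2001)) scale(365.25)
sort mast_enc Date
list mast_enc Date event _st _d _t0 _t, sepby(mast_enc)
```

On the real data, other variables copied onto the extra row (for example age and ageBand) would be stale and would need updating or setting to missing.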

==========================================

James Harris
National Centre for Epidemiology and Population Health
Building 62
The Australian National University
CANBERRA ACT 0200 Australia
CRICOS Provider #00120C
