The Stata listserver

st: Retaining survivors in survival analysis?

From   "James Harris" <[email protected]>
To   <[email protected]>
Subject   st: Retaining survivors in survival analysis?
Date   Sun, 22 Aug 2004 13:03:26 +1000

Dear statalisters

I have a problem that probably stems from a basic lack of understanding of exactly how -stset- expects data to be constructed. I hope one of you is able to help...

I am looking at survival after coronary artery bypass surgery (CABG). 

The input data have one record for each surgery event (CABG) and one for each death. People are uniquely identified by "mast_enc". In this example, rows 15 and 16 are for a woman who had a CABG in April 1990, when she was aged 68. She died in August 2000, aged 78.

. list mast_enc sex age Date seq ageBand IncidentCABG event in 10/20

     | mast_enc   sex   age        Date     ageBand   Incide~G   event |
 15. | ........     F    68   03apr1990       65-74          1    CABG |
 16. | ........     F    78   19aug2000       75-84          0   Death |

The dataset is complete between 1 January 1988 and 31 December 2000. IncidentCABG marks the first surgery event (people can have multiple surgery events). I therefore declare this as survival time data with 

. stset Date, id(mast_enc) failure(event==2) enter(time d(1jan1988)) origin(IncidentCABG==1)  exit(time d(1jan2001)) scale(365.25) 

                id:  mast_enc
     failure event:  event == 2
obs. time interval:  (Date[_n-1], Date]
 enter on or after:  time d(1jan1988)
 exit on or before:  time d(1jan2001)
    t for analysis:  (time-origin)/365.25
            origin:  IncidentCABG==1

   355412  total obs.
   330154  ignored because never entered
        7  obs. end on or before enter()
    17971  obs. end on or before origin()
     3916  obs. begin on or after exit
     3364  obs. remaining, representing
     3325  subjects
     3081  failures in single failure-per-subject data
 12891.04  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =  13.02396

Only records after the first CABG are marked with _st==1, i.e. as being included in the survival time analysis. Record 16 shows a span of 10.37 years from the date of surgery to the date of death.

. list mast_enc age Date ageBand IncidentCABG event  _st _d _origin _t _t0 in 10/20

     | mast_enc age      Date ageBand Incide~G  event   _st   _d   _origin         _t  _t0 |
 15. | ........  68 03apr1990   65-74       1    CABG     0    .     11050          .    . |
 16. | ........  78 19aug2000   75-84       0   Death     1    1     11050   10.37974    0 |
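
The _origin and _t values in this listing can be checked by hand. The following is only a minimal sketch, using Stata's d() date literal with the two dates shown for records 15 and 16; it is not part of the original log.

```stata
* _origin is the numeric Stata date (days since 01jan1960) of the first CABG
display d(03apr1990)

* analysis time: days from surgery to death, divided by the scale(365.25)
* given in the -stset- call
display (d(19aug2000) - d(03apr1990)) / 365.25
```

The first line should print 11050, matching _origin. The second gives the span in years, which should be close to the roughly 10.4 years listed as _t for record 16.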

The problem is that any survival analysis based on this setup excludes people who had one or more CABG surgery events but survived until after the exit date.
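
To gauge how many people are affected, one could count the people who have no death record at all. This is a rough sketch only. It assumes the data are in memory as listed above and that event==2 codes death, as in the -stset- call. The variable names anydeath and oneper are hypothetical.

```stata
* flag every record of a person who has at least one death record
* (event==2 is the failure code used in the -stset- call)
bysort mast_enc: egen byte anydeath = max(event == 2)

* tag one record per person, so that people rather than records are counted
egen byte oneper = tag(mast_enc)

* anydeath==0 counts people with no death record anywhere in the file
tabulate anydeath if oneper
```

This counts everyone in the file, including people whose records fall outside the 1988-2000 window, so it gives only an upper bound on the number of survivors dropped from the analysis.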

I presume the problem has something to do with my using event data rather than true time-span data. However, while I have tried using -snapspan- to convert this to time-span data, I have clearly missed some important point -- any advice gratefully received!

James Harris
National Centre for Epidemiology and Population Health
Building 62
The Australian National University
CANBERRA ACT 0200 Australia
CRICOS Provider #00120C
