Statalist The Stata Listserver



RE: st: RE: Goodness of fit using Cox-snell residuals


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: Goodness of fit using Cox-snell residuals
Date   Thu, 1 Jun 2006 16:03:49 +0100

It just makes no sense to feed the Cox-Snell
residuals to -stset-. You already set up 
the survival problem using -date_visit-. 

Once you have the Cox-Snell residuals, there 
are various things you can usefully do with them, but 
feeding them to -stset- is not one of those
things. 

As -stset- is telling you, many of the residuals 
are zero or negative, so the operation makes no sense on that ground 
alone. 
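
A minimal sketch of how to check this directly, assuming -cs- is the variable created by -predict cs, csnell- in the quoted session. With -stset cs, failure(lrti)- every record enters at time 0, so records with cs at or below 0 would be the ones reported as "obs. end on or before enter()". If that reading is right, the two counts below should match the 1046 and 663 in the -stset- output.

```stata
* Sketch only: assumes -cs- exists from -predict cs, csnell- in the quoted session
summarize cs, detail

* Records that -stset- would treat as ending on or before entry at t = 0
* (missing values sort above every number, so they are not counted here)
count if cs <= 0

* Records with no residual at all, reported by -stset- as "event time missing"
count if cs >= .
```

Together these account for the drop from 29979 to 28270 observations (29979 - 663 - 1046 = 28270).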

Nick 
n.j.cox@durham.ac.uk 

Emelda Okiro
 
> Clarification
> I am using Stata 8.
> This is what my data look like:
> 
> Id   sex   date_visit   age   failure
> 1    0     04jun2004    28    0
> 1    0     12jun2004    28    0
> 1    0     18jun2004    28    0
> 1    0     16jul2004    29    0
> 1    0     13aug2004    30    0
> 2    0     01mar2002    0     0
> 2    0     27mar2002    1     0
> 2    0     15apr2002    2     0
> 2    0     18apr2002    2     1
> 2    0     29apr2002    2     0
> 
> The basic time scale is calendar time, declared in -stset-.
> origin() and scale() control the mapping from the basic time
> scale onto the time scale on which the analysis is performed.
> . 
> . stset date_visit, id(rsv) failure(lrti) enter(time date_origin)
> origin(time d(31jan2002)) exit(time date_exit) scale(1)
> 
>                 id:  rsv
>      failure event:  lrti != 0 & lrti < .
> obs. time interval:  (date_visit[_n-1], date_visit]
>  enter on or after:  time date_origin
>  exit on or before:  time date_exit
>     t for analysis:  (time-origin)
>             origin:  time d(31jan2002)
> 
> ------------------------------------------------------------------------------
>     29979  total obs.
>         0  exclusions
> ------------------------------------------------------------------------------
>     29979  obs. remaining, representing
>       469  subjects
>       952  failures in multiple failure-per-subject data
>    377180  total analysis time at risk, at risk from t =         0
>                              earliest observed entry t =         0
>                                   last observed exit t =      1177
> 
>  
> 
> . **** Checking the goodness of fit of the final model
> . * evaluated by using Cox-Snell residuals
> . * if the model fits the data well then the true cumulative hazard
> function conditional on the covariate vector should have an
> exponential distribution with a hazard rate of one
> . quietly xi: stcox i.currentagegrp sex i.siblings_un6 i.main_fuel
> i.hse_toilet i.babies_bor i.education i.family_children
> i.interaction_un6 i.siblingssch_un6 i.siblingsroom_ov6 i.female_sibs
> poor i.weaning i.job_desc, nohr mgale(mg)
> 
> . * compute cox-snell residuals
> . predict cs, csnell
> (663 missing values generated)
> 
> . * re-stset using the cs residuals as the time variable (look at
> the output)
> The missing values are truly missing, but -stset- is also omitting
> some other observations. Why? It is also assuming single-record,
> single-failure data, which is incorrect: as shown above, my data set
> has multiple records and multiple failures per subject.
> 
> . stset cs, failure(lrti)
> 
>      failure event:  lrti != 0 & lrti < .
> obs. time interval:  (0, cs]
>  exit on or before:  failure
> 
> ------------------------------------------------------------------------------
>     29979  total obs.
>       663  event time missing (cs>=.)                    PROBABLE ERROR
>      1046  obs. end on or before enter()
> ------------------------------------------------------------------------------
>     28270  obs. remaining, representing
>       925  failures in single record/single failure data
>       925  total analysis time at risk, at risk from t =         0
>                              earliest observed entry t =         0
>                                   last observed exit t =  .8936376
> 
> Does anyone know how cs residuals are computed in this kind
> of data, and how I can specify multiple-failure, multiple-record
> data when using cs residuals as the time variable?
