Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Steve Rothenberg" <drlead@prodigy.net.mx> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: RE: Many strata in Cox proportional hazard models |
Date | Fri, 15 Jul 2011 13:23:09 -0500 |
All variables except date and cause of death and date of right-censoring were collected at baseline (entry to study). The models were built on the command line and no variables were identified as time varying.

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos
Mexico

Did you check the time-varying covariates box?

David J Svendsgaard, PhD
Biostatistician
EPA/ORD/NCEA/RTP, Mail Drop B-243-01
Research Triangle Park, NC 27711
Phone (919) 541-4186
Fax (919) 541-1818

-----Original Message-----
From: Steve Rothenberg [mailto:drlead@prodigy.net.mx]
Sent: Friday, July 15, 2011 12:55 PM
To: 'statalist@hsphsun2.harvard.edu'
Subject: Many strata in Cox proportional hazard models

I'm trying to diagnose the proportional hazards assumption (and other diagnostics) on Cox PH models. I use a data set of mortality in 1154 subjects, with only baseline data measured at entry, and 185 failures during the ~8-year follow-up. The age range of the group at entry is 49 to 94 years and there are three ordered categories of exposure to the variable of interest. All other independent variables in the model are dichotomous.

Since I expect baseline hazard to differ by age, I'm using age as the stratification variable in stratified estimation. A colleague has suggested I use monthly age strata. I obtain 353 monthly strata, each with from 1 to 10 subjects, an average N per stratum of ~3. I've tried an alternative strata division of 5-year age periods, 9 strata in all with from 5 to 282 subjects in each, an average N per stratum of ~125.

The models with the monthly age strata all return hazard ratios for the exposure variable below 1, relative to the lowest (reference) exposure category. The models with the 5-year age strata all return hazard ratios for the exposure variable higher than 1. Fit measures, such as AIC and BIC, are far better in the monthly stratified model than in the 5-year stratified model.
The concordance index (Harrell's C) is .602 for the 5-year age strata model and .615 for the monthly age strata model.

I've checked the proportional hazards assumption for the exposure variables within each stratum of the 5-year strata model (along with other diagnostics) and get good compliance in 7 of the 9 strata. I don't know where and how to begin checking the PH assumption on the monthly age strata model, due to the large number of strata and the fact that there are so many strata with just 1 or 2 subjects.

I suspect overkill with monthly strata but wonder:

Question 1: Are there statistical drawbacks (other than diagnostic) to using so many strata for survival models, especially with many singleton strata?

Question 2: Can anyone suggest available literature that discusses this issue?

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos
Mexico
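
One way to see why singleton strata are a concern: in a stratified Cox partial likelihood, each failure is compared only with the subjects still at risk in the same stratum, so a stratum holding a single subject contributes a factor of 1 whatever the coefficient is. The following is a minimal Python sketch of that arithmetic, not the Stata estimation itself. The helper `stratum_loglik` and the data are made up for illustration, and the sketch assumes one covariate and no tied failure times.

```python
import math

def stratum_loglik(beta, subjects):
    """Cox partial log-likelihood contribution of a single stratum.

    subjects: list of (time, failed, x) tuples (hypothetical layout).
    Assumes one covariate x and no tied failure times.
    """
    total = 0.0
    for t_i, failed_i, x_i in subjects:
        if not failed_i:
            continue  # censored subjects add no term of their own
        # Risk set: subjects in this stratum still under observation at t_i.
        risk = [x_j for t_j, _, x_j in subjects if t_j >= t_i]
        total += beta * x_i - math.log(sum(math.exp(beta * x_j) for x_j in risk))
    return total

# Made-up data for illustration only.
singleton = [(3.0, True, 1)]                 # one subject, who fails
pair = [(2.0, True, 1), (5.0, False, 0)]     # one failure, one censored

for beta in (-1.0, 0.0, 1.0):
    print(beta, stratum_loglik(beta, singleton), stratum_loglik(beta, pair))
```

With these made-up values the singleton stratum's contribution is zero (up to floating-point rounding) for every value of beta, while the two-subject stratum's contribution changes with beta. In this sketch, only strata in which a failure occurs while at least one other subject is still at risk carry information about the coefficient.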