Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Steve Rothenberg" <drlead@prodigy.net.mx> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: RE: Many strata in Cox proportional hazard models |
Date | Fri, 15 Jul 2011 14:01:43 -0500 |
Given the very large range of age (49-94 years) in this mortality study, and the positive association between the exposure variable of interest and age, it seems reasonable to assume that the baseline hazard will vary with age. Specifying age strata allows a different baseline hazard for each age group while constraining the coefficients to be equal across strata.

Whether the proportional hazards assumption holds for this data set, especially among the different exposure groups, is a matter for diagnostic tests; it may hold for the different exposure groups or it may not. With nine 5-year age strata it is feasible to test the proportional hazards assumption for the exposure variable in each stratum. With 353 monthly age strata it is much more difficult, not to mention the many monthly strata with one or two subjects in which no one dies during the observation interval and the hazard function is flat.

I welcome comments on this issue, as well as on the two questions I raised in the first post of this thread.

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos
Mexico

Steve,

Also, I don't think you need to consider that the hazard varies with age. I think the proportional hazards model takes care of this.

David J Svendsgaard, PhD
Biostatistician
EPA/ORD/NCEA/RTP, Mail Drop B-243-01
Research Triangle Park, NC 27711
Phone (919) 541-4186
Fax (919) 541-1818

-----Original Message-----
From: Steve Rothenberg [mailto:drlead@prodigy.net.mx]
Sent: Friday, July 15, 2011 1:23 PM
To: 'statalist@hsphsun2.harvard.edu'
Subject: RE: Many strata in Cox proportional hazard models

All variables except date and cause of death and date of right-censoring were collected at baseline (entry to study). The models were built on the command line and no variables were identified as time varying.

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos
Mexico

Did you check the time-varying covariates box?
David J Svendsgaard, PhD
Biostatistician
EPA/ORD/NCEA/RTP, Mail Drop B-243-01
Research Triangle Park, NC 27711
Phone (919) 541-4186
Fax (919) 541-1818

-----Original Message-----
From: Steve Rothenberg [mailto:drlead@prodigy.net.mx]
Sent: Friday, July 15, 2011 12:55 PM
To: 'statalist@hsphsun2.harvard.edu'
Subject: Many strata in Cox proportional hazard models

I'm trying to diagnose the proportional hazards assumption (and run other diagnostics) on Cox PH models. I use a data set of mortality in 1154 subjects, with only baseline data measured at entry, and 185 failures during the ~8-year follow-up. The age range of the group at entry is 49 to 94 years, and there are three ordered categories of exposure to the variable of interest. All other independent variables in the model are dichotomous.

Since I expect the baseline hazard to differ by age, I'm using age as the stratification variable in stratified estimation. A colleague has suggested I use monthly age strata. I obtain 353 monthly strata, each with 1 to 10 subjects and an average N per stratum of about 3. I've tried an alternative division into 5-year age periods: 9 strata in all, each with 5 to 282 subjects and an average N per stratum of about 125.

The models with the monthly age strata all return hazard ratios below 1 for the exposure categories, relative to the lowest (reference) exposure category. The models with the 5-year age strata all return hazard ratios above 1 for the exposure categories. Fit measures such as AIC and BIC are far better in the monthly stratified model than in the 5-year stratified model. The concordance index (Harrell's C) is .602 for the 5-year age strata model and .615 for the monthly age strata model.

I've checked the proportional hazards assumption for the exposure variables on each stratum of the 5-year strata model (along with other diagnostics) and get good compliance on 7 of the 9 strata.
I don't know where and how to begin checking the PH assumption on the monthly age strata model, due to the large number of strata and the fact that there are so many strata with just 1 or 2 subjects.

I suspect overkill with monthly strata but wonder:

Question 1: Are there statistical drawbacks (other than diagnostic) to using so many strata for survival models, especially with many singleton strata?

Question 2: Can anyone suggest available literature that discusses this issue?

Steve Rothenberg
National Institute of Public Health
Cuernavaca, Morelos
Mexico

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
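
On Question 1 and the AIC/BIC comparison in the original post, a small sketch may help. In a stratified Cox model the partial likelihood is a product of per-stratum terms, and each event is compared only with subjects still at risk in the same stratum. A stratum holding a single subject therefore contributes a factor of 1 (a log contribution of 0) whatever the coefficient, and the log partial likelihood moves toward zero as strata get finer even when the covariate has no effect at all. The Python below is only an illustration under stated assumptions: the six-subject data set, the single covariate `x`, and the function name `stratified_log_partial_likelihood` are hypothetical and are not taken from the study or from Stata.

```python
import math
from collections import defaultdict


def stratified_log_partial_likelihood(rows, beta):
    """Log partial likelihood of a stratified Cox model with one covariate.

    rows: iterable of (stratum, time, event, x) tuples (hypothetical layout).
    Risk sets are formed within each stratum only. If event times tie,
    this is the Breslow form of the partial likelihood.
    """
    by_stratum = defaultdict(list)
    for stratum, time, event, x in rows:
        by_stratum[stratum].append((time, event, x))

    total = 0.0
    for subjects in by_stratum.values():
        for time_i, event_i, x_i in subjects:
            if not event_i:
                continue  # censored subjects only appear in risk sets
            # Risk set: subjects in the SAME stratum still observed at time_i.
            denom = sum(math.exp(beta * x_j)
                        for time_j, _, x_j in subjects if time_j >= time_i)
            total += beta * x_i - math.log(denom)
    return total


# Hypothetical toy data: (time, event, x); event = 1 is a death.
toy = [(2.0, 1, 1), (3.0, 0, 0), (5.0, 1, 0),
       (6.0, 1, 1), (7.0, 0, 1), (8.0, 1, 0)]

one_stratum = [("all", t, e, x) for t, e, x in toy]
two_strata = [("early" if i < 3 else "late", t, e, x)
              for i, (t, e, x) in enumerate(toy)]
singletons = [(i, t, e, x) for i, (t, e, x) in enumerate(toy)]

# With beta = 0 the covariate has no effect, yet the log partial
# likelihood still rises toward 0 as the strata get finer.
ll_one = stratified_log_partial_likelihood(one_stratum, 0.0)   # -log(72)
ll_two = stratified_log_partial_likelihood(two_strata, 0.0)    # -log(9)
ll_single = stratified_log_partial_likelihood(singletons, 0.0)  # 0

# Singleton strata carry no information about beta: the value stays
# at 0 (up to rounding) for any coefficient.
ll_single_beta = stratified_log_partial_likelihood(singletons, 0.7)

print(ll_one, ll_two, ll_single, ll_single_beta)
```

Because AIC and BIC are built from this log partial likelihood, values from models with different stratifications are computed over different risk sets. That may be one reason to be cautious about reading the better AIC/BIC of the monthly model as evidence of better fit. The sketch also suggests that subjects in singleton strata, and in strata with no deaths, add nothing to the estimate of the exposure effect.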