

From: Tom Robinson <tomrobnz@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Developing a Predictive Risk Equation from stcox survival analysis
Date: Wed, 19 Sep 2012 08:53:45 +1200

Hi,

I am using stcox to develop a predictive risk model but am unsure how to formulate the final equation. I am using Stata 12.1.

My independent variables were collected by family physicians as part of routine care, e.g. blood pressure, lipids, renal function, demographic variables, and time since developing diabetes. These come from a single review, and I am using this review date as onset. The outcome is new onset of end-stage renal failure, which is collected from a range of national datasets (in New Zealand).

I have developed a model using stcox which I'm happy with, but I need to turn it into a risk prediction equation for risk at 5 years, which I can then use in a validation dataset. I have centered all the variables around their means.

What I have done so far (following Tangri N, Stevens LA, Griffith J, et al. A predictive model for progression of chronic kidney disease to kidney failure. JAMA. 2011;305(15):1553-1559, appendix) is:

- used predict *newvar*, xb to calculate each individual's overall hazard coefficient
- confirmed for myself that this is equivalent to the sum of each variable multiplied by its coefficient from the model
- confirmed that a dummy individual X with all the independent variables set to 0 (in other words, at the means) has an overall hazard coefficient of 0 (*newvar*)
- used predict *newvar2*, basesurv to calculate the baseline survivals
- set individual X's _t to 5 years, which is the time period I'm interested in predicting risk at. This individual's baseline survival is Y
- used this survival in the equation gen risk5yr=1-(Y)^exp(*newvar*) to calculate each person's risk of the event at 5 years

My problem is that when I run estat concordance on my model I get a higher Harrell's C than I do when I run roctab on my outcome and the risks I have calculated (still using the development dataset). I have also run a calibration analysis on my calculated risks, which is wildly wrong (the predicted risks in each decile are about half of the actual risks).

Clearly I'm doing something wrong but I can't see what.

Thanks for any advice

--
*Tom Robinson*

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
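
The arithmetic behind the steps in the post can be sketched outside Stata as follows. This is a minimal illustration in Python, not the poster's code: the function names, the `times` and `base_surv` arrays (standing in for `_t` and the output of predict *newvar2*, basesurv), and all numeric values are hypothetical. It assumes the usual Cox relationship S(t|x) = S0(t)^exp(xb), and that the estimated baseline survivor function is a step function, so its value at 5 years is taken from the last observed time at or before 5.

```python
import math


def baseline_survival_at(times, base_surv, horizon):
    """Return the baseline survival at the latest observed time <= horizon.

    The estimated baseline survivor function only changes at observed times,
    so its value at `horizon` is the value at the last time not exceeding it.
    Returns 1.0 if no observed time is <= horizon.
    """
    eligible = [(t, s) for t, s in zip(times, base_surv) if t <= horizon]
    if not eligible:
        return 1.0
    # Pick the entry with the largest time at or before the horizon.
    return max(eligible, key=lambda pair: pair[0])[1]


def risk_at_horizon(xb, s0):
    """Predicted risk 1 - S0^exp(xb) for linear predictor xb."""
    return 1.0 - s0 ** math.exp(xb)


# Hypothetical values, for illustration only (times in years).
times = [1.0, 2.5, 4.0, 6.0]
base_surv = [0.99, 0.97, 0.95, 0.90]

s0_5 = baseline_survival_at(times, base_surv, 5.0)
risks = [risk_at_horizon(xb, s0_5) for xb in (-1.0, 0.0, 1.0)]
```

With centered covariates, an individual at the means has xb = 0, so their predicted risk is simply 1 minus the baseline survival at the horizon. Whether the way Y was obtained in the post matches this lookup is not something the sketch can settle; it only makes the intended calculation explicit.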

**Follow-Ups**:

- **Re: st: Developing a Predictive Risk Equation from stcox survival analysis**
  *From:* Phil Clayton <philclayton@internode.on.net>
