Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org** is already up and running.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Roger B. Newson" <r.newson@imperial.ac.uk> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Developing a Predictive Risk Equation from stcox survival analysis |

Date |
Wed, 19 Sep 2012 15:36:59 +0100 |

Best wishes Roger References

http://www.stata-journal.com/article.html?article=st0198 or download pre-publication draft from http://www.imperial.ac.uk/nhli/r.newson/papers.htm#papers_in_journals Roger B Newson BSc MSc DPhil Lecturer in Medical Statistics Respiratory Epidemiology and Public Health Group National Heart and Lung Institute Imperial College London Royal Brompton Campus Room 33, Emmanuel Kaye Building 1B Manresa Road London SW3 6LR UNITED KINGDOM Tel: +44 (0)20 7352 8121 ext 3381 Fax: +44 (0)20 7351 8322 Email: r.newson@imperial.ac.uk Web page: http://www.imperial.ac.uk/nhli/r.newson/ Departmental Web page: http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/ Opinions expressed are those of the author, not of the institution. On 19/09/2012 15:03, Steve Samuels wrote:

Tom Robinson:My problem is that the when I run an estat concordance on my model I get a higher Harrel's C than I do when I run roctab on my outcome and the risks I have calculated (using the development dataset still).This is not surprising behavior: -roctab- is for binary outcomes; it ignores censoring and time. Steve On Sep 18, 2012, at 9:13 PM, Phil Clayton wrote: After running your Cox model: predict double xbeta, xb predict double basesurv, basesurv Now you need to know the baseline survivor function at 5 years. The mistake you've made is that this number is actually the same for everyone. Try this: line basesurv _t, sort You just need the point on the curve where _t==5. Since the baseline survivor function only goes down with time, this point is the minimum basesurv when time is less than 5 years: sum basesurv if _t<5 scalar base5y=r(min) Finally you can calculate each patient's risk at 5 years by adjusting the baseline risk: gen risk5y=1 - base5y^exp(xbeta) You can of course avoid the use of the scalar, but the above code makes it a little clearer what you're doing. Here's the abbreviated version: sum basesurv if _t<5 gen risk5y=1 - r(min)^exp(xbeta) A word on precision - you're raising a small number (between 0 and 1) to the exponent of another number. Therefore minor problems with precision rapidly become very big ones. This is why I have suggested using double precision for the xbeta and basesurv variables. It's also very useful to centre your covariates, so that the baseline survivor function represents an "average" patient rather than a patient with extremely or impossibly low risk. See "Making baseline reasonable" in [ST] stcox postestimation. Phil On 19/09/2012, at 6:23 AM, Tom Robinson wrote:Hi, I am using stcox to develop a predictive risk model but am unsure about how to formulate the final equation. I am using Stata 12.1 I have independent variables that were collected by family physicians as part of routine care e.g. blood pressure, lipids, renal function, demographic variables, time since developing diabetes . These come from a single review and I am using this review date as onset. The outcome is new onset of end-stage renal failure which is collected from a range of national datasets (in New Zealand). I have developed a model using stcox which I'm happy with but need to turn this into a risk prediction equation for risk at 5 years after which I can use in a validation dataset. I have centered all the variables around their mean. What I have done so far is: (following Tangri N, Stevens LA, Griffith J, et al. A predictive model for progression of chronic kidney disease to kidney failure. JAMA. 2011;305(15):1553-1559.appendix) - use predict *newvar*, xb to calculate each individuals overall hazard coefficient - confirmed for myself that this is equivalent to the sum of each variable multiplied by its coefficient from the model - confirmed that a dummy individual X with all the independent variables set at 0 (in other words at the means) has a overall hazard of 0 (*newvar *) - used predict *newvar2*, basesurv to calculate the baseline survivals - set individual X _t to 5 years which is the time period I'm interested in predicting risk at. This individuals baseline survival is Y - Used this survival in the equation gen risk5yr=1-(Y)^exp(*newvar*) to calculate each persons risk of the event at 5 years My problem is that the when I run an estat concordance on my model I get a higher Harrel's C than I do when I run roctab on my outcome and the risks I have calculated (using the development dataset still). I have also run a calibration analysis on my calculated risks which is wildly wrong (the predicted risks in each decile are about half of the actual risks) Clearly I'm doing something wrong but I can't see what. Thanks for any advice -- *Tom Robinson* * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/* * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

* * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: Developing a Predictive Risk Equation from stcox survival analysis***From:*Tom Robinson <tomrobnz@gmail.com>

**References**:**st: Developing a Predictive Risk Equation from stcox survival analysis***From:*Tom Robinson <tomrobnz@gmail.com>

**Re: st: Developing a Predictive Risk Equation from stcox survival analysis***From:*Phil Clayton <philclayton@internode.on.net>

**Re: st: Developing a Predictive Risk Equation from stcox survival analysis***From:*Steve Samuels <sjsamuels@gmail.com>

- Prev by Date:
**st: Create Variable containing Generations per Household** - Next by Date:
**Re: st: Create Variable containing Generations per Household** - Previous by thread:
**Re: st: Developing a Predictive Risk Equation from stcox survival analysis** - Next by thread:
**Re: st: Developing a Predictive Risk Equation from stcox survival analysis** - Index(es):