[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
Jeph Herrin <junk@spandrel.net> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: repeat: predict after -streg- |

Date |
Thu, 12 Jul 2007 14:33:03 -0400 |

Bill, Sam: This is exactly the insight I was missing. Of course the predicted median is high, my crude median is high. And yes, the model is not great, even though all of my covars are highly (P<0.0001) significant. Any other thoughts on how to handle this? We are creating hospital rankings based on 30-day readmission rates; the problem is that hospitals which discharge patients too soon may be losing them to mortality before they have a chance to be readmitted, so I want to account for the competing risk of death vs readmission. Unfortunately, we can only follow them for 30 days, during which only 22.5% are readmitted. The answer seems to be, I can get estimated median time to readmission for each hospital, adjusted for my covars (perhaps refined), but these will not translate to 30day readmission rates because my model does nothing to discriminate between those readmitted before 30 days and those readmitted after, it only explains incremental differences in time to failure. thanks, Jeph William Gould, StataCorp LP wrote:

Jeph Herrin <junk@spandrel.net> writes,

My data are time to readmission (failure) after hospitalization, with censoring on death: -> stset ftime, failure(radm)failure event: radm != 0 & radm < . obs. time interval: (0, ftime] exit on or before: failureIt's possible. Over the period 0 to 30, you had only 95,529 failures in------------------------------------------------------------------- 424787 total obs. 0 exclusions ------------------------------------------------------------------- 424787 obs. remaining, representing 95529 failures in single record/single failure data 1.07e+07 total analysis time at risk, at risk from t = 0 earliest observed entry t = 0 last observed exit t = 30 My model is . streg `mycovars' , d(lnormal) And my problem is: . predict median, median time . count if median<=30 30 So, though the data have 95,000+ failures in less than 30 days, my model predicts only 30 such failures. [...] Should I expect this?

424,787 subjects, which is 95,529/424,787 = 22.5%. Thus, the median survival

time in your data is greater than 30.

I agree with Jeph that the model obviously did not fit the subjects observed to fail well. That could just indicate that `mycovars' does not explain the variation in the data well (which is different from saying that the covariates

are not significiantly different from zero). For instance, imagine rather

than fitting

. streg `mycovars', d(lnnormal)

Jeph had fitted

. streg, d(lnnormal)

That would be an constant-only model. Typing

. predict median, median time

after that would result in the same prediction for everyone, and that prediction would most certainly be greater than 30.

So my advice is "accept the result" or "worry about specification" or "find better covariates". (Concerning "worry about specification", who says age is adequate and don't need to include an effect for age>40. Or include age^2. Or use fractional polynomials?)

Before Jeph takes my advice, however, he will want to make sure that he does not have a technical problem. So far, I have merely said that nothing Jeph typed looks like an error to me and given the output Jeph has shared, the results shown are consistent with there being no data processing errors.

First, Jeph should type

. stsum

and

. sts list

and verify that the output is consistent with what he knows about his data. After that, I would also type

. summarize _t if _d==0

and

. summarize _t if _d==1

Jeph should verify that the recorded analytic times for the censored and the failed are consistent with what he knows about his data.

Finally, I would type

. summarize median if _d==0

and

. summarize median if _d==1

I would certainly want to see that the predicted median times for the censored group are, on average, greater than for the failed group.

I am assumming here that most of the failed group failed before time 30

and that most of the censored were censored at time 30.

Like I said, I suspect that the only problem Jeph is lack of explanatory power. Among `mycovars', Jeph may have uncovered variable or variables that have a significant effect, perhaps even an important effect, one survival, but even so, there is still a lot left to explain.

-- Bill

wgould@stata.com

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/

* * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: repeat: predict after -streg-***From:*Steven Samuels <ssamuels@albany.edu>

**Re: st: repeat: predict after -streg-***From:*Steven Samuels <ssamuels@albany.edu>

**References**:**Re: st: repeat: predict after -streg-***From:*wgould@stata.com (William Gould, StataCorp LP)

- Prev by Date:
**Re: st: saving file** - Next by Date:
**st: Deleting objects in Stata 10 graphs** - Previous by thread:
**Re: st: repeat: predict after -streg-** - Next by thread:
**Re: st: repeat: predict after -streg-** - Index(es):

© Copyright 1996–2016 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |