
From: "Austin Nichols" <[email protected]>
To: [email protected]
Subject: Re: st: impact of bootstrapping on predicted probabilities of logistic regression models
Date: Fri, 3 Oct 2008 13:16:01 -0400

G. ter Riet <[email protected]>:
How about something like

    webuse nhanes2, clear
    ren sex v1
    g v2=race==2 if !mi(race)
    g v3=region==3 if !mi(race)
    ren diabetes y
    egen c=group(strata psu)
    cap pr drop rocb
    program rocb, rclass
     logit `1' `2' `3' `4' [pw=finalwgt]
     predict p
     roctab `1' p
     ret scalar rocb = r(area)
     levelsof g, loc(gs)
     foreach i of loc gs {
      su p if g==`i'
      ret scalar p`i'=r(mean)
     }
     drop p
     eret clear
    end
    egen g=group(v1 v2 v3), label
    levelsof g, loc(gs)
    foreach i of loc gs {
     loc p`i' "`:label (g) `i''"
    }
    loc r "r(rocb) r(p1) r(p2) r(p3) r(p4) r(p5) r(p6) r(p7) r(p8)"
    bs `r', reps(200) strat(g) sav(p) cl(c): rocb y v1 v2 v3
    use p, clear
    forv i=1/8 {
     la var _bs_`=`i'+1' "`p`i''"
    }
    d
    su

On Fri, Oct 3, 2008 at 10:02 AM, G. ter Riet <[email protected]> wrote:
> Dear Statalisters,
>
> In medical research on prediction or diagnosis, we often use the bootstrap to calculate confidence intervals for an area under the curve, AUC, corresponding to a particular logistic regression model that is used for prediction of an event (e.g. death or some target illness). The AUC is a global measure for how well the model discriminates between those with and without an event. A program that does this might look as follows:
>
> capture program drop rocb
> program rocb, rclass
>   logit `1' `2' `3' `4'
>   predict p
>   roctab `1' p
>   drop p
>   return scalar rocb = r(area)
> end
> bootstrap auc=r(rocb), reps(200): rocb depvar indepvar1 indepvar2 indepvar3
>
> What I should like to do, however, is to give readers of my paper an impression of the impact of bootstrapping, not just on the AUC, but on the distribution of predicted probabilities calculated from the logistic model since most clinicians are not that comfortable with AUCs of a ROC curve. Suppose my indepvars are all binary. Then I'd have 2^3=8 covariate patterns and potentially a unique predicted probability for each covariate pattern for each bootstrap sample. For each covariate pattern, I'd like to average the predicted probabilities across the 200 samples (and perhaps say something about their variability).
>
> My programming abilities of Stata are not good enough to solve this efficiently. Any help I'd greatly appreciate. Of course any comments on whether you think the whole idea is worthwhile are welcome too.
>
> Cheers,
> Gerben ter Riet (epidemiologist, AMC, Dept General Practice, Amsterdam, NL)

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
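
The question also asks about the variability of the predicted probabilities for each covariate pattern. One way to read that off the saved replicates is sketched below. It is only a sketch: it assumes it is run right after the labelling loop in the reply above, with the replicates still in memory as _bs_2 through _bs_9 (_bs_1 being the AUC), and the 2.5/97.5 percentile cutoffs are an illustrative choice, not something the reply specifies.

```stata
* Sketch only: assumes the bootstrap replicates are in memory as
* _bs_2 ... _bs_9, one per covariate pattern, labelled as in the reply.
forv i=1/8 {
 loc v _bs_`=`i'+1'
 * mean of the replicated predicted probability for this pattern
 qui su `v'
 loc m=r(mean)
 * percentile interval across replicates (cutoffs are an assumption)
 _pctile `v', p(2.5 97.5)
 di %9.4f `m' %9.4f r(r1) %9.4f r(r2) "  `: variable label `v''"
}
```

Each output line shows the mean, the 2.5th percentile and the 97.5th percentile of the replicates, followed by the covariate pattern's label.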
