
Re: st: impact of bootstrapping on predicted probabilities of logistic regression models

From	Maarten Buis <[email protected]>
To	[email protected]
Subject	Re: st: impact of bootstrapping on predicted probabilities of logistic regression models
Date	Fri, 3 Oct 2008 15:35:33 +0100 (BST)

I don't think it is a bad idea, but I don't have time to implement it.
The one thing that would worry me is that not all combinations of the
covariates are likely to be present in every bootstrap sample. This
becomes especially problematic if some combinations are rare, because
such rogue samples then become very common. A useful alternative would
be to look at -prchange-, which is part of the -spost- package (see
-findit spost-), and at -mfx- (see -help mfx-).
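
For what it's worth, here is a minimal sketch of how this could be set
up. It is not a tested implementation. It assumes three binary
predictors and reuses the placeholder names depvar and
indepvar1-indepvar3 from the question; the program name predpat, the
returned names p000-p111, and the file name bootprobs are made up for
illustration. The probabilities are computed from the coefficients
rather than with -predict-, so a covariate pattern that happens to be
absent from a bootstrap sample still gets a value.

```stata
capture program drop predpat
program predpat, rclass
    // `1' = binary outcome, `2'-`4' = three binary predictors (assumed)
    logit `1' `2' `3' `4'
    // One predicted probability per covariate pattern (2^3 = 8),
    // built from the estimated coefficients, not from the observations
    forvalues a = 0/1 {
        forvalues b = 0/1 {
            forvalues c = 0/1 {
                return scalar p`a'`b'`c' = invlogit(_b[_cons] + ///
                    `a'*_b[`2'] + `b'*_b[`3'] + `c'*_b[`4'])
            }
        }
    }
end

// Collect the 8 probabilities from each of 200 bootstrap samples
bootstrap p000=r(p000) p001=r(p001) p010=r(p010) p011=r(p011) ///
          p100=r(p100) p101=r(p101) p110=r(p110) p111=r(p111), ///
          reps(200) saving(bootprobs, replace): ///
          predpat depvar indepvar1 indepvar2 indepvar3

// Average and spread of each pattern's probability across replications
use bootprobs, clear
summarize p000-p111
centile p000-p111, centile(2.5 97.5)
```

Because the model contains main effects only, each probability is a
model-based value and does not require the pattern to occur in the
sample. A replication in which -logit- drops a predictor (for example
because of perfect prediction) may make the _b[] lookup fail, in which
case -bootstrap- would record that replication as failed, so it is
worth checking how many replications actually completed.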

-- Maarten

--- "G. ter Riet" <[email protected]> wrote:
> In medical research on prediction or diagnosis, we often use the
> bootstrap to calculate confidence intervals for an area under the
> curve, AUC, corresponding to a particular logistic regression model
> that is used for prediction of an event (e.g. death or some target
> illness). The AUC is a global measure for how well the model
> discriminates between those with and without an event. A program that
> does this might look as follows:
> 
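> capture program drop rocb
> program rocb, rclass
>     // `1' = outcome, `2'-`4' = predictors
>     logit `1' `2' `3' `4'
>     // temporary variable, dropped automatically when the program ends
>     tempvar p
>     predict `p'
>     roctab `1' `p'
>     return scalar rocb = r(area)
> end
> bootstrap auc=r(rocb), reps(200): rocb depvar indepvar1 indepvar2 indepvar3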
> 
> What I should like to do, however, is to give readers of my paper an
> impression of the impact of bootstrapping, not just on the AUC, but
> on the distribution of predicted probabilities calculated from the
> logistic model since most clinicians are not that comfortable with
> AUCs of an ROC curve. Suppose my indepvars are all binary. Then I'd
> have 2^3=8 covariate patterns and potentially a unique predicted
> probability for each covariate pattern for each bootstrap sample. For
> each covariate pattern, I'd like to average the predicted
> probabilities across the 200 samples (and perhaps say something about
> their variability).
> My Stata programming skills are not good enough to solve this
> efficiently. I'd greatly appreciate any help. Of course, any comments
> on whether you think the whole idea is worthwhile are welcome too.
> Cheers, Gerben ter Riet (epidemiologist, AMC, Dept General Practice,
> Amsterdam, NL)
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------


