Statalist The Stata Listserver



st: Comparing ROC curves after bootstrap optimism


From   "Colin Cooke" <crcooke@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Comparing ROC curves after bootstrap optimism
Date   Sat, 29 Apr 2006 15:27:29 -0700

I have generated a "new" predictive model (with multiple variables)
and am interested in comparing the ROC curve from my new model to one
generated using a single "old" variable (which is not incorporated in
my new model). I understand the downsides of stepwise variable
reduction in prediction models, but I have nonetheless chosen to use it.

I've bootstrapped the "optimism" (internal validation) in the area
under the ROC curve (AUC) from my new model with the following code:
gen freq = .
program define optimism, rclass
    version 9
    * draw a bootstrap sample of 1023 obs, stored as frequency weights
    bsample 1023, weight(freq)
    * refit the stepwise model on the bootstrap sample
    stepwise, pe(0.049) pr(0.05) lr : logit y x1 x2 x3 x4 x5 ///
        x6.....x15 [fw=freq]
    * AUC in the bootstrap sample
    lroc, nograph
    return scalar area1 = r(area)
    local a1 = r(area)
    * AUC on the full data using the model derived on the bootstrap sample
    predict p
    roctab y p
    return scalar area2 = r(area)
    local a2 = r(area)
    * optimism for this replication
    return scalar dif = `a1' - `a2'
    drop p
end
simulate area1=r(area1) area2=r(area2) dif=r(dif), reps(200) ///
    seed(1234): optimism
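
A minimal sketch of how the mean optimism could be read off once
simulate has finished, assuming the data in memory have been replaced
by one row per replication with the variables area1, area2 and dif
named in the simulate call. The scalar name optimism is only an
illustrative label, not something defined elsewhere in this post.

```stata
* assumes -simulate- has just run, so each row is one bootstrap replication
summarize dif

* r(mean) from -summarize- is the average of (area1 - area2) over replications
scalar optimism = r(mean)
display "mean optimism = " %5.3f scalar(optimism)
```

Because the simulated results replace the original data, the original
dataset would have to be reloaded before refitting the model on it.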

I then take the apparent AUC from the model fit to the original data
and subtract the bootstrap optimism from it:

stepwise, pe(0.049) pr(0.05)  lr : logit y x1 x2 x3 x4 x5 x6.....x15
lroc, nograph

The AUC for this equals 0.77 and the optimism estimated by the above
bootstrap is 0.03, so my optimism-corrected estimate is
0.77 - 0.03 = 0.74.
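
Written out, the correction described above is the apparent AUC minus
the average bootstrap optimism. The symbols below are labels introduced
here for illustration only; B = 200 is the number of replications, and
the two terms inside the sum correspond to area1 and area2 returned by
the program.

```latex
\[
\widehat{O} \;=\; \frac{1}{B}\sum_{b=1}^{B}
  \left( \mathrm{AUC}^{(b)}_{\text{boot}} - \mathrm{AUC}^{(b)}_{\text{full}} \right)
  \;\approx\; 0.03,
\qquad
\mathrm{AUC}_{\text{corrected}}
  \;=\; \mathrm{AUC}_{\text{apparent}} - \widehat{O}
  \;=\; 0.77 - 0.03 \;=\; 0.74
\]
```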

The AUC generated by using the alternative ("old") covariate on the
same data is 0.70.

I would like to statistically compare the two AUCs. The problem is
that, having adjusted the AUC after estimation (by subtracting the
bootstrap optimism), I can no longer apply the usual statistical tests
of comparison between the two models.

Does anyone know of a way to test for a difference between ROC curves
on the same data after the areas have been adjusted as they have been
here?

Thanks in advance.

-CC



