Re: st: ROC-curves

From	"Roger B. Newson" <[email protected]>
To	[email protected]
Subject	Re: st: ROC-curves
Date	Mon, 21 Oct 2013 22:08:40 +0100

The main problem with confidence intervals for the area under a ROC generated from a logistic regression is that, if you estimate your ROC from the same data in which you fitted your logistic regression model, then you will probably be over-optimistic, as the parameters have been chosen to fit specifically that set of data. If you want your ROC area to have confidence limits which you can really be confident about, then it is a good idea to randomize your data into a training set and a test set, to fit your logistic model to the training set, and to estimate its ROC area using out-of-sample prediction in the test set.
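
As a minimal sketch of the split-sample approach described above, the commands below use the low-birthweight example dataset (-webuse lbw-) in place of your own data. The choice of predictors (age, lwt, smoke), the seed, and the 50/50 split are illustrative assumptions, not part of the original question. The model is fitted in the training half only, and -roctab- then reports the ROC area with its confidence interval from the out-of-sample predictions in the test half.

* Start Stata commands *
* Example data (assumption: the lbw dataset stands in for your own data)
webuse lbw, clear
* Randomize observations into a training set and a test set
set seed 20131021
generate byte train = runiform() < 0.5
* Fit the logistic model in the training set only
logistic low age lwt smoke if train
* Predicted probabilities, computed only for the test set (out-of-sample)
predict pred_oos if !train
* ROC area with standard error and 95% CI, test set only
roctab low pred_oos if !train
* End Stata commands *

With a small dataset the test-set interval may be wide, since only part of the data is used to estimate the ROC area.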

Newson (2010) discusses these issues with Cox regression and other survival models. As stated in the first paragraph of Section 5 of this reference, the procedure with non-survival models (like logistic regression) is similar, but simpler.


I hope this helps.

Best wishes

Roger

References

Newson RB. Comparing the predictive power of survival models using Harrell’s c or Somers’ D. The Stata Journal 2010; 10(3): 339–358. Download from

http://www.stata-journal.com/article.html?article=st0198

Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

On 18/10/2013 09:23, Seed, Paul wrote:

On 14/10/2013 18:54, Ragnhild Bergene Skråstad wrote:
  > Hi!
  > I am investigating how different tests, in combination, can predict a given outcome.
  >
  > I have fitted a logistic model with the command "logistic" and plotted the ROC curve with the command "lroc". This gave me the ROC curve and the AUC. I wonder:
  > - How can I get the 95% CI for this AUC?
  > and
  > - I would like to get the sensitivity at a given fixed false-positive rate. Do I have to get all the coordinates on the ROC curve and identify the one at the FPR of interest, and if so, how do I do that? Or is there a direct way to do this?
  > Best wishes
  > Ragnhild B Skråstad

The simplest way to get a CI for the area under a ROC curve following logistic regression is to use -predict- and -roctab-:

* Start Stata commands *
* Fit the model
logistic outcome <predictors>
* Save the predicted probabilities
capture drop pred
predict pred
* ROC area with its 95% CI
roctab outcome pred
* End Stata commands *

* outcome and <predictors> are replaced as appropriate.
Much quicker and less trouble than bootstrapping.

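To find the appropriate cutpoint for a given sensitivity you can use -centile- with -if-, taking centiles of pred among the cases:
centile pred if outcome == 1, centile(90)
Likewise for specificity, among the non-cases:
centile pred if outcome == 0, centile(10)
Note that if a prediction above the cutpoint counts as test-positive, then sensitivity is the proportion of cases above the cutpoint. The 90th centile among cases therefore corresponds to a sensitivity of about 10% (use centile(10) for about 90%), and the 10th centile among non-cases corresponds to a specificity of about 10% (use centile(90) for about 90%).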
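
A sketch of how the sensitivity at a fixed false-positive rate might be read off directly. It assumes that pred holds the predicted probabilities from -predict-, that outcome is coded 0/1, and that a prediction above the cutpoint counts as test-positive. The 10% false-positive rate and the local macro names cut and tp are example choices only. The first command lists sensitivity and specificity at every cutpoint, for those who prefer the full table.

* Start Stata commands *
* Full table of cutpoints with sensitivity and specificity
roctab outcome pred, detail
* Cutpoint exceeded by about 10% of non-cases (false-positive rate about 10%)
centile pred if outcome == 0, centile(90)
local cut = r(c_1)
* Number of cases with a prediction above that cutpoint
* (missing values count as large in Stata, so they are excluded explicitly)
count if outcome == 1 & pred > `cut' & !missing(pred)
local tp = r(N)
* Sensitivity = that number divided by the number of cases with a prediction
count if outcome == 1 & !missing(pred)
display "Sensitivity at about 10% FPR: " `tp'/r(N)
* End Stata commands *

This gives a point estimate only, and it is subject to the same over-optimism if pred was predicted in the data used to fit the model.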

Best wishes,

Paul T Seed, Women's Health, KCL






*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

