
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: ROC-curves


From   "Roger B. Newson" <r.newson@imperial.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: ROC-curves
Date   Mon, 21 Oct 2013 22:08:40 +0100

The main problem with confidence intervals for the area under a ROC generated from a logistic regression is that, if you estimate your ROC from the same data in which you fitted your logistic regression model, then you will probably be over-optimistic, as the parameters have been chosen to fit specifically that set of data. If you want your ROC area to have confidence limits which you can really be confident about, then it is a good idea to randomize your data into a training set and a test set, and to fit your logistic model to the training set, and to estimate its ROC area using out-of-sample prediction in the test set.

Newson (2010) discusses these issues with Cox regression and other survival models. As stated in the first paragraph of Section 5 of this reference, the procedure with non-survival models (like logistic regression) is similar, but simpler.
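
A minimal sketch of the split-sample procedure described above, assuming a binary variable outcome and predictors x1 and x2 (hypothetical names), an arbitrary seed, and a 50/50 split. It uses -roctab- for the test-set ROC area, which is a simple stand-in and not necessarily the method of the reference:

* Start Stata commands *
set seed 12345
* Randomly assign about half of the observations to the training set
generate byte train = runiform() < 0.5
* Fit the logistic model in the training set only
logistic outcome x1 x2 if train
* -predict- also computes out-of-sample predictions for the test set
predict pred_oos
* ROC area with confidence limits, estimated in the test set only
roctab outcome pred_oos if !train
* End Stata commands *

The confidence limits from the last command then refer to data that played no part in choosing the model parameters.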

I hope this helps.

Best wishes

Roger

References

Newson RB. Comparing the predictive power of survival models using Harrell’s c or Somers’ D. The Stata Journal 2010; 10(3): 339–358. Download from
http://www.stata-journal.com/article.html?article=st0198

Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

On 18/10/2013 09:23, Seed, Paul wrote:
On 14/10/2013 18:54, Ragnhild Bergene Skråstad wrote:
  > Hi!
  > I am investigating how different tests, in combination, can predict a given outcome.
  >
  > I have fitted a logistic model with the command "logistic" and plotted the ROC curve with the command "lroc". This gave me the ROC curve and the AUC. I wonder:
  > - how can I get the 95% CI for this AUC?
  > and
  > - I would like to get the sensitivity at a given fixed false-positive rate. Do I have to get all the coordinates on the ROC curve and identify the one at the FPR of interest, and if so, how do I do that, or is there a direct way to do this?
  > best wishes
  > Ragnhild B Skråstad

The simplest way to get a CI for the area under the ROC curve following logistic regression
is to use -predict- and -roctab-:

* Start Stata commands *
logistic outcome <predictors>
capture drop pred
predict pred
roctab outcome pred

* End Stata commands *

* outcome and <predictors> are replaced as appropriate.
Much quicker and less trouble than bootstrapping.

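To find the appropriate cutpoint for a given sensitivity you can use -centile- with -if-.
For example, for about 90% sensitivity (90% of cases at or above the cutpoint), take the 10th centile among cases:
centile pred if outcome == 1, centile(10)
Likewise for about 90% specificity (90% of non-cases below the cutpoint), take the 90th centile among non-cases:
centile pred if outcome == 0, centile(90)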
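
For the original question (sensitivity at a fixed false-positive rate), the same idea can be taken one step further. A minimal sketch, assuming pred holds the predicted probabilities from -predict-, that higher values count as test-positive, and that a 10% false-positive rate is the target (an illustrative value, not one from the thread); the scalar names cutpt and ncases are arbitrary:

* Start Stata commands *
* Cutpoint: 90th centile of pred among non-cases, so about 10% of non-cases lie above it
centile pred if outcome == 0, centile(90)
scalar cutpt = r(c_1)
* Sensitivity: proportion of cases with pred above that cutpoint
count if outcome == 1 & !missing(pred)
scalar ncases = r(N)
count if outcome == 1 & pred > cutpt & !missing(pred)
display "Sensitivity at about 10% FPR: " r(N)/ncases
* Alternatively, list sensitivity and specificity at every observed cutpoint
roctab outcome pred, detail
* End Stata commands *

With tied or sparse predicted values, the achieved false-positive rate may differ somewhat from 10%, so it is worth checking it against the -roctab, detail- listing.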

Best wishes,

Paul T Seed, Women's Health, KCL






*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


