
Stata’s suite for ROC analysis consists of: roctab, roccomp, rocfit, rocgold, rocreg, and rocregplot.

Stata’s roctab provides nonparametric estimation of the ROC curve, and produces Bamber and Hanley confidence intervals for the area under the ROC curve.

Stata’s roccomp provides tests of equality of ROC areas. It can estimate nonparametric and parametric binormal ROC curves.

rocfit fits maximum likelihood models for a single classifier, an indicator of the latent binormal variable for the true status.

rocgold tests the equality of each classifier's ROC area against that of a “gold standard” ROC curve and can adjust significance levels for multiple tests across classifiers via Sidak’s correction.

rocreg performs ROC regression; that is, it can adjust both sensitivity and specificity for prognostic factors such as age and gender. It is by far the most general of the ROC commands.

rocregplot draws ROC curves as modeled by rocreg. ROC curves may be drawn across covariate values, across classifiers, or both.

Let's see it work

Norton et al. (2000) examined a neonatal audiology study on hearing impairment. A hearing test was applied to children aged 30 to 53 months. It is believed that the classifier y1 (DPOAE 65 at 2kHz) becomes more accurate at older ages.

We use rocreg to fit a maximum likelihood model for this situation. The extra effect of current age on y1 when the child has hearing impairment is estimated by specifying roccov(). The control population effect of current age and gender of the child is estimated with the ctrlcov() option.

. webuse nnhs, clear
(Norton - neonatal audiology data)

. rocreg d y1, roccov(currage) ctrlcov(currage male) cluster(id) probit 
     ml nolog

Covariate control      : linear regression
Control variables      : currage male
Control standardization: normal
ROC method             : parametric               Link: probit

  Status     : d
  Classifiers: y1

  Classifier : y1
  Covariate control adjustment model:
                                 (Std. err. adjusted for 2,741 clusters in id)

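------------------------------------------------------------------------------
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     currage |    .494211   .2463657     2.01   0.045     .0113431     .977079
       _cons |  -15.00403   9.384911    -1.60   0.110    -33.39812    3.390058
-------------+----------------------------------------------------------------
       _cons |    8.49794   .5366836    15.83   0.000      7.44606    9.549821
-------------+----------------------------------------------------------------
     currage |  -.2032048   .0388917    -5.22   0.000     -.279431   -.1269785
        male |   .2369359   .2573664     0.92   0.357     -.267493    .7413648
       _cons |   -1.23534   1.487668    -0.83   0.406    -4.151116    1.680436
-------------+----------------------------------------------------------------
       _cons |   7.749156   .1113006    69.62   0.000     7.531011    7.967301
------------------------------------------------------------------------------

  Status    : d

  ROC Model :
                                 (Std. err. adjusted for 2,741 clusters in id)
------------------------------------------------------------------------------
             | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      i_cons |  -1.765608   1.105393    -1.60   0.110    -3.932138    .4009225
     currage |   .0581566   .0290177     2.00   0.045     .0012828    .1150303
      s_cons |   .9118864   .0586884    15.54   0.000     .7968593    1.026913
------------------------------------------------------------------------------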

The results show that current age has a borderline significant positive effect on the ROC curve (p-value = 0.045). We now use rocregplot to draw the ROC curves for ages of 40 and 50 months, adding some graph options to tidy the legend and place it inside the graph.

. rocregplot, at1(currage=40) at2(currage=50) legend(order(3 "reference" 1 "40 mos." 2 "50 mos.") 
     ring(0) rows(3) pos(5)) title("ROC, by age") xsize(4) ysize(4)

The graph indicates that the area under the curve (AUC) for 50 months is clearly larger than that for 40 months, and this can be formally verified by using testnl after rocreg; see [R] rocregplot for a related example.
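A rough numerical check of this reading can be made from the ROC model coefficients reported above. The sketch below assumes the usual binormal form for a probit-link ROC curve, ROC(u) = normal(a + b*invnormal(u)), with a = i_cons + currage*age and b = s_cons, for which AUC = normal(a/sqrt(1 + b^2)). That mapping of the rocreg coefficients is an assumption made for illustration; it is not stated in the output itself.

```stata
* Assumption: probit-link binormal ROC, ROC(u) = normal(a + b*invnormal(u)),
* with a = i_cons + currage*age and b = s_cons, so AUC = normal(a/sqrt(1 + b^2)).
* Coefficients are copied by hand from the ROC model table.
local b = .9118864
foreach age in 40 50 {
    local a = -1.765608 + .0581566*`age'
    display "age `age' months: AUC = " %6.4f normal(`a'/sqrt(1 + `b'^2))
}
```

Under that assumption the AUC comes out near 0.66 at 40 months and near 0.80 at 50 months. This is only a point calculation from rounded coefficients; it carries no standard error and does not replace the testnl comparison.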

Two other classifiers were examined in the study, y2 (TEOAE 80 at 2kHz) and y3 (ABR). We will use rocgold to compare the ROC areas of y2 and y3 with that of y1 (DPOAE 65 at 2kHz), which we treat as the “gold standard” classifier. The sidak option provides adjusted p-values, reflecting the two tests that are being performed.

. rocgold d y1 y2 y3, sidak graph summary aspectratio(1)

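-------------------------------------------------------------------------------
                       ROC                                            Sidak
                      area   Std. err.      chi2    df   Pr>chi2    Pr>chi2
-------------------------------------------------------------------------------
y1 (standard)       0.6306      0.0240
y2                  0.6006      0.0250    2.0759     1    0.1496     0.2769
y3                  0.6081      0.0259    0.4931     1    0.4826     0.7323
-------------------------------------------------------------------------------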

We cannot reject the hypotheses that y2 and y3 have the same area as y1. Both the adjusted and unadjusted p-values support this.
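The Sidak-adjusted p-values in the rocgold output can be checked by hand. The sketch below assumes the standard Sidak formula for k comparisons, p_adj = 1 - (1 - p)^k, with k = 2 here, and uses the unadjusted p-values as printed (rounded to four decimals).

```stata
* Sidak adjustment for k = 2 comparisons: p_adj = 1 - (1 - p)^k
* Inputs are the rounded Pr>chi2 values from the rocgold table.
display "y2: " %6.4f 1 - (1 - 0.1496)^2
display "y3: " %6.4f 1 - (1 - 0.4826)^2
```

The results agree with the reported 0.2769 and 0.7323 to within about 0.0001; the small gap for y2 is consistent with rocgold working from the unrounded p-value.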

Wieand et al. (1989) examined a pancreatic cancer study. No covariates were recorded, and the study was a case–control study.

We use rocreg to estimate the ROC curve for the classifier y2 (CA 125) that was examined. A nonparametric estimate is used, and we bootstrap to obtain standard errors. Both roc() and pauc() take 1-specificity as their argument. We therefore estimate the sensitivity at a specificity of .6 by specifying roc(.4). The partial area under the curve (pAUC), the area under the ROC curve up to a given 1-specificity value, is estimated at a specificity of .4 by specifying pauc(.6). The case–control sampling of the study is indicated to rocreg via the bootcc option.

. use https://research.fredhutch.org/content/dam/stripe/diagnostic-biomarkers-statistical-center/files/wiedat2b.dta, clear
(S. Wieand - Pancreatic cancer diagnostic marker data)

. rocreg d y2, roc(.4) pauc(.6) bseed(8378923) bootcc nodots

Bootstrap results

Number of strata = 2                            Number of obs     =        141
                                                Replications      =      1,000

Nonparametric ROC estimation

Control standardization: empirical
ROC method             : empirical

ROC curve

   Status    : d
   Classifier: y2

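------------------------------------------------------------------------------
             |    Observed               Bootstrap
         ROC | coefficient       Bias    std. err.      [95% conf. interval]
-------------+----------------------------------------------------------------
          .4 |    .7555556  -.0118111    .0767123     .6052022   .9059089  (N)
             |                                        .5666667   .8666667  (P)
             |                                        .5555556   .8555555 (BC)
------------------------------------------------------------------------------

Partial area under the ROC curve

   Status    : d
   Classifier: y2

------------------------------------------------------------------------------
             |    Observed               Bootstrap
        pAUC | coefficient       Bias    std. err.      [95% conf. interval]
-------------+----------------------------------------------------------------
          .6 |    .3326797   .0033456    .0393666     .2555227   .4098368  (N)
             |                                        .2583878   .4101961  (P)
             |                                        .2419608   .3976471 (BC)
------------------------------------------------------------------------------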
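To make the two estimated quantities concrete, here is a minimal sketch on made-up toy data (not the Wieand data). It assumes the placement-value view of an empirical ROC curve: for each case, the placement value is the share of controls scoring higher than that case; ROC(t) is then the share of cases with placement value at most t, and pAUC(t) is the mean of max(t - placement value, 0). The variable names pv, below, and excess are hypothetical, and this is an illustration of the definitions rather than a description of rocreg's internals.

```stata
* Toy data (hypothetical): 5 controls (d = 0) and 4 cases (d = 1)
clear
input byte d double y
0 1
0 2
0 3
0 4
0 5
1 3.5
1 4.5
1 6
1 2.5
end

* Placement value of each case: share of controls with a higher score
quietly count if d == 0
local n0 = r(N)
generate double pv = .
forvalues i = 1/`=_N' {
    if d[`i'] == 1 {
        quietly count if d == 0 & y > y[`i']
        local k = r(N)
        quietly replace pv = `k'/`n0' in `i'
    }
}

* ROC(.4): share of cases with pv <= .4;  pAUC(.6): mean of max(.6 - pv, 0)
generate byte below = (pv <= .4) if d == 1
generate double excess = max(.6 - pv, 0) if d == 1
summarize below excess
```

For this toy data the two means work out by hand to 0.75 for ROC(.4) and 0.30 for pAUC(.6).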

We can use rocregplot to see the ROC curve for y2 (CA 125). We also ask for a normal-based confidence band for the ROC value at the specificity of .6.

. rocregplot, plot1opts(msymbol(i)) legend(order(2 "reference" 1 "CA 125") 
     ring(0) rows(2) pos(5)) xsize(4) ysize(4) title("ROC, CA 125")
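The normal-based limits, marked (N) in the bootstrap output, can be reproduced from the observed coefficient and the bootstrap standard error. The sketch below assumes the usual construction, estimate plus or minus invnormal(.975) times the standard error, with the inputs copied by hand from the output.

```stata
* Normal-based 95% interval: estimate +/- invnormal(.975) * bootstrap std. err.
* ROC at 1-specificity = .4
display %9.7f .7555556 - invnormal(.975)*.0767123
display %9.7f .7555556 + invnormal(.975)*.0767123
* pAUC up to 1-specificity = .6
display %9.7f .3326797 - invnormal(.975)*.0393666
display %9.7f .3326797 + invnormal(.975)*.0393666
```

These should match the (N) rows to roughly six decimal places; the inputs are rounded, so the last digit can differ.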


Norton, S. J., M. P. Gorga, J. E. Widen, R. C. Folsom, Y. Sininger, B. Cone-Wesson, B. R. Vohr, K. Mascher, and K. Fletcher. 2000. Identification of neonatal hearing impairment: Evaluation of transient evoked otoacoustic emission, distortion product otoacoustic emission, and auditory brain stem response test performance. Ear and Hearing 21: 508–528.

Wieand, S., M. H. Gail, B. R. James, and K. L. James. 1989. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76: 585–592.

Pepe, M. S. 2003. The Statistical Evaluation of Medical Tests for Classification and Prediction. New York: Oxford University Press.