Re: st: Comparing AUC for weighted samples in Stata?

From   Steve Samuels <>
Subject   Re: st: Comparing AUC for weighted samples in Stata?
Date   Fri, 13 Jan 2012 18:12:10 -0500

I also suggest adding the stratum variables as predictors in the original -logistic- models and, possibly, in -somersd- as well. This should help capture any reduction in standard errors attributable to the stratification.
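
One way this suggestion might look for the -logistic- step is sketched below. It is only a sketch: the variable names diag (0/1 diagnosis), score1 and score2 (sum scores), stratum (stratum identifier) and sampwt (sampling weight) are hypothetical and do not come from the thread.

```stata
* Hypothetical variable names: diag, score1, score2, stratum, sampwt
* Model 1: instrument 1 plus stratum indicators
logistic diag score1 i.stratum [pweight=sampwt]
predict double xb1, xb

* Model 2: instruments 1 and 2 plus stratum indicators
logistic diag score1 score2 i.stratum [pweight=sampwt]
predict double xb2, xb

* Harrell's c for each linear predictor, and their difference
somersd diag xb1 xb2 [pweight=sampwt], transf(c) tdist
lincom xb2 - xb1
```

Here the stratum indicators enter only the -logistic- models. How best to bring them into -somersd- itself is left open, as in the suggestion above.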


On Jan 13, 2012, at 6:02 AM, Roger B. Newson wrote:

The way to do this is to use -somersd- followed by -lincom-. As in:

somersd Y X1 X2 [pwei=sampwt], transf(c) tdist
lincom X1-X2

See Newson (2006) and Newson (2002) for examples of this.

If X1 and X2 are predictors from logistic regression models, then it is a good idea to fit the models to a training set and then estimate Harrell's c statistics and their difference in a test set. See Newson (2010).
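
A minimal sketch of such a training/test workflow follows. It assumes hypothetical variable names (diag for the 0/1 diagnosis, score1 and score2 for the sum scores, sampwt for the sampling weight) and a simple random half split, none of which come from the original question. With a stratified sample, one might prefer to split within strata.

```stata
* Hypothetical variable names: diag, score1, score2, sampwt
set seed 12345
generate byte train = runiform() < 0.5

* Fit both models on the training half only
logistic diag score1 [pweight=sampwt] if train
predict double p1 if !train, xb
logistic diag score1 score2 [pweight=sampwt] if train
predict double p2 if !train, xb

* Harrell's c for each predictor, and their difference, in the test half
somersd diag p1 p2 if !train [pweight=sampwt], transf(c) tdist
lincom p2 - p1
```

The -lincom- line gives an estimate, confidence interval and p-value for the difference between the two c statistics, which is one way of expressing the added value of the second instrument.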

I hope this helps.

Best wishes



Newson RB. Comparing the predictive power of survival models using Harrell's c or Somers' D. The Stata Journal 2010; 10(3): 339-358.

Newson R. Confidence intervals for rank statistics: Somers' D and extensions. The Stata Journal 2006; 6(3): 309-334.

Newson R. Parameters behind "nonparametric" statistics: Kendall's tau, Somers' D and median differences. The Stata Journal 2002; 2(1): 45-64.

Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322

Opinions expressed are those of the author, not of the institution.

On 13/01/2012 10:21, Stian Lydersen wrote:
> We have weighted samples from 4 strata of pre-school children. The
> strata sizes vary from 1095 to 194, and the sampling probabilities vary
> from 0.37 to 0.89, with the highest sampling probabilities in the strata
> with high risk of diagnosis of a mental health problem. (The diagnosis,
> a gold standard, is established through an interview).
> We investigate if the sum score from one or more questionnaires can be
> used as diagnostic screening instrument.
> We have been able to compute ROC, and an estimate and CI for AUC using
> the suggestion on
> In addition we want to do the following:
> Compare the AUC for sum scores from two separate screening instruments
> (on the same samples). Estimate, CI and p-value for difference?
> Added value: Assume we use screening instrument 2 in addition to
> screening instrument 1, for example estimate a logistic regression model
> with predictor 1 alone, and then with predictors 1 and 2. How much added value
> (in terms of AUC estimate, CI and p-value) does this give?
> Do you know where we can find info on how to do this, preferably in Stata?
> Regards, Stian Lydersen and Trude Hamre Sveen
> e-mail
