# st: RE: Does my statistic for "net proportion of subjects with improved prediction" already exist?

From: "Daniel Waxman"
To:
Subject: st: RE: Does my statistic for "net proportion of subjects with improved prediction" already exist?
Date: Thu, 27 Sep 2007 17:53:22 -0400

```
representing 1 perfect reassignment of risk (new model assigns a higher
predicted probability to all those who have events and a lower probability
to all those who do not), to -1 for perfect mis-reassignment (the converse).

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Daniel Waxman
Sent: Thursday, September 27, 2007 12:31 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Does my statistic for "net proportion of subjects with improved
prediction" already exist?

Statalist,

I am studying the effect of adding a biomarker to an existing model and want
to describe the effect of that model vis-à-vis the number of subjects with
improved predictions in the “new model” vs. the “old model”.  While there is
an extensive literature on this topic, most of it divides the outcome into
risk categories (i.e. predicted risk of 0-5%, 5-10%, etc.), something that I
am not so interested in doing.

An intuitive way to look at this would be to look at the net number of
subjects who are assigned a higher predicted probability with the new model
among those with the outcome in question, plus the net number assigned a
lower probability among those who did not have the outcome.  The ratio of
this number to the total # of subjects would then be the proportion of
patients with improved predictions (and would range from zero to 1).  See
example below.

My question:  Did I just reinvent the wheel?  (e.g. is this equivalent to
some existing statistic?)  Does anybody see any logical problem with looking
at this as one measure of the effect of adding a predictor to an existing
model?

Thanks,
Daniel Waxman

**** example: (where zlog is continuous, zero is dichotomous, new_marker is
the dichotomous new marker, and there is no missing data) ***

. logistic outcome zlog zero
. predict p_old

. logistic outcome zlog zero new_marker
. predict p_new

. count if e(sample)
. gen N=r(N)

. egen number_up_outcome=total(p_new>p_old & outcome)
. egen number_down_outcome=total(p_new<p_old & outcome)

. egen number_up_no_outcome=total(p_new>p_old & !outcome)
. egen number_down_no_outcome=total(p_new<p_old & !outcome)

. gen net_proportion_improved=((number_up_outcome-number_down_outcome)+(number_down_no_outcome-number_up_no_outcome))/N

No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.5.488 / Virus Database: 269.13.32/1033 - Release Date: 9/27/2007
11:06 AM

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
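For readers who prefer a formula, the Stata example above can be written out as follows. This is only a sketch of what the posted commands appear to compute. The symbols are assumptions introduced here and do not come from the original message: y_i is the 0/1 outcome, p_i^old and p_i^new are the predicted probabilities from the two models, N is the number of subjects, and "NPI" is a made-up label for net_proportion_improved.

```latex
\mathrm{NPI}
  = \frac{1}{N}\Bigg[
      \sum_{i:\,y_i = 1}
        \Big( \mathbf{1}\big[p_i^{\mathrm{new}} > p_i^{\mathrm{old}}\big]
            - \mathbf{1}\big[p_i^{\mathrm{new}} < p_i^{\mathrm{old}}\big] \Big)
      + \sum_{i:\,y_i = 0}
        \Big( \mathbf{1}\big[p_i^{\mathrm{new}} < p_i^{\mathrm{old}}\big]
            - \mathbf{1}\big[p_i^{\mathrm{new}} > p_i^{\mathrm{old}}\big] \Big)
    \Bigg]

% Each subject contributes +1, 0, or -1 to the bracketed sum, so
-1 \le \mathrm{NPI} \le 1 .

% Illustrative numbers only (not data from the thread):
% N = 10; among 4 events, 3 move up and 1 moves down;
% among 6 non-events, 4 move down and 2 move up.
\mathrm{NPI} = \frac{(3 - 1) + (4 - 2)}{10} = 0.4
```

Written this way, the bounds match the follow-up at the top of the thread: the value is 1 when every subject with the event moves up and every subject without it moves down, and -1 in the converse case. Subjects whose predicted probability is identical under both models contribute zero.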