RE: st: RE: weird ROC curve

From	Arnold Kester <[email protected]>
To	[email protected]
Subject	RE: st: RE: weird ROC curve
Date	09 Dec 2002 10:07:43 +0100

On Sat, 2002-12-07 at 03:37, Cleves, Mario A wrote:
> > Arnold Kester <mailto:[email protected]> wrote in response to my posting: 
> >  
> > > I don't agree, it's not my point whether the analysis is proper for
> > these data. Even when the data are 'pathological', which I doubt, the
> > ROC curve should be drawn as it is defined. And by definition, the curve
> > cannot be non-monotone. If the curve is drawn wrong in this case, how
> > can we be confident that it will be correct in other instances?
> 	I misunderstood Arnold's intent. I assumed that he was trying to make sense of his data and not trying to indicate that there was a bug in -roctab-. In response to that, I used Arnold's data and recalculated the sensitivity and specificity at every possible cut-point and plotted the results. The plot I produced was identical to that from -roctab-, indicating that Arnold's results were not caused by a bug in -roctab-. The problem is that in Arnold's data, as Joseph Coveney <[email protected]> pointed out, "the diagnostic test (slightly) predicts the diametric opposite of the truth"; thus, for Arnold's diagnostic test higher values are associated with lower risk, violating the assumption that "The rating or outcome of the diagnostic test must be at least ordinal with higher values indicating higher risk". 
> 	What is important is that the peculiar results that Arnold obtained were not caused by a bug in -roctab-, and thus users can be confident that -roctab- will generate the correct results in all instances where the assumptions are met. 
> 	Here is the code used to calculate the sensitivity and specificity at every possible cut-point in the data.

Mario,

This seems correct, except I'd like to see a monotone line. This could
be obtained by inserting a modified sort before the plot as I've
indicated below.
And may I suggest a much faster calculation?

Regards,
Arnold Kester

======================
insheet using roc_data.out, clear

** add one obs with larger x-value and missing y 

sort x
local Np=_N+1
set obs `Np'
replace x=x[`Np'-1]+1 in l

** variable for reverse sorting on x

gen minus=-x

** count positives and negatives for each x

gen neg=y==0
gen pos=y==1

** sort and aggregate for distinct x-values

collapse (sum) neg pos, by(minus)

** cumulate these (rows are in descending x) to get number at or above each cutoff

gen cum0=sum(neg)
gen cum1=sum(pos)

** standardize and plot

gen sens=cum1/cum1[_N]
gen onesp=cum0/cum0[_N]
gra sens onesp, c(l) s(i)
==================================
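
The same steps can be tried on a small made-up dataset. The following is a minimal sketch, not part of the original analysis: the five (x, y) pairs are invented for illustration, and the final -assert- is an added check that neither coordinate decreases from one row to the next.

```stata
* toy data, invented for illustration only
clear
input x y
1 0
2 1
2 0
3 1
4 1
end

* extra observation above the largest x, y missing: it supplies the (0,0) corner
sort x
local Np = _N + 1
set obs `Np'
replace x = x[`Np'-1] + 1 in l

* one row per distinct x, in descending order of x
gen minus = -x
gen neg = y==0
gen pos = y==1
collapse (sum) neg pos, by(minus)

* running totals = number of negatives/positives at or above each cutoff
gen cum0 = sum(neg)
gen cum1 = sum(pos)
gen sens  = cum1/cum1[_N]
gen onesp = cum0/cum0[_N]

* added check: both coordinates are non-decreasing row by row
assert sens >= sens[_n-1] & onesp >= onesp[_n-1] if _n > 1
list minus cum1 cum0 sens onesp
```

With these five observations the listed points should run (0,0), (0,1/3), (0,2/3), (.5,1), (1,1) in (onesp, sens), so the connected line can only move up or to the right.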
  

> 	insheet using roc_data.out, names clear
> 	count if y==0
> 	local y0=r(N)
> 	count if y==1
> 	local y1=r(N)
> 	sort x
> 	gen sens=.
> 	gen spec=.
> 	local N=_N
> 	local i 1
> 	while `i'<=`N' {
> 		tempvar g
> 		gen `g'=0
> 		replace `g'=1 if x>x[`i']
> 		count if `g'==0 & y==0
> 		local sp=r(N)/`y0'
> 		count if `g'==1 & y==1
> 		local se=r(N)/`y1'
> 		replace spec=`sp' in `i'
> 		replace sens=`se' in `i'
> 		drop `g'
> 		local i= `i'+1
> 	noi di in red `i'
> 	}
> 	local nN=`N' +2
> 	set obs `nN'
> 	local pu=_N-1
> 	replace spec=0 in `pu'
> 	replace sens=1 in `pu'
> 	replace spec=1 in l
> 	replace sens=0 in l
> 	gen onesp=1- spec
> * omit this	sort  onesp
  gen sortvar=sens+onesp
  sort sortvar
  * -sort- option dropped here: it re-sorts on onesp alone, which can scramble tied points
  gr  sens  onesp, c(l) ylab xlab
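
Sorting on sens+onesp works because along a properly defined ROC curve neither coordinate ever decreases, so two distinct points always differ in their sum. Sorting on onesp alone leaves the order of points on a vertical segment undetermined. A small sketch with made-up points (not taken from roc_data.out) illustrates this:

```stata
* made-up ROC points, deliberately entered out of order;
* three of them share onesp==0, i.e. they lie on one vertical segment
clear
input sens onesp
1    1
0    0
.5   0
1    .5
.25  0
end

* the sum increases strictly from one distinct ROC point to the next
gen sortvar = sens + onesp
sort sortvar

* check that the sorted path never moves down or to the left
assert sens >= sens[_n-1] & onesp >= onesp[_n-1] if _n > 1
list sens onesp sortvar
```

If the data were sorted on onesp instead, the three points with onesp==0 could come out in any order, which is how a zigzag can appear in the connected line.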


