
From: Arnold Kester <arnold.kester@stat.unimaas.nl>
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: weird ROC curve
Date: 09 Dec 2002 10:07:43 +0100

On Sat, 2002-12-07 at 03:37, Cleves, Mario A wrote:

> Arnold Kester <mailto:arnold.kester@stat.unimaas.nl> wrote in response to my posting:
>
> > I don't agree, it's not my point whether the analysis is proper for
> > these data. Even when the data are 'pathological', which I doubt, the
> > ROC curve should be drawn as it is defined. And by definition, the curve
> > cannot be non-monotone. If the curve is drawn wrong in this case, how
> > can we be confident that it will be correct in other instances?
>
> I misunderstood Arnold's intent. I assumed that he was trying to make sense of his data and not trying to indicate that there was a bug in -roctab-. In response to that, I used Arnold's data and recalculated the sensitivity and specificity at every possible cut-point and plotted the results. The plot I produced was identical to that from -roctab-, indicating that Arnold's results were not caused by a bug in -roctab-. The problem is that in Arnold's data, as Joseph Coveney <jcoveney@bigplanet.com> pointed out, "the diagnostic test (slightly) predicts the diametric opposite of the truth"; thus, for Arnold's diagnostic test, higher values are associated with lower risk, violating the assumption that "The rating or outcome of the diagnostic test must be at least ordinal with higher values indicating higher risk".
>
> What is important is that the peculiar results that Arnold obtained were not caused by a bug in -roctab-, and thus users can be confident that -roctab- will generate the correct results in all instances where the assumptions are met.
>
> Here is the code used to calculate the sensitivity and specificity at every possible cut-point in the data.

Mario,

This seems correct, except I'd like to see a monotone line. This could be obtained by inserting a modified sort before the plot, as I've indicated below. And may I suggest a much faster calculation?
Regards,
Arnold Kester

======================
insheet using roc_data.out, clear
** add one obs with larger x-value and missing y
sort x
local Np=_N+1
set obs `Np'
replace x=x[`Np'-1]+1 in l
** variable for reverse sorting on x
gen minus=-x
** count positives and negatives for each x
gen neg=y==0
gen pos=y==1
** sort and aggregate for distinct x-values
collapse (sum) neg pos, by(minus)
** cumulate these to get number at or above each cutoff
gen cum0=sum(neg)
gen cum1=sum(pos)
** standardize and plot
gen sens=cum1/cum1[_N]
gen onesp=cum0/cum0[_N]
gra sens onesp, c(l) s(i)
==================================

> insheet using roc_data.out, names clear
> count if y==0
> local y0=r(N)
> count if y==1
> local y1=r(N)
> sort x
> gen sens=.
> gen spec=.
> local N=_N
> local i 1
> while `i'<=`N' {
>     tempvar g
>     gen `g'=0
>     replace `g'=1 if x>x[`i']
>     count if `g'==0 & y==0
>     local sp=r(N)/`y0'
>     count if `g'==1 & y==1
>     local se=r(N)/`y1'
>     replace spec=`sp' in `i'
>     replace sens=`se' in `i'
>     drop `g'
>     local i= `i'+1
>     noi di in red `i'
> }
> local nN=`N' +2
> set obs `nN'
> local pu=_N-1
> replace spec=0 in `pu'
> replace sens=1 in `pu'
> replace spec=1 in l
> replace sens=0 in l
> gen onesp=1- spec
> * omit this sort onesp
gen sortvar=sens+onesp
sort sortvar
> gr sens onesp, c(l) ylab xlab sort

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
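As a rough illustration of the faster calculation in the message, here is a minimal sketch that runs it on a small made-up data set and then checks that the resulting curve is monotone. The seven observations are invented for the example (they are not the contents of roc_data.out), and `summarize, meanonly` is swapped in for the `sort x` step as an assumption about what is convenient, not as the author's method. The toy values are chosen so that higher test results go with fewer positives, roughly the situation discussed in the thread.

```stata
* Toy data, invented for illustration only:
* x = diagnostic test result, y = true status (1 = positive, 0 = negative)
clear
input x y
1 1
2 0
2 1
3 0
4 1
5 0
5 0
end

* Add one observation above the largest x with missing y;
* after cumulation it becomes the (0,0) corner of the curve
local Np = _N + 1
set obs `Np'
summarize x, meanonly
replace x = r(max) + 1 in l

* Count negatives and positives per distinct x, ordered from high x to low x
gen minus = -x
gen neg = (y == 0)
gen pos = (y == 1)
collapse (sum) neg pos, by(minus)

* Running sums give the number of cases at or above each cut-point
gen cum0 = sum(neg)
gen cum1 = sum(pos)
gen sens  = cum1 / cum1[_N]
gen onesp = cum0 / cum0[_N]

* The curve should run from (0,0) to (1,1) without ever decreasing
assert sens[1] == 0 & onesp[1] == 0
assert sens[_N] == 1 & onesp[_N] == 1
assert sens >= sens[_n-1] & onesp >= onesp[_n-1] in 2/l

list minus sens onesp, noobs
```

Because both coordinates are running sums of non-negative counts, the three `assert` lines should pass for any data set with at least one positive and one negative case, even when the test predicts the opposite of the truth. In that case the curve simply lies below the diagonal instead of doubling back.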

**References**: **RE: st: RE: weird ROC curve** *From:* "Cleves, Mario A" <ClevesMarioA@uams.edu>

