Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: Re: sensitivity and specificity after xtgee


From   "Joseph Coveney" <jcoveney@bigplanet.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: sensitivity and specificity after xtgee
Date   Mon, 17 Sep 2012 14:33:51 +0900

Mohammadreza Mohebbi wrote:

Is there any post-estimation option after running xtgee/ xtlogit (for
binary longitudinal regression) for calculating
sensitivity, specificity, positive and negative predictive values,
positive and negative likelihood ratios and ROC curve?

I used xtgee to fit binary regression for wheez syndrome with 3
measurements (baseline, first year and second year follow-ups) with
PD20 as exposure and  broadcat and smoker_current as covariates:

xtset  h_id h_int_no
xtgee    wheez_symp  pd_20_base i.broadcat smoker_current ,
corr(exchangeable) fam(binom) link(log/logit) eform robust

--------------------------------------------------------------------------------

-findit sensitivity- doesn't bring up anything mentioning -xt- commands except
for a couple of user-written commands (-metandi-, -midas-) that are intended to
be used in the context of meta-analysis of a collection of diagnostic
effectiveness studies.  References cited in their help files might give you some
insight into computing a summary ROC AUCs and the like after fitting
hierarchical logistic models.

In your case, though, couldn't you use -predict , mu- and go from there?
Something like that below?  Because "broadcat" and "smoker_current" are
time-varying, you'd need to compute sensitivity etc. at each time point.

Joseph Coveney

version 11.2

clear *
set more off

drawnorm u pd20, mean(0 180) corr(1 -0.5 \ -0.5 1) ///
    sd(1 180) seed(`=date("2012-09-17", "YMD")') n(300) // !
generate int patient_nr = _n
local a 1
local b 4
forvalues visit = 1/3 {
    generate byte broad_category`visit' = ///
        floor((`b' - `a' + 1) * runiform() + `a')
    generate byte smokes`visit' = floor(2 * runiform())
    generate double xb`visit' = ///
        0.0001 * pd20 + ///
        0.375 * (2.5 - broad_category`visit') + ///
        0.375 * (smokes`visit' - 0.5) + ///
        1 * (`visit' - 2 ) + /// !
        u + rnormal()
}
quietly reshape long broad_category smokes xb, i(patient_nr) j(visit)
label define Visits 1 Baseline 2 "First year" 3 "Second year"
label values visit Visits
set seed0 `=date("2012-09-17", "YMD")'
genbinomial wheezes, xbeta(xb) // -findit genbinomial- to install
preserve

*
* A little more difficult
*
xtgee wheezes c.pd20 i.(broad_category smokes), ///
    i(patient_nr) family(binomial) link(logit) eform ///
    corr(independent) robust nolog
predict double pr, mu

// AUCs of ROC curves
tempname B
matrix define `B' = e(b)
forvalues visit = 1/3 {
    display in smcl as result _newline(2) "`: label (visit) `visit''"
    lroc wheezes if visit == `visit', beta(`B') nograph
}

// The rest
tempname my_favorite_threshold
scalar define `my_favorite_threshold' = 0.5
generate byte TN = pr < `my_favorite_threshold' & !wheezes
generate byte FN = pr < `my_favorite_threshold' & wheezes
generate byte TP = pr >= `my_favorite_threshold' & wheezes
generate byte FP = pr >= `my_favorite_threshold' & !wheezes
assert (TN + FN + TP + FP) == 1
collapse (sum) TN = TN FN = FN TP = TP FP = FP, by(visit)
generate double sensitivity = TP / (TP + FN)
generate double specificity = TN / (TN + FP)
generate double likelihood_ratio_positive = sensitivity / (1 - specificity)
generate double likelihood_ratio_negative = (1 - sensitivity) / specificity
generate double positive_predictive_value = TP / (TP + FP)
generate double negative_predictive_value = TN / (TN + FN)

format sensitivity-negative_predictive_value %04.2f
local linesize `c(linesize)'
set linesize 200
list, noobs abbreviate(30)
set linesize `linesize'

*
* A little easier
*
restore
logistic wheezes c.pd20 i.(broad_category smokes), ///
    cluster(patient_nr) nolog

forvalues visit = 1/3 {
    display in smcl as result _newline(2) "`: label (visit) `visit''"
    lroc wheezes if visit == `visit', nograph
    estat classification if visit == `visit', ///
        cutoff(`=`my_favorite_threshold'')
}

exit

Prof. Gary Koch once mentioned in a seminar I attended years ago that if you're
going to specify robust anyway ("empirical", I think it's called in SAS's PROC
GENMOD, which is what he uses) in order to cover your backside for misspecified
covariance structure, then you might as well specify independence for it.  It's
least likely to give rise to convergence problems.

Assuming that he wasn't just playing devil's advocate for the sake of
stimulating discussion (covariance structure specification does affect
regression coefficients at least a little, after all), if you're going to fit a
marginal model, then the easiest way for you would be to use -logistic- or
-logit- (with the -cluster()- option, not that it makes any difference to point
estimates), and take advantage of the postestimation commands Stata has for
them.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index