Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Clustered dataset question - Estimating Individual threshold


From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   Re: st: Clustered dataset question - Estimating Individual threshold
Date   Tue, 14 Oct 2003 18:38:14 +0900

Dale Steele wrote:

-------------------begin excerpted post-------------------------------------

My question relates to a dataset from an experimental protocol completed by
66 subjects
<http://www.brown.edu/Administration/Emergency_Medicine/thresh.dta>.  It is
assumed that all subjects have the ability to detect an added resistance to
breathing, but that each has a different threshold.  We first measure each
subjects intrinsic resistance and present added resistances which are a
percentage of intrinsic. An added resistance to breathing (per_intr) is
presented which is approximately 0, 20, 40, 60, 80, 100, 120, 140, 160, 180
and 200 percent of  intrinsic resistance. In random order, non-zero stimuli
are presented three times and the zero level is presented six times for a
total of 36 trials per subject.

My goal is to estimate a threshold (and its variance) for each subject.

My first thought was to run 66 separate logistic regression models (as
below) and calculate a threshold as the resistance at which the predicted
probability of detection was 50% ( - (b_cons)/b_per_intr).  One problem is
that some subjects appear to have a very well defined threshold and the
logit models fails when response is  "completely determined".

I had trouble running the unconditional fixed-effects probit model
suggested.  I'd appreciate any guidance on how to run this model as well as
in how to use -gllamm- / - gllapred- to estimate the corresponding
random-effects estimator of each subject's threshold.

----------------------------end excerpted post------------------------------

That the added resistances are a percentage of the subject's intrinsic
resistance might induce a correlation between the fixed effect (per_intr)
and the random effect (latent variable, u_i).  I believe that a random
effects regression, such as in -xtlogit , re- or -gllamm-, relies on the
absence of such a correlation to provide consistent estimates.

In any event, the do-file below shows how to use -gllamm- / -gllapred- to
get each subject's threshold (50% probability resistance) value.  I don't
know how to get the standard error of the threshold value since it involves
both the fixed and random effects, although it ought to be doable using the
delta method--the variance of the latent variable estimate and of the fixed
effects coefficients are available and the random-fixed covariance is
supposed to be zero.

I also show the unconditional fixed effects logistic regression.  It will be
biased in principle, but with the ca. 36 trials for 66 subjects and the
relatively low correlation within subject, it appears that the bias is quite
small (or that estimates for both unconditional fixed effects and correlated
random effects in this case are equally inconsistent).  Regardless, the
individual threshold values for the unconditional fixed effects regression
agree much more closely with those predicted by -gllapred- than with those
predicted by individual-subject logistic regressions, although in many
cases, all three agree when present.

The slight bias for the unconditional fixed effects logistic regression is
shown in the graph of predicted thresholds versus per_intr), using the
random effects predictions as the gold standard.  I suspect that most of the
reason for the flatter slope of the prediction-predictor plot of the
unconditional fixed effects predictions versus that of the random effects
predictions is due to shrinkage--the predicted thresholds for outliers (high
values and low values) are "brought in" much more with the random effects
approach than with the fixed effects approach.

Note that the necessary diagnostic or goodness-of-fit tests haven't been run
on any of these regressions.

Joseph Coveney

----------------------------------------------------------------------------

clear
set more off
use http://www.brown.edu/Administration/Emergency_Medicine/thresh.dta
*
tabulate idnum response
// Some missing data for some people; one person (No. 416) never responded
*
* Random effects (intercepts) logistic regression
*
gllamm response per_intr, i(idnum) family(binomial) nip(30) adapt nolog
gllapred latvar, u
generate float res50 = - (_b[_cons] + latvarm1) / _b[per_intr]
slist idnum res50 if trial == 1, decimal(0) noobs
drop lat*
*
xtlogit response per_intr, i(idnum) re quad(30) nolog
/*
   Agrees well with adaptive quadrature algorithm's results. (sigma_u is
   sqrt(var(1)) above.)
   rho = 0.39; so, with T = 36, or so, unconditional fixed-effects logistic
   regression shouldn't be *too* biased . . .
*/
sort idnum
by idnum: generate byte id = _n == 1
replace id = sum(id)
*
* Unconditional fixed effects logistic regression
*
quietly tabulate idnum, generate(dum)
logit response per_intr dum2-dum66, nolog
generate float fes50 = .
generate float fes50_se = .
predictnl new_fes50 = - (_b[_cons]) / _b[per_intr] if id == 1, ///
  se(new_fes50_se)
replace fes50 = new_fes50 if id == 1
replace fes50_se = new_fes50_se if id == 1
drop new_fes50  new_fes50_se
forvalues i = 2/66 {
    capture predictnl new_fes50 = - (_b[_cons] + _b[dum`i']) / ///
      _b[per_intr] if id == `i', se(new_fes50_se)
    capture replace fes50 = new_fes50 if id == `i'
    capture replace fes50_se = new_fes50_se if id == `i'
    capture drop new_fes50 new_fes50_se
}
slist idnum res50 fes50 fes50_se if trial == 1, decimal(0) noobs
// Person 416's outcomes were completely determined
graph7 fes50 res50 res50, xlabel ylabel connect(.L) symbol(Oi)
/*
   There is evidence of some bias (steeper slope in the graph) in the
fixed-effects
   predictions, especially at the higher end of the range (probably due to
"shrinkage"
   in the random effects regression and perhaps exacerbated by the exclusion
of Person 416's data by
   the fixed effects regression).  But they are reasonably close for nearly
all cases.
*/
*
*
*
generate float lgt50 = .
generate float lgt50_se = .
forvalues i = 1/66 {
    capture logit response per_intr if id == `i'
    capture predictnl new_lgt50 = - _b[_cons] / ///
      _b[per_intr] if id == `i', se(new_lgt50_se)
    capture replace lgt50 = new_lgt50 if id == `i'
    capture replace lgt50_se = new_lgt50_se if id == `i'
    capture drop new_lgt50 new_lgt50_se
}
slist idnum res50 fes50 fes50_se lgt50 lgt50_se if trial == 1, ///
  decimal(0) noobs
exit

----------------------------------------------------------------------------





*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index