[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

From |
"Joseph Coveney" <jcoveney@bigplanet.com> |

To |
"Statalist" <statalist@hsphsun2.harvard.edu> |

Subject |
st: RE: ttest or xtmelogit? |

Date |
Fri, 14 Mar 2008 11:57:09 +0900 |

Jay Verkuilen wrote (excerpted): David Airey wrote:

I have a typical pilot data set.<

A few things: (1) Why not try xtmelogit? Can't do any harm to see what it does. (2) Resampling methods seem tailor-made for this kind of problem. Stata's got really impressive support for clustering and stratification for both jackknifing and bootstrapping. (3) On a problem like this, I'd give serious consideration to a fully Bayesian model. -------------------------------------------------------------------------------- Although resampling methods are very useful in other contexts, I've never had much success with bootstrapping or jackknifing to overcome the inflation of Type I error rates with mixed models applied to small samples. That's why I suggested simulating the null distribution of the asymptotic test statistic empirically. The do-file and its log below illustrates both of these points. Bootstrapping with David's study parameters (six per group, 50 observations each, rho about 0.05) leaves the Type I error rate at about what Dave found without bootstrapping, viz., double the 5% nominal. Empirically generating the cumulative distribution of the likelihood-ratio chi2 under these conditions, on the other hand, keeps the test size from inflating (actually a little conservative). I suppose that this could also be done using randomization (permutation) testing, but identifying exchangeable observations might become problematic in a mixed model; you might be better off going with a stratified nonparametric test, such as van Elteren's, or just using a t test on a summary measure. Note that -xtlogit- shows that we were successful at hitting the target rho in these cases (up through at least a rho of 35%; it drops off at a target of 70%). I've had less success attempting simulating with linear mixed models, mainly because of the rather large and unpredictable downward bias in estimates of variance components with unbalanced small samples, even using REML: getting in the neighborhood of the variance components is a hit-or-miss affair. Joseph Coveney -------------------------------------------------------------------------------- . * . * Resampling method . * [log redacted for brevity] . summarize delta rho Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- delta | 300 .0199082 .286092 -.7870363 .8587052 rho | 300 .0348602 .0258169 2.03e-11 .134113 . foreach var of varlist p_* { 2. generate byte pos_`var' = `var' < 0.05 3. } . summarize pos_* Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- pos_p_r | 300 .09 .2866599 0 1 pos_p_t | 300 .0533333 .2250728 0 1 . * . * External empirical reference distribution method . * [log redacted for brevity] . display in smcl as result `tmpnam0' / r(N) .0200148 -------------------------------------------------------------------------------- clear * set more off set seed `=date("2008-03-13", "YMD")' set seed0 `=date("2008-03-13", "YMD")' * capture program drop simem program define simem, rclass version 10 syntax [, Delta(real 0) Rho(real 0) VCE(passthru)] drop _all tempname p_t p_l set obs 12 generate int pid = _n generate byte trt = mod(_n, 2) generate double mu = invlogit(trt * `delta' + /// sqrt(`rho' / (1 - `rho') * _pi^2 / 3) * invnormal(uniform())) assert !missing(mu) // if rho is misspecified generate byte den = 50 rndbinx mu den * /* generate double pos_prim = bnlx replace pos_prim = 50 - 1 / 2 / 50 if pos_prim == 50 replace pos_prim = 1 / 2 / 50 if !pos_prim generate double logit_pi = logit(pos_prim / den) */ ttest bnlx, by(trt) scalar define `p_t' = r(p) /* ttest logit_pi, by(trt) scalar define `p_l' = r(p) */ * rename bnlx count1 generate byte count0 = den - count1 reshape long count, i(pid) j(positive) drop if !count expand count xtlogit positive trt, i(pid) intpoints(30) `vce' return scalar rho = e(rho) return scalar p_r = 2 * normal(-abs(_b[trt] / _se[trt])) return scalar chi2 = e(chi2) return scalar delta = _b[trt] return scalar p_t = `p_t' * return scalar p_l = `p_l' end * capture log close log using btestm.smcl, smcl replace * * Resampling method * quietly simulate delta = r(delta) rho = r(rho) p_r = r(p_r) /// p_t = r(p_t) chi2 = e(chi2), reps(300): /// simem , rho(0.05) vce(bootstrap) summarize delta rho foreach var of varlist p_* { generate byte pos_`var' = `var' < 0.05 } summarize pos_* * * External empirical reference distribution method * keep chi2 tempfile tmpfil0 tempname tmpnam0 tmpnam1 quietly save `tmpfil0' quietly simulate reference_chi2 = e(chi2), reps(300): simem , /// rho(0.05) quietly centile reference_chi2, centile(95) scalar define `tmpnam0' = r(c_1) quietly use `tmpfil0', clear quietly count if chi2 >= `tmpnam0' scalar define `tmpnam1' = r(N) quietly count display in smcl as result `tmpnam0' / r(N) log close exit * * For searches and help try: * http://www.stata.com/support/faqs/res/findit.html * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**st: RE: RE: ttest or xtmelogit?***From:*"Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>

- Prev by Date:
**st: imputation** - Next by Date:
**st: RE: RE: ttest or xtmelogit?** - Previous by thread:
**st: population weigted data and limited dependent variable first stageequation** - Next by thread:
**st: RE: RE: ttest or xtmelogit?** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |