[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Airey, David C" <david.airey@Vanderbilt.Edu> |

To |
"statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |

Subject |
re: st: weights in xtlogit |

Date |
Sat, 14 Nov 2009 21:58:12 -0600 |

. Thanks Austin. I was trying to find ways of speeding up power simulations of xtlogit models where cluster size is large. If I have 20-30 clusters in control and treatment groups, where each cluster has upwards of 1000 or more 1/0 observations, then xtlogit can be slow. So variables might be clusterid (animal = cluster), yesno (1 or 0), and treatment (1 or 0). In 2008 after a similar query, Joseph Coveney posted a version of an xtlogit simulation in a situation where cluster number was small (12) and the cluster size was also not large (50). A slightly modified version of that code is below. Unfortunately or fortunately, now we have deep (aka next gen) sequencing data which is a technology that gives very larger cluster sizes in our application. One of our colleagues is looking at modeling this data with an overdispersed count model instead of a mixed logit. That might be a better solution. // notes follow "//" or "*" // modified from Coveney, 2008 clear all set more off // allow results to scroll set seed `=date("2009-11-09", "YMD")' // for repeatable simulations * capture program drop edits // drop program from memory if error program define edits, rclass // program to be called in simulation version 11 // version 11 of Stata syntax [, delta(real 0) rho(real 0)] // input delta and rho drop _all tempname p_t // define a temporary variable for later clear set obs 12 // 12 mice...could be 50 now generate int pid = _n // integer pid numbered from 1 to 12 generate byte trt = mod(_n, 2) // alternate 0 and 1 // Generate a probability // using the cluster correlation rho and the standard // deviation of the logistic distribution, along with // the treatment effect and some normal error per animal. generate double prob = invlogit(trt * `delta' + /// sqrt(`rho' / (1 - `rho') * _pi^2 / 3) * rnormal()) assert !missing(prob) // if rho is misspecified generate int nseq = 100 // number of sequences per animal // now the number of sequences per animal could be 1000 or more gen bnl = rbinomial(nseq, prob) // binomial for (100, prob) * // ttest on percent summary statistic per animal gen percent = bnl/nseq*100 ttest percent, by(trt) scalar define `p_t' = r(p) * // reshape the data from one score per animal // to one binary observation per row // so that a mixed logit model can be used... rename bnl count1 generate int count0 = nseq - count1 reshape long count, i(pid) j(positive) // can weights in xtlogit be used? drop if !count expand count xtlogit positive trt, i(pid) intpoints(30) // return actual rho and delta and // xtlogit and ttest pvalues for summarization return scalar rho = e(rho) return scalar p_r = chi2tail(1, e(chi2)) return scalar delta = _b[trt] return scalar p_t = `p_t' end // program end * // call program above for treatment effects 0, 0.5, and 1 // rho values of 0, 0.05, 0.10, and 0.15 forvalues delta_int = 0(5)10 { local delta = `delta_int'/10 // delta = 0.0, 0.5, and 1.0 forvalues rho_int = 0(5)15 { local rho = `rho_int'/100 // rho = 0.0, 0.05, 0.10, 0.10 display in smcl as text _newline(1) /// "delta = " %03.2f `delta' " rho = " %03.2f `rho' simulate delta=r(delta) rho=r(rho) p_r=r(p_r) /// p_t=r(p_t), reps(300): /// edits , rho(`rho') delta(`delta') summarize delta rho foreach var of varlist p_* { generate byte pos_`var' = `var' < 0.05 } summarize pos_* } } exit -Dave > Does > "two groups of clustered observations, but lots of observations per cluster" > mean id takes on two distinct values? And you have many obs in each > category? Just use logit with a dummy variable for id in that case. > > On Sat, Nov 14, 2009 at 2:16 PM, Airey, David C > <david.airey@vanderbilt.edu> wrote: > > . > > > > In logit models can one get faster results using weights if there are few covariate patterns? I thought I remembered reading that somewhere. If so, is it possible to do the same with xtlogit? So if you have the simplest of model, where you have two groups of clustered observations, but lots of observations per cluster, making estimation slow, can weights be used to speed up the estimation? > > > > xtlogit yn trt, i(id) * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: weights in xtlogit***From:*Austin Nichols <austinnichols@gmail.com>

- Prev by Date:
**Re: st: 2STAGE ESTIMATION WITH TOBIT AND POISSON** - Next by Date:
**st: How do I suppress output in mata?** - Previous by thread:
**re: st: weights in xtlogit** - Next by thread:
**Re: st: weights in xtlogit** - Index(es):

© Copyright 1996–2015 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |