

From: William Buchanan <william@williambuchanan.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Sample size for four-level logistic regression
Date: Fri, 21 Jun 2013 12:20:32 -0700

Since you've already taken the time to find out how long the model needs to converge, did you try using those estimates as starting values to see whether that eases some of the pain of doing a simulation study? I would assume that, even with some sampling variance, the estimates from the model that converged in the first place would be decent enough starting values to speed things up in the other models. Just a thought that I figured might be helpful.

HTH,
Billy

On Jun 20, 2013, at 6:17 PM, Clyde Schechter <clyde.schechter@gmail.com> wrote:

> I'm working with some colleagues to try to get a sense of the feasibility
> of an idea we have for a study.
>
> Our intervention would be randomized at the level of institutions, which
> have a few levels of outcome-relevant internal hierarchy themselves. The
> outcome is dichotomous and is fairly rare: around 2.5 "successes" per 1,000
> observations. (Observations within institutions will be relatively
> plentiful and inexpensive to obtain electronically, although limited by the
> number of discharges per year they handle. The limit on feasibility will
> be the number of institutions, each of which will need resources to
> implement the intervention and program their data collection.) Ultimately,
> the analysis will require a 4-level logistic regression.
>
> I need to get a sense of how many institutions would need to be recruited
> for the study: if too large, it's a dead letter.
>
> Were this a two- or three-level design with a continuous outcome, I would
> use Optimal Design software. Alternatively, for two level designs there
> are simple approximation formulas relating the simple random sample size to
> the size needed based on a design effect calculated from the number of
> observations per higher-level unit and the intraclass correlation.
>
> But I am unaware of any software that does sample size calculations for
> four-level designs with dichotomous outcomes, and I have not found any
> references providing any quick formulas for a design-effect correction.
>
> Plan A was to do simulations. The problem is that in the simulations, each
> replication (analysis of a single simulated sample) takes 2 hours to run on
> my setup, even with the Laplace approximation. For even one candidate
> number of institutions and set of assumptions about variance components, I
> will need about 500 replications to get reasonable precision on the power.
> So we're talking months here. And I was hoping to try several combinations
> of assumed number of institutions and variance components. Clearly a
> non-starter.
>
> I thought about treating the three top levels as if they were a single
> level, in effect, ignoring the nesting that takes place within institutions
> and doing a 2-level analysis. That could be simulated quickly, but I have
> no idea whether results for that would even vaguely resemble what is needed
> for a four-level model. I also considered a linear probability model based
> on the much faster -xtmixed-, but given the very low event rate I doubt
> such an approach would be reasonable.
>
> By any chance, will the expanded sample size calculations supported in
> Stata 13 handle this? Or is its speedup in runtime for xtmelogit so great
> that it will deliver me from this problem? Stata 13 will be in my hands
> before I can finish 500 reps of the simulation.
>
> Anyone have any suggestions for a plan B? A fully polished analysis ready
> for submission to a granting agency is not needed at this time, but I need
> enough information to know if this study idea is even worth pursuing.
>
> Any help will be appreciated.
>
> Clyde Schechter
> Dept. of Family & Social Medicine
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
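
The starting-values idea in the reply could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the variable names (y, treat, inst, dept, ward) are hypothetical and do not come from the original post, and whether warm starts save meaningful time for this particular model is untested here.

```stata
* Hypothetical names, not from the original post:
*   y     = 0/1 outcome
*   treat = intervention arm
*   inst, dept, ward = nested grouping identifiers (levels 4, 3, 2)

* 1. Fit the model once on a simulated sample and keep the converged estimates.
xtmelogit y treat || inst: || dept: || ward:, laplace
matrix b0 = e(b)

* 2. In later replications, start the optimizer from b0.
*    refineopts(iterate(0)) asks xtmelogit to skip its starting-value
*    refinement stage, on the assumption that b0 is already close.
xtmelogit y treat || inst: || dept: || ward:, laplace from(b0) refineopts(iterate(0))
```

If the random-effects structure differs between the saved fit and a later replication, the columns of b0 will not line up with the new model, so the same model specification is assumed throughout.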

**References**: **st: Sample size for four-level logistic regression**, *From:* Clyde Schechter <clyde.schechter@gmail.com>
