

From: Phil Schumm <pschumm@uchicago.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Sample size for four-level logistic regression
Date: Fri, 21 Jun 2013 12:20:19 -0500

On Jun 20, 2013, at 8:17 PM, Clyde Schechter <clyde.schechter@gmail.com> wrote:

> Our intervention would be randomized at the level of institutions, which have a few levels of outcome-relevant internal hierarchy themselves. The outcome is dichotomous and is fairly rare: around 2.5 "successes" per 1,000 observations. (Observations within institutions will be relatively plentiful and inexpensive to obtain electronically, although limited by the number of discharges per year they handle. The limit on feasibility will be the number of institutions, each of which will need resources to implement the intervention and program their data collection.) Ultimately, the analysis will require a 4-level logistic regression.
>
> <snip>
>
> But I am unaware of any software that does sample size calculations for four-level designs with dichotomous outcomes, and I have not found any references providing any quick formulas for a design-effect correction.
>
> <snip>
>
> A fully polished analysis ready for submission to a granting agency is not needed at this time, but I need enough information to know if this study idea is even worth pursuing.

If it were me, I would start by looking at a logistic regression of the proportion of "successes" for each institution on an indicator for the intervention (i.e., in which the institutions are the level of the analysis, the outcome is the proportion of successes, and there is a single binary covariate corresponding to the intervention). To the extent that you expect correlation within an institution, you would need to account for this either by using a single dispersion parameter (i.e., using -glm-) or by using the robust variance estimate (i.e., using -blogit-). If you do the former, then you can do the calculation analytically (i.e., for a given within-institution correlation, just plug in the corresponding value of the over-dispersion parameter), while if you do the latter, a simple simulation would probably be easiest (e.g., add an institution-level random effect when generating the observed proportions).

If your power for detecting the intervention is miserable within the context of this simple model, then that should be a pretty good indication that you need to rethink. Even if you expect to increase your precision by modeling some of the within-institution factors, I wouldn't want to count on that unless I had *very good* information about how they are distributed and how they are related to the outcome.
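
As a rough illustration of the analytic route, here is a minimal sketch in Python (not Stata). It is not a substitute for a proper calculation. It assumes equal-sized institutions, an equal number of institutions per arm, a two-sided Wald test on the log odds ratio, and an over-dispersion parameter of 1 + (m - 1) * icc for institutions of size m. The function names, the parameter names (`k_per_arm`, `m`, `icc`) and the example inputs other than the 2.5 per 1,000 baseline rate are hypothetical and do not come from the thread.

```python
from math import log, sqrt
from statistics import NormalDist


def overdispersion(m, icc):
    """Over-dispersion (design effect) for clusters of size m.

    Assumes a common within-institution correlation `icc` and
    equal cluster sizes; both are simplifying assumptions.
    """
    return 1.0 + (m - 1) * icc


def power_institution_level(p0, p1, k_per_arm, m, icc, alpha=0.05):
    """Approximate power of a two-sided Wald test on the log odds ratio.

    p0, p1     -- outcome probabilities in the control / intervention arms
    k_per_arm  -- number of institutions in each arm (assumed equal)
    m          -- observations per institution (assumed equal)
    icc        -- assumed within-institution correlation
    """
    phi = overdispersion(m, icc)
    n_arm = k_per_arm * m  # total observations in each arm

    # Binomial variance of the log odds ratio, inflated by phi.
    var_log_or = phi * (
        1.0 / (n_arm * p0 * (1.0 - p0)) + 1.0 / (n_arm * p1 * (1.0 - p1))
    )

    log_or = log(p1 / (1.0 - p1)) - log(p0 / (1.0 - p0))
    z_effect = abs(log_or) / sqrt(var_log_or)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    # Probability of rejecting in the direction of the true effect
    # (the negligible opposite-tail probability is ignored).
    return NormalDist().cdf(z_effect - z_crit)


if __name__ == "__main__":
    # Baseline of 2.5 per 1,000 is from the thread; the halved rate,
    # cluster size, and ICC values are purely hypothetical inputs.
    for k in (10, 20, 40):
        for icc in (0.0005, 0.001, 0.005):
            pw = power_institution_level(0.0025, 0.00125, k, 5000, icc)
            print(f"k_per_arm={k:3d}  icc={icc:.4f}  power={pw:.3f}")
```

Running this over a grid of plausible institution counts and correlations gives a quick sense of whether the number of institutions, rather than the number of observations within them, is the binding constraint.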

