Re: Re: st: Sample size for four-level logistic regression

From	Clyde Schechter <[email protected]>
To	[email protected]
Subject	Re: Re: st: Sample size for four-level logistic regression
Date	Sun, 23 Jun 2013 18:09:37 -0700

Many thanks to Phil Schumm, Jeph Herrin, and William Buchanan for
their suggestions in response to my need for a quick way to
approximate a sample size calculation on a four-level logistic
regression model.

As it turns out, Phil Schumm's idea is working out well for me: fit
-glm- with cluster-robust standard errors, compare those with the
standard errors I already have from the -xtmelogit- results, and use
that comparison to guide the interpretation of a simulation based on
-glm-.  Each -glm- run is under a minute, often under 20 seconds, even
with very large samples.  When I re-ran with -glm- the same datasets
that I had run with -xtmelogit-, I did, indeed, find a relatively
constant ratio between the standard errors reported by the two
commands.  So I
have now done simulations of several of my most plausible scenarios
using this method.  And, for what it's worth, it looks like our
project is, indeed, feasible--at least from this perspective.  More
scenarios are being simulated as I write this.
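
For readers who want to see the arithmetic of this shortcut, here is a
minimal sketch, written in Python only so that it is self-contained.
It assumes the calibration is a single average ratio of mixed-model
standard error to cluster-robust -glm- standard error, and that power
is judged with a two-sided Wald test.  The function names and all the
numbers are hypothetical, and the sketch rescales only the standard
errors, not the coefficient estimates.

```python
from statistics import NormalDist, mean


def calibrate_ratio(paired_ses):
    """Average ratio of mixed-model SE to cluster-robust GLM SE.

    paired_ses: (se_mixed, se_glm) pairs from datasets fitted both ways.
    """
    return mean(se_mixed / se_glm for se_mixed, se_glm in paired_ses)


def simulated_power(glm_results, ratio, alpha=0.05):
    """Share of simulated datasets whose rescaled Wald test rejects beta = 0.

    glm_results: (estimate, se_glm) pairs, one per simulated dataset.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = sum(
        1
        for estimate, se_glm in glm_results
        if abs(estimate) / (se_glm * ratio) > z_crit
    )
    return rejections / len(glm_results)


# Hypothetical values, for illustration only (not from the actual runs).
paired = [(0.30, 0.25), (0.24, 0.20), (0.36, 0.30)]
ratio = calibrate_ratio(paired)

sims = [(0.70, 0.25), (0.40, 0.25), (0.65, 0.26), (0.55, 0.24)]
power = simulated_power(sims, ratio)
print(f"SE ratio: {ratio:.2f}, estimated power: {power:.2f}")
```

In practice each (estimate, se_glm) pair would come from one -glm- fit
on one simulated dataset, and the calibration pairs from the datasets
already fitted with both commands.  How far the ratio can be trusted
outside the scenarios used to calibrate it is an open question.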

I also tried Jeph Herrin's idea, but I found that a linear probability
model produced effect estimates that were rather different from the
underlying (logistic) model, so I wasn't comfortable going further in
that direction.  I think an outcome probability of 2.5 per 1,000 is
just too low for this approach to work.
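
One possible contributor, offered as an illustration and not as a
diagnosis of these particular results: on the probability scale, a
fixed log odds ratio implies a different additive change at each
baseline risk, so no single linear-probability coefficient matches it
when other covariates or random effects shift the baseline.  The
sketch below, in Python, uses the 2.5 per 1,000 figure from above; the
odds ratio of 2 and the other baselines are assumptions chosen only
for the example.

```python
import math


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Map a probability to its log-odds."""
    return math.log(p / (1.0 - p))


def risk_difference(baseline_p, log_odds_ratio):
    """Change in probability implied by a fixed log odds ratio at a baseline."""
    return inv_logit(logit(baseline_p) + log_odds_ratio) - baseline_p


log_or = math.log(2.0)  # hypothetical effect: odds ratio of 2
for p0 in (0.001, 0.0025, 0.01):
    print(f"baseline {p0:.4f}: risk difference {risk_difference(p0, log_or):.5f}")
```

With a rare outcome the implied risk difference is almost proportional
to the baseline, so it varies widely across clusters whose baselines
differ.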

I had thought that William Buchanan's idea of using the results from
an -xtmelogit- run as starting values for more -xtmelogit- simulations
would work well.  But to my surprise, even with the added benefit from
specifying -refineopts(iterate(0))- the speed-up was not all that
great, a factor of 2 or so.  Looking at the output in more detail, it
seems that the slowness of -xtmelogit- comes not so much from the
number of iterations needed to converge (mostly under 6) as from the
very long time needed to calculate the log-likelihood at each step,
even at the first iteration with starting values specified.  So
having a closer starting point didn't gain me all that much speed.

I'm pleased to hear from Yulia Marchenko that the new -melogit-
command will be 7-10 times faster.  Even that speed-up wouldn't really
let me do all the simulations I want in the available time, but it
should prove gratifying to use version 13 for these analyses going
forward.  And I'm looking forward to the additional developments
alluded to.

Once again, I am awed by the excellent advice and help I get from
Statalist.  What a great community!

Thanks again.

Clyde Schechter
Dept. of Family & Social Medicine
Albert Einstein College of Medicine
Bronx, NY, USA