Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Clyde Schechter <clyde.schechter@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: Re: st: Sample size for four-level logistic regression
Date: Sun, 23 Jun 2013 18:09:37 -0700

Many thanks to Phil Schumm, Jeph Herrin, and William Buchanan for their suggestions in response to my need for a quick way to approximate a sample size calculation for a four-level logistic regression model.

As it turns out, Phil Schumm's idea is working out well for me: run the simulations with -glm- and cluster standard errors, and use a comparison of those standard errors with the estimates I already have from -xtmelogit- to guide the interpretation. Each -glm- run takes under a minute, often under 20 seconds, even with very large samples. When I used -glm- to re-run the same datasets that I had run with -xtmelogit-, I did indeed find a relatively constant ratio between the standard errors reported by the two commands. So I have now simulated several of my most plausible scenarios using this method. And, for what it's worth, it looks like our project is indeed feasible, at least from this perspective. More scenarios are being simulated as I write this.

I also tried Jeph Herrin's idea, but I found that a linear probability model produced effect estimates that were rather different from those of the underlying (logistic) model, so I wasn't comfortable going farther in that direction. I think a probability of 2.5 per 1,000 is just too low for this approach to work.

I had thought that William Buchanan's idea of using the results from an -xtmelogit- run as starting values for more -xtmelogit- simulations would work well. But to my surprise, even with the added benefit of specifying -refineopts(iterate(0))-, the speed-up was not all that great: a factor of 2 or so. Looking at the outputs in more detail, it seems that the slowness of -xtmelogit- comes not so much from the number of iterations needed to converge (mostly under 6) as from the very long time needed to calculate the log-likelihood at each step, even at the first iteration with starting values specified. So having a closer starting point didn't gain me all that much speed.
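The calibration idea described above (checking that the -xtmelogit- and -glm- standard errors differ by a roughly constant ratio, then using that ratio when reading the -glm- simulations) can be sketched in a few lines. This is only an illustration of one way such a ratio might be applied, not the procedure actually used in this thread: the function names, the Wald-test rejection rule, and every number below are hypothetical, and the sketch is written in Python with the standard library only, rather than in Stata.

```python
from statistics import NormalDist, mean, stdev


def calibration_ratio(se_mixed, se_glm):
    """Mean ratio of mixed-model SE to glm cluster SE, plus its coefficient of variation.

    A small coefficient of variation suggests the ratio is roughly constant
    across the datasets that were fitted both ways.
    """
    ratios = [m / g for m, g in zip(se_mixed, se_glm)]
    avg = mean(ratios)
    cv = stdev(ratios) / avg if len(ratios) > 1 else 0.0
    return avg, cv


def estimated_power(estimates, se_glm, ratio, alpha=0.05):
    """Share of simulated datasets in which a two-sided Wald test rejects.

    Each glm standard error is rescaled by the calibration ratio before the test.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = sum(
        1 for b, se in zip(estimates, se_glm) if abs(b) / (se * ratio) > z_crit
    )
    return rejections / len(estimates)


if __name__ == "__main__":
    # Hypothetical SEs from a few datasets fitted with both commands.
    se_from_mixed = [0.118, 0.131, 0.124]
    se_from_glm = [0.101, 0.112, 0.107]
    ratio, cv = calibration_ratio(se_from_mixed, se_from_glm)
    print(f"mean SE ratio = {ratio:.3f}, coefficient of variation = {cv:.3f}")

    # Hypothetical coefficient estimates and cluster SEs from glm-only simulations.
    sim_estimates = [0.41, 0.18, 0.36, 0.29, 0.47]
    sim_se = [0.105, 0.110, 0.108, 0.102, 0.111]
    print(f"estimated power = {estimated_power(sim_estimates, sim_se, ratio):.2f}")
```

One caveat on the design: a -glm- fit with cluster standard errors estimates population-averaged effects, while the mixed model estimates cluster-specific ones, so rescaling the standard error alone is an approximation and the coefficients themselves may also need a second look.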
I'm pleased to hear from Yulia Marchenko that the new -melogit- command will be 7-10 times faster. Even that speed-up wouldn't really let me do all the simulations I want in the available time, but it should prove gratifying to use version 13 for these analyses going forward. And I'm looking forward to the additional developments alluded to.

Once again, I am awed by the excellent advice and help I get from Statalist. What a great community! Thanks again.

Clyde Schechter
Dept. of Family & Social Medicine
Albert Einstein College of Medicine
Bronx, NY, USA
