Allison says, in part, "The problem is not specifically the rarity of
events, but rather the possibility of a small number of cases on the
rarer of the two outcomes. If you have a sample size of 1000 but
only 20 events, you have a problem. If you have a sample size of
10,000 with 200 events, you may be OK. If your sample has 100,000
cases with 2000 events, you're golden."
The Stata version of -relogit- hasn't been updated in ages, and there
may be better options now. Based on Allison's blog, the -firthlogit-
command available from SSC may be another option.
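
As a rough sketch of how that comparison might look (not a definitive
recipe), the lines below fit an ordinary -logit- and then -firthlogit- on
the same model and list the estimates side by side. The lbw dataset and
the covariates age, lwt and smoke are stand-ins I picked for
illustration, not the poster's data, and the outcome there is not
especially rare. -tabulate- comes first because Allison's point is about
the count of cases on the rarer outcome, not the percentage.

```stata
* Illustrative only: lbw is a stand-in dataset, not the poster's data
webuse lbw, clear

* Check how many cases fall on the rarer outcome (the count, not the rate)
tabulate low

* Ordinary maximum-likelihood logit
logit low age lwt smoke
estimates store ml

* Firth penalized-likelihood logit (user-written, installed from SSC)
ssc install firthlogit, replace
firthlogit low age lwt smoke
estimates store firth

* Coefficients and standard errors side by side;
* equations(1) lines up the models even if their equation names differ
estimates table ml firth, equations(1) b(%9.4f) se(%9.4f)
```

If the two columns are close, the rarer outcome probably has enough
cases for the usual logit results to be trusted. If they diverge, the
penalized estimates are arguably the safer ones to report.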
At 06:20 PM 4/11/2013, Sheila Vakharia wrote:
Hello Stata Users,
It had been recommended to me that I conduct Rare Events Logistic
Regression in order to confirm the outcome of my standard Logistic
Regression because I had a rare event (5%).
What I have found is that all of the coefficients from Logistic
Regression and the Rare Events Logistic Regression are exactly the
same. However, everything else has changed. Due to the robust standard
error in the ReLogit, my 95% confidence intervals are much larger now
and many variables are no longer statistically significant. Prior to
this, ALL of my variables were statistically significant.
I have documented all of the results in my write-up and they are all
illustrated in tables. However, my question is: Which results do I
really give weight to? Do I stand by my preliminary Logit outcomes? Do
I give priority to the ReLogit outcomes?
Thank you for your consideration,
Sheila