

From: Michael Wahman <Michael.Wahman@svet.lu.se>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Blown up Std. errors in logistic regression with bootstrap
Date: Mon, 13 Dec 2010 16:00:49 +0100

Dear Maarten and Stas,

Thanks a million for helping me to sort this problem out.

/Michael

On 13 Dec 2010, at 14:47, Maarten buis wrote:

--- On Mon, 13/12/10, Michael Wahman wrote:

> I am doing a study with two different logistic models, where n is fairly small. In one of the models n is somewhat bigger (n=107) and the other model has a smaller n (n=51). I want to use robust bootstrapped standard errors to compensate for the small n, especially in the second model. I've understood that it is problematic to use MLE when the number of d.f. is small, since the asymptotic results might not apply.
>
> I have experimented with bootstraps, but the standard errors in the model become huge. This seems to be associated with the models with a small number of d.f. If I run the models with a higher n with a bootstrap, I don't get this problem. Neither do I get it when excluding the control variables.

Sounds to me like the model is close to being perfectly determined. An example of perfect determination would be when you have a continuous variable x and all observations with values less than 2 on x are failures and all observations with values more than 2 are successes. When you get close to being perfectly determined, small changes in the data can lead to huge changes in the parameters, which would yield the kind of behaviour you found when using -bootstrap-.

First, I would use very few explanatory variables in that dataset. The best case scenario would be when the proportion of "successes" is about 50%. In that case the variance of the dependent variable is at its maximum and the data contain the most information. In that case I might use 4 explanatory variables, maybe even 5. If the proportion of successes is less than 30% or more than 70%, I would use 1, maybe 2, explanatory variables.

Second, I would look at some cross tabulations of your explanatory variables against your dependent variable, to see if you can find some problematic explanatory variables.

Third, I would add the -saving()- option in -bootstrap-. This will save the estimates in each bootstrap sample. Looking at these might help you identify the problem.

Fourth, if the problem is close to perfect determination then you might want to take a look at exact logistic regression (-help exlogistic-).

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
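
The second, third and fourth suggestions in Maarten's reply could be tried along the following lines. This is only a sketch under stated assumptions: the auto dataset shipped with Stata stands in for the real data, the variables foreign, mpg, weight and rep78 are placeholders for the actual dependent and explanatory variables, and reps(200), seed(12345), the file name bsreps and the indicator heavy are arbitrary illustrative choices that do not come from the thread.

```stata
* Sketch only: the auto data and all variable names are stand-ins
sysuse auto, clear

* Second suggestion: cross-tabulate an explanatory variable against the
* dependent variable and look for empty or nearly empty cells
tabulate rep78 foreign

* Third suggestion: save the coefficients from every bootstrap replication
* (reps, seed and the file name bsreps are arbitrary choices)
bootstrap _b, reps(200) seed(12345) saving(bsreps, replace): logit foreign mpg weight

* Inspect the saved replications; extreme values point to resamples in
* which the model was (nearly) perfectly determined
preserve
use bsreps, clear
describe
summarize, detail
restore

* Fourth suggestion: exact logistic regression, shown here with a
* hypothetical binary covariate to keep the computation small
generate byte heavy = weight > 3000
exlogistic foreign heavy
```

If the summary of the saved replications shows a few coefficients that are orders of magnitude larger than the rest, listing those replications is a natural next step. Exact logistic regression can be demanding in time and memory when the covariates take many distinct values, which is why the sketch uses a single binary covariate.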


**Follow-Ups**: **Re: st: Blown up Std. errors in logistic regression with bootstrap** *From:* Stas Kolenikov <skolenik@gmail.com>

**References**: **Re: st: Blown up Std. errors in logistic regression with bootstrap** *From:* Maarten buis <maartenbuis@yahoo.co.uk>
