

From: Maarten buis <maartenbuis@yahoo.co.uk>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Blown up Std. errors in logistic regression with bootstrap

Date: Mon, 13 Dec 2010 13:47:27 +0000 (GMT)

--- On Mon, 13/12/10, Michael Wahman wrote:

> I am doing a study with two different logistic models, where n is fairly
> small. In one of the models n is somewhat bigger (n=107) and one model has
> a smaller n (n=51). I want to use robust bootstrapped standard errors to
> compensate for the small n, especially in the second model. I've understood
> that it is problematic to use MLE when the number of d.f.s is small, since
> this model might not be asymptotic.
>
> I have experimented with bootstraps, but the standard errors in the model
> become huge. This seems to be associated with the models with a small
> number of d.f.s. If I run the models with a higher n with a bootstrap, I
> don't get this problem. Neither do I get it when excluding the control
> variables.

Sounds to me like the model is close to being perfectly determined. An example of perfect determination would be when you have a continuous variable x and all observations with values less than 2 on x are failures and all observations with values more than 2 are successes. When you get close to being perfectly determined, small changes in the data can lead to huge changes in the parameters, which would yield the kind of behaviour you found when using -bootstrap-.

First, I would use very few explanatory variables in that dataset. The best-case scenario would be when the proportion of "successes" is about 50%: in that case the variance of the dependent variable is at its maximum and the data contains the most information, so I might use 4 explanatory variables, maybe even 5. If the proportion of successes is less than 30% or more than 70%, I would use 1, maybe 2, explanatory variables.

Second, I would look at some cross-tabulations of your explanatory variables against your dependent variable, to see if you can find some problematic explanatory variables.

Third, I would add the -saving()- option in -bootstrap-. This will save the estimates in each bootstrap sample.
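To make the cross-tabulation and -saving()- suggestions concrete, here is a minimal sketch; the variable names y, x1, and x2 are hypothetical placeholders for your dependent and explanatory variables:

```stata
* cross-tabulate each explanatory variable against the outcome;
* empty cells are a warning sign of (near-)perfect determination
tabulate y x1
tabulate y x2

* bootstrap the logit coefficients, saving the estimates from each
* bootstrap sample to a dataset for later inspection
bootstrap _b, reps(1000) saving(bsresults, replace): logit y x1 x2

* inspect the saved estimates; a few extreme replicates point to
* bootstrap samples where the model was (nearly) perfectly determined
use bsresults, clear
summarize
```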
Looking at these might help you identify the problem.

Fourth, if the problem is close to perfect determination, then you might want to take a look at exact logistic regression (-help exlogistic-).

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
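A minimal sketch of the -exlogistic- suggestion above, again with the hypothetical variable names y, x1, and x2; exact logistic regression is computationally feasible only for small datasets, which fits the n=51 case:

```stata
* exact logistic regression, which does not rely on large-sample
* asymptotics and remains usable under (near-)perfect determination
exlogistic y x1 x2
```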

**Follow-Ups**:
- **Re: st: Blown up Std. errors in logistic regression with bootstrap**
  *From:* Michael Wahman <Michael.Wahman@svet.lu.se>

**References**:
- **st: Blown up Std. errors in logistic regression with bootstrap**
  *From:* Michael Wahman <Michael.Wahman@svet.lu.se>
