



Re: st: Blown up Std. errors in logistic regression with bootstrap


From   Stas Kolenikov <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Blown up Std. errors in logistic regression with bootstrap
Date   Mon, 13 Dec 2010 10:55:51 -0500

A cheat to get out of that is to bootstrap the positive and zero
cases independently with the -strata()- option. This will make your
results conditional on the total # of positive outcomes, somewhat in
the spirit of -clogit- models. Of course, getting more cases would be a
much better solution, as the stratified bootstrap will essentially
underestimate your variability by the contribution of the variability
in the # of positive outcomes. Think about the total variance formula
with some conditional expectations:

Var[ beta-hat ] = E[ Var[ beta-hat | # of 1's ] ] + Var[ E[ beta-hat | # of 1's ] ]
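For concreteness, the stratified resampling described above might be
invoked along the following lines. This is only a sketch: the -auto-
data, the model, the number of replications and the seed are stand-ins
chosen for illustration, not the setup from the original question.

```stata
* Sketch only: the -auto- data and this model are illustrative stand-ins.
sysuse auto, clear

* Resample separately within the 0's and the 1's of the outcome, so that
* every bootstrap replicate has the same # of positive outcomes as the
* original sample.
bootstrap _b, strata(foreign) reps(200) seed(12345): logit foreign mpg weight
```

Because the # of 1's is held fixed in every replicate, the reported
standard errors are conditional on that count.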

The bootstrap with a fixed # of 1's estimates only the inner variance
in the first term. If you are nearing perfect prediction, the
contribution of the second term will arguably be a small component.
You can assess their relative magnitudes via

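* predicted probability of a positive outcome (-pr- is the default after -logit-)
predict prob, pr
* conditional variance of the binary outcome given the covariates
gen var_coutcome = prob*(1-prob)
sum var_coutcome prob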

The variance of -prob- will give the relative contribution of the
first term, and the mean of -var_coutcome-, the relative contribution
of the second term. (-sum- reports the standard deviation, so square it
to get the variance of -prob-.) This derivation assumes that the
information contained in the positive outcomes is spread equally among
the betas, which of course is a pretty strong assumption.
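
If you would rather have the two numbers side by side than read them
off the -sum- table, something like the following should do it. Again a
sketch only: the -auto- data and the model are illustrative stand-ins,
and the scalar names are made up for this example.

```stata
* Sketch only: the -auto- data and this model are illustrative stand-ins.
sysuse auto, clear
logit foreign mpg weight

* predicted probabilities and the implied conditional outcome variance
predict prob, pr
gen var_coutcome = prob*(1-prob)

* -summarize- leaves the variance in r(Var) and the mean in r(mean)
quietly summarize prob
scalar v_prob = r(Var)
quietly summarize var_coutcome
scalar m_cvar = r(mean)

display "variance of prob     = " v_prob
display "mean of var_coutcome = " m_cvar
display "ratio (first/second) = " v_prob/m_cvar
```

A mean of -var_coutcome- that is small relative to the variance of
-prob- is what you would expect to see when the model is close to
perfect prediction.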


On Mon, Dec 13, 2010 at 10:00 AM, Michael Wahman
<Michael.Wahman@svet.lu.se> wrote:
> Thank you very much for your excellent advice! I really appreciate it. As
> you suggested, I used the noisily option to diagnose the problem. It turned
> out to be exactly the way you suspected. Most of the models were close to
> being perfectly determined.

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

