From
Christopher Baum <kit.baum@bc.edu>

To
"statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject
Re: Re: st: Regression with about 5000 (dummy) variables

Date
Fri, 20 Apr 2012 09:33:57 -0400

On Apr 20, 2012, at 2:33 AM, John wrote:

> The estimator still seems good. Notice, though, that the F-test
> numerator DFs are only 3. So that's what I meant when I said we save on
> DF (as compared to the OLS fixed-effects estimator).

You're cheating, John, by overlooking that theorem about the absence of a free lunch. The number of DF is not 3. When you gave the commands

  bys idcode : egen double cl_age_id = mean(age)
  bys south : egen double cl_age_south = mean(age)

you computed a large number of sample means, so each cluster-mean regressor in the Mundlak model must be counted not as a single DF but as as many DF as it took to compute it, following the logic of the FE model.

I could make the same mistake by applying the within transformation to Y and X and running OLS, which would report one DF for the one slope estimated. But if I used the FE estimator, it would properly account for the fact that the within transformation is not free: FE is just the LSDV (dummy-variable) model, and putting in all those dummies is not free. Neither is this method. Your DF should be adjusted to reflect the creation of those cl_* variables.

Kit

Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html
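The degrees-of-freedom point in this message can be illustrated numerically. What follows is a minimal sketch in Python rather than Stata, on simulated data. The function name `within_slope_and_se`, the panel dimensions, and the coefficient values are all hypothetical and are not taken from the thread. It demeans y and x by group, runs OLS without a constant on the demeaned data, and then computes the slope's standard error twice: once pretending the demeaning was free (residual DF = N - 1), and once charging one DF per group mean (residual DF = N - G - 1), which is the count the LSDV model implies.

```python
import math
import random


def within_slope_and_se(y, x, groups):
    """OLS (no constant) of group-demeaned y on group-demeaned x.

    Returns the slope plus two standard errors: a naive one that ignores
    the group means, and one that charges one DF per group mean.
    """
    # Accumulate per-group sums of y and x and the group sizes.
    sums = {}
    for yi, xi, g in zip(y, x, groups):
        s = sums.setdefault(g, [0.0, 0.0, 0])
        s[0] += yi
        s[1] += xi
        s[2] += 1

    # Within transformation: subtract each observation's group mean.
    yd = [yi - sums[g][0] / sums[g][2] for yi, g in zip(y, groups)]
    xd = [xi - sums[g][1] / sums[g][2] for xi, g in zip(x, groups)]

    sxx = sum(a * a for a in xd)
    b = sum(a * c for a, c in zip(xd, yd)) / sxx
    rss = sum((c - b * a) ** 2 for a, c in zip(xd, yd))

    n, n_groups = len(y), len(sums)
    df_naive = n - 1            # one slope only: demeaning treated as free
    df_fe = n - n_groups - 1    # one slope plus one mean per group
    se_naive = math.sqrt(rss / df_naive / sxx)
    se_fe = math.sqrt(rss / df_fe / sxx)
    return b, se_naive, se_fe, df_naive, df_fe


# Simulated panel (hypothetical sizes): 50 groups, 4 observations each.
random.seed(0)
G, T = 50, 4
groups, x, y = [], [], []
for g in range(G):
    alpha = random.gauss(0, 1)          # group effect, correlated with x
    for _ in range(T):
        xi = alpha + random.gauss(0, 1)
        groups.append(g)
        x.append(xi)
        y.append(alpha + 0.5 * xi + random.gauss(0, 1))

b, se_naive, se_fe, df_naive, df_fe = within_slope_and_se(y, x, groups)
print("slope:", b)
print("naive residual DF:", df_naive, "SE:", se_naive)
print("adjusted residual DF:", df_fe, "SE:", se_fe)
```

The slope is identical under both counts. Only the residual DF, and with it the standard error, changes: the naive standard error is too small by a factor of sqrt((N - G - 1) / (N - 1)).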

