Re: st: Regression with about 5000 (dummy) variables

From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Regression with about 5000 (dummy) variables
Date: Thu, 19 Apr 2012 11:16:29 -0400

John Antonakis <[email protected]>:
The poster asked about multiple dimensions of fixed effects--how does
the advice below relate?
The approach shown adds the cluster means as extra regressors, so it
actually increases the size of the matrix to be inverted.
You assert that
"This will save you on degrees of freedom and computational requirements."
--can you clarify that claim?
Your
 xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)
is nearly the same as
 xtreg y x1-x4, fe robust
right? Note that inference is not identical, as the RE estimator
does not "know" the means are estimated.
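
The near-equivalence asked about above can be checked numerically. What
follows is a minimal sketch in Python (not Stata, and not part of the
original exchange), under stated assumptions: the data are simulated, all
names (panel_mean, b_fe, b_mu, and so on) are hypothetical, there are two
regressors rather than x1-x4, and the Mundlak regression is fit by pooled
OLS rather than by the RE GLS that -xtreg- uses by default. Under pooled
OLS the coefficients on the x's match the within (FE) estimates up to
rounding, even though the Mundlak regression carries more columns.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Simulated panel (an assumption for illustration): x1 is correlated with
# the unobserved panel effect u, so pooled OLS without the means is biased.
random.seed(1)
n_panels, n_periods = 50, 6
panel, x1, x2, y = [], [], [], []
for g in range(n_panels):
    u = random.gauss(0, 1)
    for t in range(n_periods):
        a = u + random.gauss(0, 1)
        b = random.gauss(0, 1)
        panel.append(g)
        x1.append(a)
        x2.append(b)
        y.append(1.0 + 2.0 * a - 1.0 * b + 3.0 * u + random.gauss(0, 1))

def panel_mean(v):
    """Panel-level mean of v, repeated for every observation in the panel."""
    sums, counts = {}, {}
    for g, val in zip(panel, v):
        sums[g] = sums.get(g, 0.0) + val
        counts[g] = counts.get(g, 0) + 1
    return [sums[g] / counts[g] for g in panel]

m1, m2, my = panel_mean(x1), panel_mean(x2), panel_mean(y)

# Within (FE) estimator: demean y and the x's by panel, no constant (2 columns).
X_fe = [[a - ma, b - mb] for a, ma, b, mb in zip(x1, m1, x2, m2)]
y_fe = [yi - mi for yi, mi in zip(y, my)]
b_fe = ols(X_fe, y_fe)

# Mundlak regression: constant, x's, and their panel means (5 columns).
X_mu = [[1.0, a, b, ma, mb] for a, b, ma, mb in zip(x1, x2, m1, m2)]
b_mu = ols(X_mu, y)

print("within (FE) :", b_fe)
print("Mundlak x's :", b_mu[1:3])
```

The point estimates agree, but the Mundlak regression is the larger
problem of the two (2k+1 columns against k), and nothing in this sketch
addresses the standard errors, where the two approaches can differ.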

On Thu, Apr 19, 2012 at 10:57 AM, John Antonakis <[email protected]> wrote:
> Hi:
>
> Let me let you in on a trick that is relatively unknown.
>
> One way around the problem of a large number of dummy variables is to use
> the Mundlak procedure:
>
> Mundlak, Y. (1978). Pooling of Time-Series and Cross-Section Data.
> Econometrica, 46(1), 69-85.
>
> For an intuitive explanation, see:
>
> Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making
> causal claims: A review and recommendations. The Leadership Quarterly,
> 21(6), 1086-1120. http://www.hec.unil.ch/jantonakis/Causal_Claims.pdf
>
> Basically, for each time-varying independent variable (x1-x4), take the
> cluster mean and include that in the regression.  That is, do:
>
> foreach var of varlist x1-x4 {
>     bys panelvar: egen cl_`var' = mean(`var')
> }
>
> Then, run your regression like this:
>
> xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)
>
> The Hausman test for fixed- versus random-effects is:
>
> testparm cl_x1-cl_x4
>
> This will save you on degrees of freedom and computational requirements.
> This estimator is consistent.  Try it out with a subsample of your dataset
> to see. Many econometricians have been amazed by this.

