

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Regression with about 5000 (dummy) variables
Date: Thu, 19 Apr 2012 11:16:29 -0400

John Antonakis <John.Antonakis@unil.ch>:

The poster asked about multiple dimensions of fixed effects--how does the advice below relate? The approach shown actually adds to the size of the matrix to be inverted. You assert that "This will save you on degrees of freedom and computational requirements." --can you clarify that claim?

Your

    xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)

is nearly the same as

    xtreg y x1-x4, fe robust

right? Note that inference is not identical, as the RE estimator does not "know" the means are estimated.

On Thu, Apr 19, 2012 at 10:57 AM, John Antonakis <John.Antonakis@unil.ch> wrote:
> Hi:
>
> Let me let you in on a trick that is relatively unknown.
>
> One way around the problem of a huge amount of dummy variables is to use the
> Mundlak procedure:
>
> Mundlak, Y. (1978). Pooling of Time-Series and Cross-Section Data.
> Econometrica, 46(1), 69-85.
>
> ....for an intuitive explanation, see:
>
> Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making
> causal claims: A review and recommendations. The Leadership Quarterly,
> 21(6). 1086-1120. http://www.hec.unil.ch/jantonakis/Causal_Claims.pdf
>
> Basically, for each time varying independent variable (x1-x4), take the
> cluster mean and include that in the regression. That is, do:
>
> foreach var of varlist x1-x4 {
> bys panelvar: egen cl_`var'=mean(`var')
> }
>
> Then, run your regression like this:
>
> xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)
>
> The Hausman test for fixed- versus random-effects is:
>
> testparm cl_x1-cl_x4
>
> This will save you on degrees of freedom and computational requirements.
> This estimator is consistent. Try it out with a subsample of your dataset
> to see. Many econometricians have been amazed by this.
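The comparison discussed above can be tried on simulated data. What follows is only a sketch under stated assumptions: the balanced panel, the single regressor x1, the unit effect a, the time index t, and the scalar names b_fe and b_mundlak are all made up for illustration and do not come from the thread. In a balanced panel like this one, the coefficient on x1 from the random-effects regression with the cluster mean added should match the fixed-effects coefficient up to numerical precision, while the reported standard errors need not agree.

```stata
* Illustrative simulation; every name and number here is an assumption
clear
set seed 12345
set obs 200
gen panelvar = _n
gen a = rnormal()                 // unit effect, constant within panel
expand 5                          // 5 observations per panel (balanced)
bysort panelvar: gen t = _n
gen x1 = a + rnormal()            // regressor correlated with the unit effect
gen y  = 1 + 2*x1 + a + rnormal()

* Cluster mean of the time-varying regressor (the Mundlak term)
bysort panelvar: egen cl_x1 = mean(x1)

xtset panelvar t

* Fixed-effects (within) estimate
xtreg y x1, fe vce(robust)
scalar b_fe = _b[x1]

* Random effects with the cluster mean included
xtreg y x1 cl_x1, re vce(cluster panelvar)
scalar b_mundlak = _b[x1]

* Joint test on the cluster means (here just one)
testparm cl_x1

display "FE: " b_fe "   RE with cluster mean: " b_mundlak
```

Comparing the two sets of standard errors in the output is one way to see the point about inference not being identical. The sketch covers a single dimension of fixed effects only, so it does not speak to the question about multiple dimensions.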

**Follow-Ups**:
- **Re: st: Regression with about 5000 (dummy) variables**, from John Antonakis <John.Antonakis@unil.ch>

**References**:
- **st: Regression with about 5000 (dummy) variables**, from Suryadipta Roy <sroy2138@gmail.com>
- **Re: st: Regression with about 5000 (dummy) variables**, from John Antonakis <John.Antonakis@unil.ch>
