Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: John Antonakis <John.Antonakis@unil.ch>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Regression with about 5000 (dummy) variables

Date: Thu, 19 Apr 2012 16:57:27 +0200

Hi: Let me let you in on a trick that is relatively unknown.

....for an intuitive explanation, see:

    foreach var of varlist x1-x4 {
        bys panelvar: egen cl_`var' = mean(`var')
    }

Then, run your regression like this:

    xtreg y x1-x4 cl_x1-cl_x4, cluster(panelvar)

The Hausman test for fixed- versus random-effects is:

    testparm cl_x1-cl_x4
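As a rough illustration of the commands above, the following do-file sketch runs them on simulated data and compares the result with a conventional fixed-effects fit. The data-generating process, the seed, and the names `u`, `t`, `b_mundlak`, and `b_fe` are assumptions made up for this example; only `y`, `x1`, `cl_x1`, and `panelvar` follow the naming used in the message, and `vce(cluster panelvar)` is used as the longer spelling of `cluster(panelvar)`.

```stata
* Hypothetical simulated panel: 200 groups observed for 5 periods each
clear
set seed 12345
set obs 200
generate panelvar = _n
* Unobserved group effect
generate u = rnormal()
expand 5
bysort panelvar: generate t = _n
* Regressor deliberately correlated with the group effect
generate x1 = rnormal() + 0.5*u
generate y = 1 + 2*x1 + u + rnormal()
xtset panelvar t

* Cluster mean of the regressor, as in the loop above
bysort panelvar: egen cl_x1 = mean(x1)

* Random-effects regression that includes the cluster mean
xtreg y x1 cl_x1, re vce(cluster panelvar)
scalar b_mundlak = _b[x1]

* Test of the cluster mean (the fixed- versus random-effects test)
testparm cl_x1

* Conventional fixed-effects estimate, for comparison
xtreg y x1, fe vce(cluster panelvar)
scalar b_fe = _b[x1]

display "with cluster mean: " b_mundlak "   fe: " b_fe
```

In a balanced panel such as this one, the coefficient on `x1` from the first regression should agree with the fixed-effects coefficient up to numerical precision, which is why the approach can stand in for a large set of dummies.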

HTH,
J.

__________________________________________

Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Switzerland
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305
http://www.hec.unil.ch/people/jantonakis

Associate Editor
The Leadership Quarterly

__________________________________________

On 19.04.2012 16:39, Suryadipta Roy wrote:

> Dear Statalisters,
>
> I am trying to run a fixed effects panel regression which has more
> than 4000 dummies (based on theory in the gravity model literature in
> international economics), and hence close to 5000 variables in the
> regression. The coefficients of the dummy variables are not of any
> interest. The code is as follows: xtreg y x1 x2...... imp_time_*
> exp_time_*, fe cluster(panelvar), where panelvar has been set using
> -xtset-, and imp_time and exp_time are importer-time and exporter-time
> fixed effects respectively. However, the regression had run close to 2
> hours without generating any result, at which point I stopped it using
> -Break-. I had set the memory to 5000m, and the matsize to 5000 using
> -set-.
>
> My Stata specification is Stata/SE 11.2 for Windows (64-bit x86-64).
> My PC specification: Processor- Intel Core i5-2430M CPU @ 2.40GHz;
> RAM- 8 GB, in a 64-bit OS.
>
> I would have greatly appreciated some help to find out if this is
> normal for Stata to take this much time (or more) in the presence of a
> large number of variables, and if there is a way to accomplish the
> task faster. The gravity literature has suggested a couple of ways to
> do this without the dummy variable approach, but I was trying to find
> out if there is a better way to do it if I persist with the dummy
> variables. Any help is greatly appreciated.
>
> Best regards,
> Suryadipta.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:

- **Re: st: Regression with about 5000 (dummy) variables** *From:* clivelists@googlemail.com
- **Re: st: Regression with about 5000 (dummy) variables** *From:* Austin Nichols <austinnichols@gmail.com>
- **Re: st: Regression with about 5000 (dummy) variables** *From:* Suryadipta Roy <sroy2138@gmail.com>

**References**:

- **st: Regression with about 5000 (dummy) variables** *From:* Suryadipta Roy <sroy2138@gmail.com>
