
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


st: Regression with about 5000 (dummy) variables

From   Suryadipta Roy <>
Subject   st: Regression with about 5000 (dummy) variables
Date   Thu, 19 Apr 2012 10:39:43 -0400

Dear Statalisters,

I am trying to run a fixed-effects panel regression that has more
than 4000 dummies (based on theory in the gravity model literature in
international economics), and hence close to 5000 variables in the
regression. The coefficients of the dummy variables are not of any
interest. The code is as follows:

    xtreg y x1 x2...... imp_time_* exp_time_*, fe cluster(panelvar)

where panelvar has been set using -xtset-, and imp_time_* and
exp_time_* are the importer-time and exporter-time fixed effects
respectively. However, the regression ran for close to 2 hours
without producing any result, at which point I stopped it using
-Break-. I had set the memory to 5000m and the matsize to 5000 using
-set-.

My Stata version is Stata/SE 11.2 for Windows (64-bit x86-64).
My PC specification: Processor: Intel Core i5-2430M CPU @ 2.40GHz;
RAM: 8 GB, on a 64-bit OS.

I would greatly appreciate some help in finding out whether it is
normal for Stata to take this much time (or more) in the presence of a
large number of variables, and whether there is a way to accomplish
the task faster. The gravity literature has suggested a couple of ways
to do this without the dummy variable approach, but I would like to
know if there is a better way to do it if I persist with the dummy
variables. Any help is greatly appreciated.

Best regards,
