Statalist



st: Efficient way to run regressions with many dummy variables?


From   Pauline Grosjean <[email protected]>
To   [email protected]
Subject   st: Efficient way to run regressions with many dummy variables?
Date   Mon, 27 Apr 2009 12:32:43 -0700 (PDT)

Dear Stata-list,

I am using a data set of 963,966 observations with 26 variables (after
dropping all variables not needed for my estimation). The observations are
dyadic: the data set consists of roughly (1,400 squared)/2 pairs (divided
by 2 because the relationship is non-directional), so in the regressions I
need to control for about 1,400*2 dummy variables. I run a regression of
the form:
xi: reg y x1 x2 x3 i.observation1 i.observation2
where the data set contains one dyadic observation for each pair of
observation1 and observation2.
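
(To illustrate the structure, here is a minimal toy sketch of how such a
dyadic data set can be built; the file name units.dta and the round unit
count of 1,400 are hypothetical:)

* build all non-directional pairs of n units
clear
set obs 1400
gen long id = _n
save units, replace
rename id observation1
cross using units
rename id observation2
keep if observation1 < observation2   // keep each unordered pair once
* this leaves n*(n-1)/2 dyads, and -xi- still has to expand roughly
* 2n indicator variables, one set for each member of the pair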

The problem I run into is that each regression takes an incredibly long
time (and the server crashes regularly).

As an alternative, I use Fafchamps and Gubert's -ngreg-, running:
xi: ngreg y x1 x2 x3, id(observation1 observation2)
This also takes an incredibly long time.

My question is: is there a more efficient way to run regressions in Stata
with such an enormous number of dummy variables?
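
(For reference, the closest approach I know of is -areg-, which can absorb
only one set of fixed effects at a time; a minimal sketch with my variable
names would be something like:
xi: areg y x1 x2 x3 i.observation2, absorb(observation1)
but this still forces -xi- to expand roughly 1,400 indicators for the
second pair member.)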

PS: I do not care about the coefficients on the dummies per se.

Thank you very much in advance for your response.

Pauline

-- 
Pauline Grosjean
Ciriacy Wantrup Fellow, Department of Agricultural and Resource Economics
University of California Berkeley
Web page: http://are.berkeley.edu/~pgrosjean/
Mobile: 510 384 0141

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


