David Jacobs

Re: st: Efficient way to run regressions with many dummy variables?

Mon, 27 Apr 2009 16:58:52 -0400

Look up the Stata routine called -areg-.

Dave Jacobs

At 03:32 PM 4/27/2009, you wrote:

Dear Stata-list,

I am using a data set of 963,966 observations, with 26 variables (after dropping all variables not needed for my estimation). The observations are dyadic: I have in fact (1400 squared)/2 pairs of observations (divided by 2 because the relationship is nondirectional), and so in the regressions I need to control for 1400*2 dummy variables.

I run a regression of the form:

    xi: reg y x1 x2 x3 i.observation1 i.observation2

where my dataset consists of dyadic relationships between each observation1 and each observation2. The problem I run into is that each regression takes an incredibly long time (and the server crashes regularly).

In an alternative regression, I use Fafchamps and Gubert's NGREG. I run:

    xi: ngreg y x1 x2 x3, id(observation1 observation2)

This also takes an incredibly long time.

My question is: is there a more efficient way to run regressions in Stata with such an enormous number of dummy variables?

PS: I do not care about the coefficients on the dummies per se.

Thank you very much in advance for your response.

Pauline

--
Pauline Grosjean
Ciriacy Wantrup Fellow, Department of Agricultural and Resource Economics
University of California Berkeley
Web page: http://are.berkeley.edu/~pgrosjean/
Mobile: 510 384 0141
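
As a rough illustration of the -areg- suggestion applied to the regression in the question, here is a minimal sketch on simulated data. Everything in it is an assumption made for the example: the toy data, the seed, the coefficient values, and the short names id1 and id2, which stand in for observation1 and observation2. Note that -areg- absorbs only one categorical variable, so the second identifier still has to enter as explicit dummies; this roughly halves the number of dummies rather than removing them.

```stata
* Toy dyadic data set; all names and values here are hypothetical.
* id1 and id2 stand in for observation1 and observation2 in the post.
clear
set seed 12345
set obs 900
generate id1 = ceil(_n/30)
generate id2 = mod(_n - 1, 30) + 1
* Keep one row per unordered pair.
keep if id1 < id2
generate x1 = rnormal()
generate x2 = rnormal()
generate x3 = rnormal()
generate y  = 1 + 0.5*x1 - 0.3*x2 + 0.2*x3 + id1/10 + id2/20 + rnormal()

* Approach from the question: explicit dummies for both identifiers.
xi: regress y x1 x2 x3 i.id1 i.id2
scalar b_full = _b[x1]

* -areg- absorbs one identifier; the other still enters as dummies.
xi: areg y x1 x2 x3 i.id2, absorb(id1)
scalar b_areg = _b[x1]

* The two slopes on x1 should agree up to numerical precision.
display b_full - b_areg
```

With the variable names from the question, the second call would presumably read `xi: areg y x1 x2 x3 i.observation2, absorb(observation1)`. Whether that is fast enough on 963,966 observations is not something this sketch shows.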

