Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Austin Nichols <austinnichols@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: Regression with cluster error hangs program |
Date | Tue, 11 Feb 2014 09:30:55 -0500 |
Gui Deng Say <sayxx008@umn.edu>:
You just have very large matrices to invert: i.firm##c.yrdiff adds roughly two parameters per firm, so on the order of 9800 regressors for about 4900 firms. I think you are right to cluster, since that will address arbitrary serial correlation. But since you cannot really interpret your fixed effects or panel-specific trend estimates, perhaps you should demean and detrend first, then run your regression. Instead of

  regress DV IV i.firm##c.yrdiff, cl(firm)

try

  egen id=group(firm)
  su id, mean
  loc max=r(max)
  g double d_DV=.
  g double d_IV=.
  forv i=1/`max' {
    foreach v in DV IV {
      cap reg `v' yrdiff if id==`i'
      * if the regression failed (e.g. too few observations), skip this firm
      * so that predict does not reuse the previous firm's estimates
      if _rc continue
      predict double e, res
      replace d_`v'=e if id==`i'
      drop e
    }
  }
  reg d_DV d_IV, cl(firm)

The loops can be sped up with a little algebra and the by: prefix; that is, you can calculate the demeaned and detrended variables using generate instead of regress, predict, and replace, but you have to do a little work to figure it out:

http://www.stata.com/statalist/archive/2008-10/msg00136.html
http://www.stata.com/statalist/archive/2011-09/msg00855.html
http://www.stata.com/statalist/archive/2011-01/msg00662.html
http://www.stata.com/statalist/archive/2008-04/msg00967.html

On Tue, Feb 11, 2014 at 7:47 AM, Alfonso Sanchez-Penalver <alfonso.statalist@gmail.com> wrote:
> Hi Seb,
>
> A couple of comments. First, if you want both the main effects and the
> interaction effect you can write -i.firm##c.yrdiff- instead of having
> to write things twice.
>
> My second question is why you expect further correlation of the errors
> by firm, which is what clustering the variance corrects for. By further
> correlation I mean that you are already accounting for differences in
> the unobserved means by firm by introducing the firms' dummies, so how
> would the errors be correlated within the firms now that they don't
> have differences in values?
>
> Lastly, I would suggest using no constant in your regression since you
> have both firm fixed effects and firm-specific trends.
>
> I hope this helps,
>
> Alfonso Sanchez-Penalver
>
>> On Feb 11, 2014, at 4:03 AM, Gui Deng Say <sayxx008@umn.edu> wrote:
>>
>> Hi,
>> I am using Stata13MP and I have two questions regarding OLS
>> regressions. I have an unbalanced firm-year panel consisting of 35k
>> observations, about 4900 firms.
>>
>> I am trying to estimate the following model.
>>
>> regress DV IV i.firm yrdiff i.firm#c.yrdiff
>>
>> where yrdiff is a time counter variable, measured relative to a
>> particular year. The reason I'm using i.firm#c.yrdiff is to control
>> for firm-specific time trends.
>>
>> q1. Firstly, estimating this model takes very long, about 2 hours. Is
>> this normal? If not, what might be the reason(s)?
>>
>> q2. Secondly, I tried to cluster the standard errors by firm, i.e. I
>> tried this model
>> regress DV IV i.firm yrdiff i.firm#c.yrdiff, vce(cluster firm)
>>
>> This regression kept running...and in the end, the Stata program
>> freezes. Any ideas?
>>
>> Many thanks,
>> Seb
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
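
The reply at the top of this thread says the regress/predict/replace loop can be replaced by "a little algebra and the by: prefix" but leaves the details to the linked posts. One way that algebra might look is sketched below. It is an illustration, not the code from those posts: it uses by: with egen rather than generate alone, the data are simulated, and the helper variables (touse, mx, xc, sxx, mv, vc, sxy) are made-up names. Only firm, yrdiff, DV, IV, d_DV and d_IV come from the thread. Within each firm, the residual from regressing v on yrdiff is (v - mean(v)) - b*(yrdiff - mean(yrdiff)), where b is the within-firm slope.

```stata
* Simulated stand-in for the firm-year panel (assumption: 20 firms, 10 years each)
clear
set seed 12345
set obs 200
gen firm = ceil(_n/10)
bysort firm: gen yrdiff = _n - 5
gen double IV = rnormal() + 0.1*firm + 0.05*firm*yrdiff
gen double DV = 2*IV + rnormal() + firm - 0.02*firm*yrdiff

* Use one common sample so DV and IV are detrended on the same observations
gen byte touse = !missing(DV, IV, yrdiff, firm)

* Within-firm centered trend and its sum of squares
bysort firm: egen double mx = mean(yrdiff) if touse
gen double xc = yrdiff - mx
bysort firm: egen double sxx = total(xc^2)

foreach v in DV IV {
    * within-firm mean and centered value of the variable
    bysort firm: egen double mv = mean(`v') if touse
    gen double vc = `v' - mv
    * within-firm cross product; sxy/sxx is the firm-specific slope on yrdiff
    bysort firm: egen double sxy = total(xc*vc)
    * residual from the firm-specific regression on a constant and yrdiff;
    * missing when sxx is zero (a single usable observation or no variation)
    gen double d_`v' = vc - (sxy/sxx)*xc
    drop mv vc sxy
}
drop mx xc sxx

regress d_DV d_IV, vce(cluster firm)
```

When both regressions use the same observations, the coefficient on d_IV should equal the coefficient on IV from the full i.firm##c.yrdiff regression. The clustered standard errors need not match exactly, since the finite-sample adjustment depends on how many regressors each model counts.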