Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Regression with cluster error hangs program


From   Alfonso Sanchez-Penalver <alfonso.statalist@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Regression with cluster error hangs program
Date   Tue, 11 Feb 2014 15:35:52 -0500

Hi Seb,

As I'm sure you know, the problem is collinearity. The firm dummies sum to a vector of ones, which is exactly the constant term, and that is why Stata has to drop one firm dummy in your estimation. Since you're interested in the firm-specific trends, I would just drop the constant and include i.firm##c.yrdiff in your pooled OLS regression (-regress-).
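
A minimal sketch of that specification, using the variable names from this thread (DV, IV, firm, yrdiff). The ibn. prefix is an assumption added here, not something stated above: it asks Stata to keep every firm level rather than omit a base category, which is only identified once -noconstant- removes the overall intercept.

```stata
* Sketch only: assumes DV, IV, firm, and yrdiff are in memory and firm is numeric.
* ibn.firm          -> one intercept per firm (no base level omitted)
* ibn.firm#c.yrdiff -> one yrdiff slope per firm
* noconstant        -> no overall intercept, so the firm dummies are not
*                      collinear with a constant term
regress DV IV ibn.firm ibn.firm#c.yrdiff, noconstant
```

With about 4900 firms this still estimates roughly two coefficients per firm, so it is not expected to be fast. Check the output for terms flagged as omitted to see what Stata actually dropped.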

I'm not surprised that -xtreg, fe- was much faster: it demeans the variables by firm, so it does not need to include the firm dummies. That reduces the size of the matrices and makes inverting them much faster. If you don't need the firm-specific effects, this is a much more efficient way of estimating the model.
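
To illustrate the demeaning, here is a rough sketch that runs -xtreg, fe- and then reproduces the slope by hand. It assumes the thread's variables DV, IV, and firm are in memory. The generated names (DV_bar, IV_bar, DV_w, IV_w) are hypothetical, and the firm-specific trend is left out to keep the example short.

```stata
* Sketch only: assumes DV, IV, and firm exist and firm is numeric.
xtset firm
xtreg DV IV, fe

* Same slope by hand: subtract each firm's mean from every variable,
* then run OLS on the demeaned data without a constant.
preserve
keep if !missing(DV, IV, firm)
bysort firm: egen double DV_bar = mean(DV)
bysort firm: egen double IV_bar = mean(IV)
generate double DV_w = DV - DV_bar
generate double IV_w = IV - IV_bar
regress DV_w IV_w, noconstant
restore
```

The coefficient on IV_w should match the -xtreg, fe- coefficient on IV. The standard errors from the hand-rolled regression will differ, because -regress- does not know that one mean per firm was estimated and so does not adjust the degrees of freedom.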

I don't know if you mentioned this before, but how many firms do you have, and how many years per firm? Is it a balanced or an unbalanced panel?

Let me know if I made my point clear.

Alfonso Sanchez-Penalver

> On Feb 11, 2014, at 3:20 PM, Gui Deng Say <sayxx008@umn.edu> wrote:
> 
> Hi Alfonso, thanks for your reply and for your time. I see what you mean.
> 
> Could you elaborate on the noconstant point, and why the constant is
> not necessary?
> 
> At this point I do not need the coefficients for the dummies. But I do
> need to take into account the time trend interaction to see if it's
> driving my results.
> 
> To save some time, I tried xtreg, fe without including i.firm but
> still including the i.firm#yrdiff. This cuts the waiting time by about
> 20 percent. However, when I tried to cluster by firm, the software
> once again kept running and hung.
> 
> I'm wondering why Stata crashes whenever I try to cluster errors by
> firm. Or, more broadly, under what circumstances does Stata hang
> when running regressions?
> 
> Best,
> Seb
> 
>> On Feb 11, 2014 6:48 AM, "Alfonso Sanchez-Penalver"
>> <alfonso.statalist@gmail.com> wrote:
>>> 
>>> Hi Seb,
>>> 
>>> A couple of comments. First, if you want both the main effects and the
>>> interaction effect you can write -i.firm##c.yrdiff- instead of having to
>>> write things twice.
>>> 
>>> Second, why do you expect further correlation of the errors by firm,
>>> which is what clustering the variance corrects for? By further
>>> correlation I mean that you are already accounting for differences in the
>>> unobserved means by firm by introducing the firms' dummies, so how would
>>> the errors be correlated within the firms now that they don't have
>>> differences in values?
>>> 
>>> Lastly, I would suggest using no constant in your regression since you
>>> have both firm fixed effects and firm-specific trends.
>>> 
>>> I hope this helps,
>>> 
>>> Alfonso Sanchez-Penalver
>>> 
>>>> On Feb 11, 2014, at 4:03 AM, Gui Deng Say <sayxx008@umn.edu> wrote:
>>>> 
>>>> Hi,
>>>>   I am using Stata13MP and I have two questions regarding OLS
>>>> regressions. I have an unbalanced firm-year panel consisting of 35k
>>>> observations, about 4900 firms.
>>>> 
>>>> I am trying to estimate the following model.
>>>> 
>>>> regress DV IV i.firm yrdiff i.firm#c.yrdiff
>>>> 
>>>> where yrdiff is a time counter variable, measured relative to a
>>>> particular year. The reason I'm using i.firm#c.yrdiff is to control
>>>> for firm-specific time trends.
>>>> 
>>>> q1. First, estimating this model takes very long, about 2 hours. Is
>>>> this normal? If not, what might be the reason(s)?
>>>> 
>>>> q2. Second, I tried to cluster the standard errors by firm, i.e. I
>>>> tried this model:
>>>> regress DV IV i.firm yrdiff i.firm#c.yrdiff, vce(cluster firm)
>>>> 
>>>> This regression kept running, and in the end Stata froze. Any ideas?
>>>> 
>>>> Many thanks,
>>>> Seb
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
