Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Austin Nichols <austinnichols@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable

Date: Wed, 5 Jan 2011 20:53:34 -0500

Jacob Felson <felsonj@gmail.com> :
You should have at least 20 clusters, and your smallest cluster should be at least 5% of the data (i.e. 20 balanced clusters, or more unbalanced clusters; see e.g. http://www.stata.com/meeting/13uk/nichols_crse.pdf) to feel comfortable with the cluster-robust SE estimator. But to answer your original question, the residuals are quite different after you include z as a regressor, so the intracluster correlation can also be quite different.

On Wed, Jan 5, 2011 at 8:02 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
> There are terrible small sample biases exhibited by -robust- and
> -cluster()- standard errors with small # of observations and clusters,
> respectively. As was noted by Justina, four clusters is SO far away
> from asymptotics that I wouldn't even consider the clustered standard
> errors in your situation.
>
> On Wed, Jan 5, 2011 at 6:01 PM, Jacob Felson <felsonj@gmail.com> wrote:
>> I wonder if anyone might be able to provide an explanation for the
>> following scenario. I'm wondering why the direction of the change in
>> a standard error affected by the use of the cluster command depends on
>> whether another control variable is included. My inquiry is more
>> theoretical than practical, as I'm not wondering "what I should do"
>> but rather, simply "why is this happening?" Let me elaborate below.
>>
>> Consider the following variables:
>>
>> y, the dependent variable
>> x, the independent variable of greatest interest, which is moderately
>> correlated with y and with z
>> z, another independent variable, which is correlated with y at about 0.5.
>>
>> nation - the data was collected in 4 different nations by different
>> organizations.
>>
>>
>> I am examining the standard errors (SE) for the coefficient of
>> variable x from the following four models:
>>
>> 1. Regress y on x, without clustering on nation.
>> 2. Regress y on x, with clustering on nation.
>>
>> 3. Regress y on x and z without clustering on nation.
>> 4. Regress y on x and z with clustering on nation.
>>
>>
>> The SE of the coefficient for x is LARGER in model 2 than in model 1.
>> This suggests there is a positive intracluster correlation. That is,
>> the residuals are more similar to each other within nations than we
>> would expect by chance alone. I suppose there is a preponderance of
>> positive residuals in some nations and a preponderance of negative
>> residuals in other nations.
>>
>> The SE of the coefficient for x is SMALLER in model 4 than in model 3.
>> This suggests there is a negative intracluster correlation. That is,
>> the residuals are less similar to each other within nations than we
>> would expect by chance.
>>
>>
>> So the effect that clustering on nation has on the SE of x depends on
>> whether a third variable, z, is controlled. Why is this?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
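One rough way to see why the direction can flip is the textbook Moulton approximation. It is not stated anywhere in this thread and is offered here only as a sketch. It assumes balanced clusters of size n and a random-effects-type error structure; the symbols rho_x and rho_e are introduced here for illustration.

```latex
% Approximate ratio of the cluster-adjusted variance to the conventional
% OLS variance for the coefficient on x (balanced clusters of size n):
\[
  \frac{V_{\text{cluster}}(\hat{\beta}_x)}{V_{\text{ols}}(\hat{\beta}_x)}
  \;\approx\; 1 + (n - 1)\,\rho_x\,\rho_e
\]
% \rho_x : intracluster correlation of x (in a multiple regression, roughly
%          that of the part of x left after partialling out the other regressors)
% \rho_e : intracluster correlation of the regression errors
```

Under these assumptions the ratio exceeds 1 when rho_x and rho_e have the same sign and falls below 1 when they have opposite signs. Adding z changes the residuals and also changes the part of x that is left after partialling out z, so the product can change sign between models 1-2 and models 3-4. With only four clusters the cluster-robust estimate is itself very noisy, so the ratio seen in a sample need not follow this approximation closely.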

**Follow-Ups**:

- **RE: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* DE SOUZA Eric <eric.de_souza@coleurope.eu>
- **Antwort: Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* Justina Fischer <JFischer@diw.de>

**References**:

- **st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* Jacob Felson <felsonj@gmail.com>
- **Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* Stas Kolenikov <skolenik@gmail.com>
