Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable
Date: Wed, 5 Jan 2011 19:02:08 -0600

There are terrible small-sample biases exhibited by -robust- and -cluster()- standard errors with a small number of observations and clusters, respectively. As was noted by Justina, four clusters is SO far away from asymptotics that I wouldn't even consider the clustered standard errors in your situation.

On Wed, Jan 5, 2011 at 6:01 PM, Jacob Felson <felsonj@gmail.com> wrote:
> I wonder if anyone might be able to provide an explanation for the
> following scenario. I'm wondering why the direction of the change in
> a standard error affected by the use of the cluster command depends on
> whether another control variable is included. My inquiry is more
> theoretical than practical, as I'm not wondering "what I should do"
> but rather, simply "why is this happening?" Let me elaborate below.
>
> Consider the following variables:
>
> y, the dependent variable
> x, the independent variable of greatest interest, which is moderately
> correlated with y and with z
> z, another independent variable, which is correlated with y at about 0.5.
>
> nation - the data was collected in 4 different nations by different
> organizations.
>
>
> I am examining the standard errors (SE) for the coefficient of
> variable x from the following four models:
>
> 1. Regress y on x, without clustering on nation.
> 2. Regress y on x, with clustering on nation.
>
> 3. Regress y on x and z without clustering on nation.
> 4. Regress y on x and z with clustering on nation.
>
>
> The SE of the coefficient for x is LARGER in model 2 than in model 1.
> This suggests there is a positive intracluster correlation. That is,
> the residuals are more similar to each other within nations than we
> would expect by chance alone. I suppose there is a preponderance of
> positive residuals in some nations and a preponderance of negative
> residuals in other nations.
>
> The SE of the coefficient for x is SMALLER in model 4 than in model 3.
> This suggests there is a negative intracluster correlation. That is,
> the residuals are less similar to each other within nations than we
> would expect by chance.
>
>
> So the effect that clustering on nation has on the SE of x depends on
> whether a third variable, z, is controlled. Why is this?
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
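For readers who want to see where the four standard errors in the question come from, here is a minimal sketch in plain Python (not Stata, and not code from this thread). It computes the coefficient on x with its conventional and cluster-robust standard errors, optionally controlling for z by partialling it out (Frisch-Waugh). The function name `slope_ses`, the helper `_residualize`, and the toy data are hypothetical. The sketch assumes the finite-sample correction G/(G-1) * (N-1)/(N-K) that Stata documents for clustered -regress-; in Stata itself the four models would be fit with `regress y x` and `regress y x z`, with and without `vce(cluster nation)`.

```python
import math


def _residualize(v, z=None):
    """Residuals from regressing v on a constant (and on z, if given)."""
    n = len(v)
    vbar = sum(v) / n
    if z is None:
        return [vi - vbar for vi in v]
    zbar = sum(z) / n
    zd = [zi - zbar for zi in z]
    slope = sum(d * (vi - vbar) for d, vi in zip(zd, v)) / sum(d * d for d in zd)
    return [(vi - vbar) - slope * d for vi, d in zip(v, zd)]


def slope_ses(y, x, cluster, z=None):
    """Return (b, se_conventional, se_cluster) for the coefficient on x.

    The model is y = a + b*x (+ c*z) + e. Hypothetical helper for
    illustration only; it handles at most one control variable z.
    """
    n = len(y)
    k = 2 if z is None else 3          # number of estimated coefficients
    xt = _residualize(x, z)            # x with the constant (and z) partialled out
    yt = _residualize(y, z)
    sxx = sum(d * d for d in xt)
    b = sum(d * yi for d, yi in zip(xt, yt)) / sxx
    e = [yi - b * d for yi, d in zip(yt, xt)]   # residuals of the full model

    # Conventional (homoskedastic, independent errors) standard error.
    se_ols = math.sqrt(sum(r * r for r in e) / (n - k) / sxx)

    # Cluster-robust: sum the scores xt_i * e_i within each cluster first,
    # then square the cluster totals.
    scores = {}
    for d, r, g in zip(xt, e, cluster):
        scores[g] = scores.get(g, 0.0) + d * r
    n_clusters = len(scores)
    correction = (n_clusters / (n_clusters - 1)) * ((n - 1) / (n - k))
    se_cluster = math.sqrt(correction * sum(s * s for s in scores.values())) / sxx
    return b, se_ols, se_cluster


# Toy data (hypothetical): two clusters, x balanced within each cluster.
y = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 1.0, 0.0, 1.0]
nation = ["a", "a", "b", "b"]
b, se_ols, se_cl = slope_ses(y, x, nation)
print(b, se_ols, se_cl)
```

In this toy data the residuals are -1, -1, 1, 1, so they are identical within each nation, yet the clustered SE for x is 0 while the conventional SE is about 1.41. The clustered formula depends on within-cluster sums of (partialled-out x) times residual, not on the residuals alone, and both ingredients change when z is added. That is one mechanical reason the comparison can flip. With only two clusters here (or four in the question), the result is in the small-sample regime the reply warns about and should not be read as a reliable estimate.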

**Follow-Ups**: **Re: st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* Austin Nichols <austinnichols@gmail.com>

**References**: **st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable** *From:* Jacob Felson <felsonj@gmail.com>
