
# st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable

| From | To | Subject | Date |
|---|---|---|---|
| Jacob Felson | statalist@hsphsun2.harvard.edu | st: Direction of the effect of the cluster command on the standard error depends on the inclusion of a control variable | Wed, 5 Jan 2011 19:01:58 -0500 |

```
I wonder if anyone might be able to explain the following scenario.
The direction in which the cluster command changes a standard error
depends on whether another control variable is included, and I would
like to know why.  My inquiry is more theoretical than practical: I
am not asking "what should I do?" but simply "why is this happening?"
Let me elaborate below.

Consider the following variables:

y, the dependent variable
x, the independent variable of greatest interest, which is moderately
correlated with y and with z
z, another independent variable, which is correlated with y at about 0.5.

nation, the cluster variable: the data were collected in 4 different
nations by different organizations.

I am examining the standard errors (SE) for the coefficient of
variable x from the following four models:

1. Regress y on x, without clustering on nation.
2. Regress y on x, with clustering on nation.

3. Regress y on x and z without clustering on nation.
4. Regress y on x and z with clustering on nation.

The SE of the coefficient for x is LARGER in model 2 than in model 1.
This suggests there is a positive intracluster correlation.  That is,
the residuals are more similar to each other within nations than we
would expect by chance alone.  I suppose there is a preponderance of
positive residuals in some nations and a preponderance of negative
residuals in other nations.

The SE of the coefficient for x is SMALLER in model 4 than in model 3.
This suggests there is a negative intracluster correlation.  That is,
the residuals are less similar to each other within nations than we
would expect by chance.

So the effect that clustering on nation has on the SE of x depends on
whether a third variable, z, is controlled.  Why is this?

```
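The reasoning in the post, that clustering raises the SE when residuals pile up within nations and lowers it when they offset, can be sketched numerically. What follows is only an illustration under stated assumptions, not a reconstruction of the poster's data or an answer to the question about z: the function names `slope_standard_errors` and `make_data` are hypothetical, the data are synthetic, x is held constant within each nation for simplicity, and the finite-sample factor (M/(M-1)) * ((N-1)/(N-k)) is the one Stata's documentation describes for `regress` with clustering.

```python
from math import sqrt


def slope_standard_errors(x, y, cluster):
    """OLS of y on x with an intercept.

    Returns (slope, conventional SE, cluster-robust SE) for the slope.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]                 # x in deviations from its mean
    sxx = sum(d * d for d in xd)
    slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    k = 2                                         # intercept + slope
    se_conventional = sqrt(sum(e * e for e in resid) / (n - k) / sxx)

    # Cluster-robust "meat": sum the scores xd_i * e_i within each cluster
    # first, then square the cluster totals.
    scores = {}
    for d, e, g in zip(xd, resid, cluster):
        scores[g] = scores.get(g, 0.0) + d * e
    m = len(scores)
    correction = (m / (m - 1)) * ((n - 1) / (n - k))
    se_cluster = sqrt(correction * sum(s * s for s in scores.values())) / sxx
    return slope, se_conventional, se_cluster


def make_data(shared, offsetting, per_nation=4):
    """Synthetic data (hypothetical): 4 nations, x constant within nation.

    `shared` scales an error component common to everyone in a nation;
    `offsetting` scales a component that alternates in sign within a nation.
    """
    x, y, nation = [], [], []
    nation_sign = [1, -1, -1, 1]                  # orthogonal to 1 and to x
    for g in range(4):
        for i in range(per_nation):
            alternating = 1 if i % 2 == 0 else -1
            error = shared * nation_sign[g] + offsetting * alternating
            x.append(float(g))
            y.append(2.0 + 0.5 * g + error)
            nation.append(g)
    return x, y, nation


for label, shared, offsetting in [("residuals pile up within nation", 1.0, 0.1),
                                  ("residuals offset within nation", 0.1, 1.0)]:
    b, se_plain, se_clustered = slope_standard_errors(*make_data(shared, offsetting))
    print(f"{label}: b={b:.3f}  SE={se_plain:.4f}  clustered SE={se_clustered:.4f}")
```

Both synthetic data sets have the same residual sum of squares, so the conventional SE is identical (about 0.24). The clustered SE is larger (about 0.53) when the shared component dominates and smaller (about 0.05) when the offsetting component dominates. The comparison is driven entirely by how the products of demeaned x and the residuals add up within each nation, and those residuals are specific to the model being fitted.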