The Stata listserver

Re: st: Using the cluster command or GLS random effects?

From   Buzz Burhans
Subject   Re: st: Using the cluster command or GLS random effects?
Date   Thu, 17 Jul 2003 10:43:10 -0400


The two approaches "do the same thing" in the sense that both account for the lack of independence of pupils within schools; however, they provide different estimates of the error. Remember two issues with this type of data. First, the pupils are not independent within school: pupils from the same school are likely to be more similar than pupils from different schools, for all kinds of putative reasons such as socio-economic factors or school educational management factors. This lack of independence within school creates the need for a cluster approach, whether by xtreg or by the cluster option.

Secondly, the "error" or noise associated with a fitted value for a pupil contains at least two components: one for the unique noise associated with the pupil, and another for the noise due to variance between schools. Regress with the cluster() option relaxes the assumption that the errors are independent within school. Compared with regress without the cluster option, it typically increases the standard errors to accommodate the violation of that assumption, but it leaves both the noise associated with differences between pupils and the noise associated with differences between schools in the error term.

Xtreg is different. In the "random effects" model, xtreg fits an additional parameter, the u_i term (the random school term), which accounts for the differences between schools. The residual error term then contains only the "within school" variance between pupils; the between-school portion (which remained in the error term of the regress model) is removed and accounted for by the weighted "between" estimator, and thus the error is reduced.
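
To see the two error components side by side, here is a minimal do-file sketch on simulated data. Everything in it is an assumption made for illustration (50 schools, 30 pupils each, the variable names, the variances, the seed); it is not the poster's data. It uses rnormal(), vce(cluster) and xtset, which need a more recent Stata than the one current when this thread was written.

```stata
* Hypothetical simulated data: y = 1 + 0.5*x + u_i + e_ij
clear
set seed 12345
set obs 50                              // 50 made-up schools
generate school = _n
generate u = 2*rnormal()                // school-level component u_i (sd 2)
expand 30                               // 30 pupils per school
generate x = rnormal()                  // pupil-level covariate
generate y = 1 + 0.5*x + u + rnormal()  // pupil-level noise e_ij (sd 1)

* Approach 1: OLS coefficients with cluster-robust standard errors.
* Both u_i and e_ij stay in the residual.
regress y x, vce(cluster school)

* Approach 2: random-effects GLS.
* The school component is estimated separately from the pupil-level residual.
xtset school
xtreg y x, re
```

The xtreg output reports sigma_u, sigma_e and rho, which is the split into between-school and within-school noise described above. In a setup like this one, where x varies within schools, one would expect the Root MSE from regress to sit well above the sigma_e from xtreg, and the standard error on x to be smaller under xtreg.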

Buzz Burhans

At 01:16 PM 7/17/03 +0100, you wrote:

Dear all,

I am using a repeated cross-section of pupil-level data to regress exam
attainment on various characteristics. Since pupils are clustered in particular
schools, I need to correct the standard errors for clustering at school-level.

I could adopt one of the following approaches:

regress Y X, cluster(school)
xtreg Y X, re i(school)

So the first approach corrects standard errors by using the cluster command.
The second approach uses a random effects GLS approach.

I thought that the two approaches do the same thing and should give the
same results. However, I find that the standard errors are a lot smaller
using the second approach.

Does anyone know how the two approaches differ from one another?


