Re: st: Using the cluster command or GLS random effects?

From	Mark Schaffer <[email protected]>
To	[email protected]
Subject	Re: st: Using the cluster command or GLS random effects?
Date	Fri, 18 Jul 2003 10:55:49 +0100 (BST)

Sandra,

Just a short note to complement what Buzz says.

With -xtreg- you say that pupils within schools give you observations that 
are not independent, and you model this explicitly as your "random effect" 
or "fixed effect".  There are some precise distributional assumptions that 
you are making about the correlation of pupils within schools (e.g., that 
the within-school correlation takes the same form for all schools, and that 
each pupil within a school is correlated equally with any other pupil in 
the school).
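
For concreteness, the random-effects version of these assumptions can be 
written as below.  This is the usual textbook random-intercept model, in 
notation that is an assumption here rather than something from this thread 
(i indexes schools, j and k index pupils, u_i is the school effect, e_ij is 
the pupil-level error).  It is a sketch of the assumptions, not a 
description of everything -xtreg- does internally.

```latex
y_{ij} = x_{ij}\beta + u_i + e_{ij}, \qquad
u_i \sim (0, \sigma_u^2), \quad
e_{ij} \sim (0, \sigma_e^2), \quad
u_i \perp e_{ij}

\operatorname{Corr}(u_i + e_{ij},\; u_i + e_{ik})
  = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
  = \rho
  \quad \text{for every school } i \text{ and every pair of pupils } j \neq k
```

So a single number, rho, is assumed to describe the within-school 
correlation for every pair of pupils in every school.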

With -regress- and -cluster- you don't model this explicitly.  Instead, you 
allow for arbitrary correlation within schools, and the form of this 
correlation can vary from school to school.
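
As a rough illustration of what the cluster adjustment does, here is a 
sketch in Python rather than Stata.  The data are simulated, and the 
function names `simulate` and `ols_with_ses` are hypothetical.  It fits OLS 
to pupils nested in schools and computes the slope SE twice: once with the 
classical formula that assumes independent errors, and once with a 
cluster-robust sandwich that sums x-times-residual within each school 
before squaring.  It leaves out the finite-sample correction that packages 
typically apply, so it should not be expected to reproduce 
-regress, cluster()- output exactly.

```python
import random


def simulate(n_schools=50, n_pupils=20, beta=1.0, seed=12345):
    """Return (school, x, y) rows with a shared shock per school (made-up data)."""
    rng = random.Random(seed)
    rows = []
    for school in range(n_schools):
        x = rng.gauss(0.0, 1.0)  # regressor varies only between schools
        u = rng.gauss(0.0, 1.0)  # school effect shared by all its pupils
        for _ in range(n_pupils):
            e = rng.gauss(0.0, 1.0)  # pupil-specific noise
            rows.append((school, x, beta * x + u + e))
    return rows


def ols_with_ses(rows):
    """OLS of y on a constant and x; return slope, classical SE, clustered SE."""
    n = len(rows)
    xbar = sum(x for _, x, _ in rows) / n
    ybar = sum(y for _, _, y in rows) / n
    sxx = sum((x - xbar) ** 2 for _, x, _ in rows)
    slope = sum((x - xbar) * (y - ybar) for _, x, y in rows) / sxx
    intercept = ybar - slope * xbar

    rss = 0.0
    school_scores = {}  # per-school sum of (x - xbar) * residual
    for school, x, y in rows:
        resid = y - intercept - slope * x
        rss += resid ** 2
        school_scores[school] = school_scores.get(school, 0.0) + (x - xbar) * resid

    # Classical SE: treats all n residuals as independent with equal variance.
    classical_se = (rss / (n - 2) / sxx) ** 0.5

    # Cluster-robust SE: residuals may be arbitrarily correlated within a
    # school, so scores are summed by school before squaring.
    # No small-sample correction is applied in this sketch.
    clustered_se = sum(s ** 2 for s in school_scores.values()) ** 0.5 / sxx
    return slope, classical_se, clustered_se


rows = simulate()
slope, classical_se, clustered_se = ols_with_ses(rows)
print(f"slope={slope:.3f}  classical SE={classical_se:.3f}  "
      f"clustered SE={clustered_se:.3f}")
```

With a regressor that varies only between schools and a sizeable school 
effect, as assumed here, the clustered SE should come out well above the 
classical one, because the classical formula counts every pupil as an 
independent piece of information.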

Trade-offs:

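-xtreg- gives you more efficient estimates if your modelling of the 
correlation caused by clustering is correct.  If it isn't, your SEs are 
wrong, and if the school effect is also correlated with your regressors, 
the random-effects coeffs are inconsistent too.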

-regress- with -cluster- gives you consistent estimates across a broad 
range of possible forms of the correlation, but they won't be as efficient 
as when you know the exact form (and you're right).  (This is why your SEs 
for -regress- are bigger.)

Hope this helps.

--Mark

Quoting Buzz Burhans <[email protected]>:

> Sandra,
> 
> The "two approaches do the same thing" in the sense that they both
> account for the lack of independence of pupils within schools; however,
> they provide different estimates for error.  Remember two issues with
> this type of data.  First, the pupils are not independent within school,
> i.e. pupils from the same school are likely to be more similar than
> pupils from different schools for all kinds of putative reasons such as
> socio-economic factors or school educational management factors, and
> this lack of independence within school engenders the need for a
> cluster approach whether by xtreg or the cluster option.
> 
>   Secondly, the "error" or noise associated with a fitted value for a
> pupil contains at least two components, one for the unique noise
> associated with the pupil, and another for the noise due to variance
> between schools.  Regress with the cluster option relaxes the
> assumption of independence and therefore, compared with regress
> without the cluster option, typically increases the estimated standard
> errors to accommodate the violation of the assumption that the errors
> are independent, but it leaves both the noise associated with
> differences between pupils and the noise associated with differences
> between schools in the error term.  Xtreg is different.  In the "random
> effect" model, xtreg fits an additional parameter, the Ui term, or
> random school term, which accounts for the differences between
> schools.  Thus the residual error term contains the "within school"
> variance between pupils, but the between school portion (which
> remained in the regress model) is now removed and accounted for by the
> weighted "between" estimator, and thus the error is reduced.
> 
> Buzz Burhans
> [email protected]
> 
> At 01:16 PM 7/17/03 +0100, you wrote:
> >Dear all,
> >
> >I am using a repeated cross-section of pupil-level data to regress exam
> >attainment on various characteristics. Since pupils are clustered in
> >particular schools, I need to correct the standard errors for clustering
> >at school-level.
> >
> >I could adopt one of the following approaches:
> >
> >regress Y X, cluster(school)
> >xtreg Y X, re i(school)
> >
> >So the first approach corrects standard errors by using the cluster()
> >option. The second approach uses a random effects GLS approach.
> >
> >I thought that the two approaches do the same thing and should give the
> >same results. However, I find that the standard errors are a lot smaller
> >using the second approach.
> >
> >Does anyone know how the two approaches differ from one another?
> >
> >Thanks,
> >
> >Sandra
> >
> >
> >*
> >*   For searches and help try:
> >*   http://www.stata.com/support/faqs/res/findit.html
> >*   http://www.stata.com/support/statalist/faq
> >*   http://www.ats.ucla.edu/stat/stata/
> 

Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: [email protected]
web: http://www.sml.hw.ac.uk/ecomes
