Statalist The Stata Listserver


Re: st: appropriateness of cluster option with xtreg, fe

From   "Johannes Schmieder" <[email protected]>
To   [email protected]
Subject   Re: st: appropriateness of cluster option with xtreg, fe
Date   Sat, 23 Sep 2006 17:27:24 -0400

My thoughts on this: without clustering, Stata assumes that the
underlying statistical model has up to 100 * 25 = 2500 observations
with independent error terms. Clustering adjusts for correlation
between the error terms over time within a panel unit, so you have in
effect fewer independent observations and should expect your standard
errors to go up. This is nearly always the case; the example in the FAQ
you mentioned is more the exception (you need a strong negative
correlation between your error terms, and even then it is not
necessarily the case that the SEs go down). If you have reason to
believe that the error terms are not independent within a subgroup of
your observations (such as the different time periods for a specific
individual in a panel, or observations that are spatially close), you
should always cluster your SEs.
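
To put rough numbers on the "fewer independent observations" point, here
is a minimal Python sketch of the textbook Moulton approximation for
equal-sized clusters. It is only an illustration under an assumed
equicorrelated error structure, not what -xtreg- computes internally; the
function names and the correlation values are hypothetical and chosen
only for the example.

```python
import math


def moulton_factor(obs_per_cluster, rho_error, rho_regressor=1.0):
    """Approximate variance inflation from ignoring within-cluster correlation.

    Assumes equal cluster sizes and equicorrelated errors and regressors
    (a simplification; real panels, especially unbalanced ones, will differ).
    """
    return 1.0 + (obs_per_cluster - 1) * rho_error * rho_regressor


def se_ratio(obs_per_cluster, rho_error, rho_regressor=1.0):
    """Ratio of the clustered SE to the naive (independent-errors) SE."""
    factor = moulton_factor(obs_per_cluster, rho_error, rho_regressor)
    if factor <= 0:
        raise ValueError("correlations imply a non-positive variance factor")
    return math.sqrt(factor)


def effective_observations(n_clusters, obs_per_cluster, rho_error, rho_regressor=1.0):
    """Nominal observation count deflated by the variance factor."""
    factor = moulton_factor(obs_per_cluster, rho_error, rho_regressor)
    if factor <= 0:
        raise ValueError("correlations imply a non-positive variance factor")
    return n_clusters * obs_per_cluster / factor


if __name__ == "__main__":
    # Hypothetical within-unit error correlations for a 100 x 25 panel.
    for rho in (0.0, 0.125, 0.5):
        print(
            f"rho_error={rho:.3f}  "
            f"SE ratio={se_ratio(25, rho):.2f}  "
            f"effective obs={effective_observations(100, 25, rho):.0f}"
        )
```

Under these assumptions, with 25 periods per unit a within-unit error
correlation of only 0.125 is already enough to double the standard
errors, and the factor drops below 1 only when the product of the two
correlations is negative.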

regards, johannes

On 9/23/06, Jason Yackee <[email protected]> wrote:
Dear all,

I am trying to replicate someone else's findings.  I have unbalanced
panel data (N (units) = 100, t = 25).  The original analysis uses
-xtreg, fe- (fixed effects gls).  I can successfully replicate the
original results using -xtreg, fe-, and also when using the "robust"
standard error option: -xtreg, fe ro-.  But when I add a
"cluster(panel_id)" option, the key finding in the original analysis
falls into insignificance: -xtreg dv iv, fe ro cluster(panel_id)-.
Standard errors are about double for most variables when using the
cluster(panel_id) option compared to using just the -fe- and -ro-
options; coefficients are the same, as I would expect.

Is clustering, as a general matter, statistically appropriate to perform
with -xtreg, fe-?  (I assume it is, because Stata allows it, and Stata is
smart.)  And if that assumption is correct, is there a good method for
determining whether clustering is warranted/justified in my particular
case?

Thoughts appreciated.  I wouldn't worry about this if clustering versus
non-clustering didn't make a key result disappear.  Also, I am aware
of Sribney's FAQ on clustering, but he doesn't quite address my
question.

Jason Webb Yackee, Ph.D. Candidate; J.D.
Fellow, Gould School of Law
University of Southern California
[email protected]
Cell: 919-358-3040


Johannes F. Schmieder
Ph.D. Student
Department of Economics
Columbia University
email: [email protected]
cell: (+1) 631 903 5646
