Statalist The Stata Listserver



st: re: panel re vs. fe model


From   Kit Baum <[email protected]>
To   [email protected]
Subject   st: re: panel re vs. fe model
Date   Thu, 18 Jan 2007 14:15:52 -0500

Jason said

My own understanding of clustering (on ID) is that it is NOT a
substitute for, or alternative to, including actual ID fixed effects.
From a recent article in Biometrics: "Clustering deals with
cluster-correlated data, which arises from intracluster correlation, or
the potential for clustermates to respond similarly. This phenomenon is
often referred to as overdispersion or extra variation in an estimated
statistic beyond what would be expected under independence. Analyses
that assume independence will generally underestimate the true variance
and lead to test statistics with inflated Type I errors."

In other words, where within-group correlations are high, we can expect
tests of statistical significance to be biased toward unjustifiably
rejecting the null hypothesis of no statistically significant
relationship. Clustering is easily extended to other kinds of analyses,
and it is perfectly compatible with the simultaneous inclusion of group
fixed effects or group dummy variables, and there is nothing
inappropriate about including in a given model country fixed effects
while at the same time controlling for within-country variance
correlation by clustering by country. Fixed effects are usually
employed solely to control for potential omitted-variable bias affecting
the estimated coefficients. Clustering addresses the entirely different
problem of within-group correlation of variance, and it "works", in most
cases, by adjusting standard errors upward. Clustering will not affect
coefficient estimates; including fixed effects nearly always will.
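The claim that clustering leaves the coefficients alone can be checked directly. The following is a minimal sketch, assuming Stata's Grunfeld example dataset and the stored-estimate names ols and ols_cl, none of which come from this thread. With only 10 firms the cluster-robust standard errors are illustrative rather than reliable.

```stata
* Illustration only: the grunfeld data and these variables are assumptions
webuse grunfeld, clear

* Pooled OLS with conventional standard errors
regress invest mvalue kstock
estimates store ols

* Same model, standard errors clustered on the panel identifier
regress invest mvalue kstock, vce(cluster company)
estimates store ols_cl

* Side by side: identical coefficients, different standard errors
estimates table ols ols_cl, b(%9.4f) se(%9.4f)
```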

I am not a statistician, however, so maybe I am missing something here.


I am not a statistician either, but I agree with Jason that these should not be confused. If you have panel data and run pooled OLS, you are asserting that the same intercept term is appropriate for every unit (the hypothesis tested by the F test at the foot of the xtreg, fe output). If that restriction is inappropriate, then you are estimating OLS with specification error, viz. leaving out the unit dummies. The coefficients that you do estimate from pooled OLS are biased and potentially inconsistent, just as they would be if you left other relevant regressors out. Ignoring heterogeneity in this context is dangerous.
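To make the comparison concrete, here is a rough sketch, again assuming the Grunfeld example data rather than anything from the original question. It fits the pooled model, the fixed-effects model, and the equivalent regression with explicit unit dummies.

```stata
* Illustration only: dataset and regressors are assumptions, not from the thread
webuse grunfeld, clear
xtset company year

* Pooled OLS: imposes one common intercept on all firms
regress invest mvalue kstock

* Fixed effects: the line "F test that all u_i=0" at the foot of the
* output tests the common-intercept restriction imposed above
xtreg invest mvalue kstock, fe

* Least-squares dummy variables: same slopes as xtreg, fe
regress invest mvalue kstock i.company
```

If the F test rejects, the pooled slopes differ from the fixed-effects slopes because the unit dummies were left out, which is the specification error described above.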

Let's say your xtreg, fe model is properly specified. Then you can question the i.i.d. error assumption implicit in this regression with dummy variables. Should you allow for robust standard errors? Should you allow for cluster-robust standard errors? Should you allow for serial correlation within units' errors? That is an issue of starting with a valid set of point estimates and trying to get the VCE right. We want both consistent point and interval estimates, and worrying about the VCE is not useful if the point estimates lack desirable properties.
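Those choices might look like the following in current Stata syntax. This is only a sketch on the Grunfeld example data (an assumption, not the poster's data), and with 10 firms the clustered results are for illustration only.

```stata
* Illustration only: dataset and regressors are assumptions, not from the thread
webuse grunfeld, clear
xtset company year

* Fixed effects with the default VCE, which assumes i.i.d. errors
xtreg invest mvalue kstock, fe

* Same point estimates, cluster-robust VCE: allows heteroskedasticity
* and arbitrary correlation of the errors within each firm
xtreg invest mvalue kstock, fe vce(cluster company)

* Alternative: model AR(1) errors within firms explicitly
* (a different estimator, so the coefficients change as well)
xtregar invest mvalue kstock, fe
```

The first two commands differ only in the VCE. xtregar models the serial correlation rather than just correcting the standard errors for it.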

Kit Baum, Boston College Economics
http://ideas.repec.org/e/pba1.html
An Introduction to Modern Econometrics Using Stata:
http://www.stata-press.com/books/imeus.html
