Ronggui <[email protected]> wrote about problems getting -xtreg , fe-
to replicate the cluster-robust standard errors discussed in Econometric
Analysis of Cross Section and Panel Data (Wooldridge 2002, pp. 274-276).
The short answer is that while Wooldridge discusses and reports cluster-robust
standard errors, Ronggui asked -xtreg ,fe- to report robust standard errors.
Clustering on the panel variable produces results that are asymptotically
equivalent to those reported by Wooldridge.
The remainder of this response discusses the difference between robust and
cluster-robust standard errors and shows how to undo the small-sample
adjustment and exactly reproduce the Wooldridge results.
In a regression, robust standard errors are consistent when the errors are
not identically distributed. The most common example of errors that are not
identically distributed is conditional heteroskedasticity. In addition to handling
not-identically-distributed errors, cluster-robust standard errors are
consistent when the errors are correlated within the clusters. A common
example of the latter case is serially correlated errors in a panel-data
regression.
In Stata, the -robust- option produces robust standard errors and the
-cluster(varname)- option produces cluster-robust standard errors.
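As a minimal sketch of the syntax only, the two options can be compared on a
pooled OLS fit of the same data. Pooled -regress- is used here purely for
illustration; it is not the fixed-effects model that Wooldridge estimates, so
the numbers will not match his table.
. use http://www.stata.com/data/jwooldridge/eacsap/jtrain1.dta
. regress lscrap d88 d89 grant grant_1, robust
. regress lscrap d88 d89 grant grant_1, cluster(fcode)
The coefficients are identical across the two fits; only the reported
standard errors change.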
On page 275 Wooldridge discusses the cluster-robust standard errors
originally suggested by Arellano (1987). However, as is common in
econometrics, Wooldridge refers to these cluster-robust standard errors
simply as "robust" standard errors.
In an attempt to replicate the cluster-robust standard errors on page 276, Ronggui
<[email protected]> specified the -robust- option to -xtreg , fe-. Of
course, the -robust- standard errors are not the same as the cluster-robust
standard errors. To match the results in Wooldridge, Ronggui needs to
cluster on the panel id variable, which I do in the output below.
. use http://www.stata.com/data/jwooldridge/eacsap/jtrain1.dta
. xtreg lscrap d88 d89 grant grant_1 , fe i(fcode ) robust cluster(fcode )

Fixed-effects (within) regression               Number of obs      =       162
Group variable (i): fcode                       Number of groups   =        54

R-sq:  within  = 0.2010                         Obs per group: min =         3
       between = 0.0079                                        avg =       3.0
       overall = 0.0068                                        max =         3

                                                F(4,158)           =      7.07
corr(u_i, Xb)  = -0.0714                        Prob > F           =    0.0000

                                 (Std. Err. adjusted for 54 clusters in fcode)
------------------------------------------------------------------------------
             |               Robust
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d88 |  -.0802157   .0978408    -0.82   0.416    -.2764594    .1160281
         d89 |  -.2472028   .1967819    -1.26   0.215    -.6418973    .1474917
       grant |  -.2523149   .1434399    -1.76   0.084    -.5400188     .035389
     grant_1 |  -.4215895   .2824604    -1.49   0.141    -.9881333    .1449543
       _cons |   .5974341   .0638746     9.35   0.000     .4693177    .7255504
-------------+----------------------------------------------------------------
     sigma_u |   1.438982
     sigma_e |  .49774421
         rho |  .89313867   (fraction of variance due to u_i)
------------------------------------------------------------------------------

While the standard errors reported above are asymptotically equivalent to
the ones reported by Wooldridge and those that Ronggui obtained by hand,
they still differ. The difference is due to the finite-sample adjustment
used by Stata and recommended by the complex survey design literature.
(See page 54 R[R-Z] for a discussion of the finite-sample adjustment
q = [(N-1)/(N-k)] * [M/(M-1)], where N is the number of observations, k is the
number of estimated coefficients, and M is the number of clusters.) The
variance estimate is multiplied by q, so each standard error is scaled by
sqrt(q), and dividing a reported standard error by sqrt(q) undoes the
adjustment.
The following output illustrates the calculations for the example at hand.
. mat V = e(V)
. local k = rowsof(V)
. local N = e(N)
. local M = e(N_clust)
. scalar q = (`N'-1)/(`N'-`k') * `M'/(`M'-1)
. di sqrt(q)
1.0221675
. di _se[d88]/sqrt(q)
.09571894
Note that the last of these calculations matches Ronggui's hand calculation
exactly. The asymptotic equivalence follows from the fact that q --> 1 as
the number of panels gets large.
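As a rough sketch, the same division can be applied to the other three
coefficients for comparison with the Robust Std.Err column in Ronggui's
table. This assumes the -xtreg- results and the scalar q from the session
above are still in memory.
. di _se[d89]/sqrt(q)
. di _se[grant]/sqrt(q)
. di _se[grant_1]/sqrt(q)
To see q approach 1, sqrt(q) can be recomputed for larger numbers of panels
M, holding fixed 3 observations per panel (so N = 3*M) and k = 5 as in this
example. The panel counts 500 and 5000 are hypothetical values chosen only
for illustration.
. di sqrt((162-1)/(162-5) * 54/(54-1))
. di sqrt((1500-1)/(1500-5) * 500/(500-1))
. di sqrt((15000-1)/(15000-5) * 5000/(5000-1))
The first line reproduces the sqrt(q) displayed above, and each later value
is closer to 1.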
-David
[email protected]

Ronggui's hand calculations

                Coef   Robust Std.Err     Std.Err      T-value      P-value
d88      -0.08021568       0.09571894   0.1094751   -0.7327297   0.46537157
d89      -0.24720280       0.19251435   0.1332183   -1.8556221   0.06633915
grant    -0.25231487       0.14032911   0.1506290   -1.6750751   0.09692392
grant.1  -0.42158951       0.27633474   0.2102000   -2.0056593   0.04748974

References
----------
Arellano, M. 1987. "Computing Robust Standard Errors for Within-Group
Estimators," Oxford Bulletin of Economics and Statistics 49:431-434.
Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel
Data. Cambridge, MA: MIT Press.