Statalist The Stata Listserver



RE: st: It seems streg's robust STD.ERROR is not correct


From   "David M. Drukker, StataCorp" <[email protected]>
To   [email protected]
Subject   RE: st: It seems streg's robust STD.ERROR is not correct
Date   Mon, 13 Feb 2006 14:56:19 -0600

Ronggui <[email protected]> wrote about problems getting -xtreg , fe-
to replicate the cluster-robust standard errors discussed in Econometric
Analysis of Cross Section and Panel Data (Wooldridge 2002, pp. 274-276).

The short answer is that while Wooldridge discusses and reports cluster-robust
standard errors, Ronggui asked -xtreg ,fe- to report robust standard errors.
Clustering on the panel variable produces results that are asymptotically
equivalent to those reported by Wooldridge.

The remainder of this response discusses the difference between robust and
cluster-robust standard errors and shows how to undo the small-sample
adjustment and exactly reproduce the Wooldridge results.

In a regression, robust standard errors are consistent when the errors are
not identically distributed.  The most common example of not identically
distributed is conditional heteroskedasticity.  In addition to handling
not-identically-distributed errors, cluster-robust standard errors are
consistent when the errors are correlated within the clusters.  A common
example of the latter case is serially correlated errors in a panel-data
regression.

In Stata, the -robust- option produces robust standard errors and the
-cluster(varname)- option produces cluster-robust standard errors.

On page 275 Wooldridge discusses the cluster-robust standard errors
originally suggested by Arellano (1987).  However, as is common in
econometrics, Wooldridge refers to these cluster-robust standard errors
simply as "robust" standard errors.

In an attempt to replicate the cluster-robust standard errors on page 276,
Ronggui specified the -robust- option to -xtreg , fe-.  The -robust-
standard errors allow for heteroskedasticity but not for within-panel
correlation, so they are not the same as the cluster-robust standard errors.
To match the results in Wooldridge, Ronggui needs to cluster on the panel id
variable, which I do in the output below.

. use http://www.stata.com/data/jwooldridge/eacsap/jtrain1.dta

. xtreg lscrap d88 d89 grant grant_1 , fe i(fcode ) robust cluster(fcode )

Fixed-effects (within) regression               Number of obs      =       162
Group variable (i): fcode                       Number of groups   =        54

R-sq:  within  = 0.2010                         Obs per group: min =         3
       between = 0.0079                                        avg =       3.0
       overall = 0.0068                                        max =         3

                                                F(4,158)           =      7.07
corr(u_i, Xb)  = -0.0714                        Prob > F           =    0.0000

                                 (Std. Err. adjusted for 54 clusters in fcode)
------------------------------------------------------------------------------
             |               Robust
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         d88 |  -.0802157   .0978408    -0.82   0.416    -.2764594    .1160281
         d89 |  -.2472028   .1967819    -1.26   0.215    -.6418973    .1474917
       grant |  -.2523149   .1434399    -1.76   0.084    -.5400188     .035389
     grant_1 |  -.4215895   .2824604    -1.49   0.141    -.9881333    .1449543
       _cons |   .5974341   .0638746     9.35   0.000     .4693177    .7255504
-------------+----------------------------------------------------------------
     sigma_u |   1.438982
     sigma_e |  .49774421
         rho |  .89313867   (fraction of variance due to u_i)
------------------------------------------------------------------------------

While the standard errors reported above are asymptotically equivalent to
the ones reported by Wooldridge and those that Ronggui obtained by hand,
they still differ.  The difference is due to the finite-sample adjustment
used by Stata and recommended by the complex survey design literature.
(See page 54 R[R-Z] for a discussion of the finite-sample adjustment
q = [(N-1)/(N-k)] * [M/(M-1)], where N is the number of observations, k is
the number of estimated coefficients, and M is the number of clusters.)
Stata multiplies the unadjusted variance estimate by q, so the reported
standard errors are sqrt(q) times the unadjusted ones.
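
As a rough check by hand, take N = 162 and M = 54 from the output above and
assume k = 5 (the four slope coefficients plus the constant):

    q       = (162-1)/(162-5) * 54/(54-1)
            = (161/157) * (54/53)
            ~ 1.0448
    sqrt(q) ~ 1.0222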

The following output illustrates the calculations for the example at hand.

. mat V = e(V)

. local k = rowsof(V)

. local N = e(N)

. local M = e(N_clust)

. scalar q = (`N'-1)/(`N'-`k') * `M'/(`M'-1)

. di sqrt(q)
1.0221675

. di _se[d88]/sqrt(q)
.09571894

Note that the last of these calculations matches Ronggui's hand calculation
exactly.  The asymptotic equivalence follows from the fact that q --> 1 as
the number of panels gets large.
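
The same adjustment can be undone for every coefficient at once.  The lines
below are only a sketch: they assume they are run immediately after the
-xtreg , fe cluster(fcode)- command above, and they recompute V and q so that
they stand alone.  The local macro name v is my own choice for illustration.

    * sketch: assumes the -xtreg , fe cluster(fcode)- results are still in e()
    matrix V = e(V)
    * q = (N-1)/(N-k) * M/(M-1), with k taken as the dimension of e(V)
    scalar q = (e(N)-1)/(e(N)-rowsof(V)) * e(N_clust)/(e(N_clust)-1)
    * divide each reported standard error by sqrt(q) to remove the adjustment
    foreach v in d88 d89 grant grant_1 {
        display "`v'" _col(12) %10.8f _se[`v']/sqrt(q)
    }

The displayed values should agree, up to rounding, with the "Robust Std.Err"
column in Ronggui's hand calculations below.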


     -David
      [email protected]


Ronggui's hand calculations      

               Coef Robust Std.Err   Std.Err    T-value    P-value
d88     -0.08021568     0.09571894 0.1094751 -0.7327297 0.46537157
d89     -0.24720280     0.19251435 0.1332183 -1.8556221 0.06633915
grant   -0.25231487     0.14032911 0.1506290 -1.6750751 0.09692392
grant.1 -0.42158951     0.27633474 0.2102000 -2.0056593 0.04748974

References
----------

Arellano, M. 1987. "Computing Robust Standard Errors for Within-Group
Estimators,"  Oxford Bulletin of Economics and Statistics 49:431-434.

Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel
Data.  Cambridge, MA: MIT Press.




