On 2/13/07, Richard Williams <Richard.A.Williams.5@nd.edu> wrote:
At 11:42 AM 2/13/2007, German Rodriguez wrote:
>David Freedman has a provocative answer in "On the so-called 'Huber sandwich
>estimator' and 'robust standard errors'", The American Statistician, Vol 60
>(4), 299-302, November 2006. Here is the abstract:
That is a provocative paper indeed, but there are better papers on the
topic floating around too. He raises the important point that if the
point estimates are biased away from the ``correct'' model, then the
whole model may be irrelevant. So his bottom line seems to lean
towards dismissing the sandwich standard errors (I don't like the term
"robust" for a number of reasons -- see below) on the grounds that
they distract people from doing better model diagnostics.
I personally think the value of the sandwich standard errors, and the
whole approach behind them, is different. Three points here.
1. They provide inference when your estimator is not defined through
maximization. Examples are instrumental variables methods (very
widespread in econometrics, but restricted to a couple of Ken Bollen's
papers in the social sciences) and two-stage estimation (see the
discussion a couple of weeks ago of the Murphy-Topel estimator, and
the accompanying sandwich estimator derived by James Hardin).
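Not from the original post, but a minimal numpy sketch of point 1 under an assumed toy design: a just-identified IV estimator, which solves an estimating equation rather than maximizing anything, with a sandwich variance built around the Z'X bread.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical toy design: instrument z, endogenous regressor x.
z = rng.normal(size=n)
u = rng.normal(size=n)                        # structural error
x = 0.8 * z + 0.5 * u + rng.normal(size=n)    # x correlated with u -> OLS biased
y = 1.0 + 2.0 * x + u

Z = np.column_stack([np.ones(n), z])          # instruments (incl. constant)
X = np.column_stack([np.ones(n), x])          # regressors

# Just-identified IV: beta = (Z'X)^{-1} Z'y -- not a maximizer of any objective.
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
e = y - X @ beta_iv

# Sandwich: bread = (Z'X)^{-1}, meat = Z' diag(e^2) Z.
bread = np.linalg.inv(Z.T @ X)
meat = Z.T @ (Z * (e**2)[:, None])
V = bread @ meat @ bread.T
se = np.sqrt(np.diag(V))
```

The point of the sketch is that the sandwich recipe needs only the estimating equations and the residuals, not a likelihood.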
2. They work for correcting violations of the next order: if the
mean model is specified correctly but the variance model is not, you
can rectify your inference with the sandwich standard errors.
(Richard, you would also find some analogies in structural equation
models: if the model is specified correctly but the data have excess
kurtosis, which is the next-order moment condition violation, you can
apply the Satorra-Bentler correction.)
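A minimal numpy sketch of point 2, on made-up data: the mean model is correct but the error variance grows with x, so the classical s^2(X'X)^{-1} standard errors are wrong while the sandwich fixes them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 10, n)
# Mean model correct, variance model wrong: error sd increases with x.
y = 1.0 + 0.5 * x + rng.normal(size=n) * (0.5 + 0.3 * x)

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
k = X.shape[1]

# Classical (homoskedastic) standard errors: s^2 (X'X)^{-1}.
se_classical = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - k))

# Sandwich (HC0) standard errors: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}.
meat = X.T @ (X * (e**2)[:, None])
se_sandwich = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

With variance increasing in x, the sandwich slope standard error comes out larger than the classical one, which understates the uncertainty here.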
3. There are situations (clustered sampling is certainly one) where
the sandwich is the only valid way to construct standard errors.
Survey sampling is quite a bit outside the paradigm under which all
of that sandwich stuff was originally derived (Huber assumed i.i.d.
data to begin with), but it is not such a rare thing in applied work
-- I believe most large surveys feature the classical trio of complex
design features: stratification + clustering + varying probabilities
of selection.
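A minimal numpy sketch of point 3, on a made-up clustered design: a cluster-level random effect induces within-cluster correlation, and the cluster sandwich sums the score over clusters instead of observations.

```python
import numpy as np

rng = np.random.default_rng(1)
G, m = 50, 20                          # 50 clusters of 20 observations
n = G * m
cluster = np.repeat(np.arange(G), m)
x = rng.normal(size=n)
a = rng.normal(size=G)                 # cluster effect -> within-cluster correlation
y = 1.0 + 0.5 * x + a[cluster] + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# Naive i.i.d. standard errors ignore the clustering.
se_iid = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - 2))

# Cluster sandwich: meat = sum_g (X_g' e_g)(X_g' e_g)'.
meat = np.zeros((2, 2))
for g in range(G):
    idx = cluster == g
    s = X[idx].T @ e[idx]
    meat += np.outer(s, s)
se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

The intercept's cluster-robust standard error is several times the i.i.d. one here, which is the usual design-effect story.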
So, I think his cautions and concerns are pretty valid; robust
standard errors are not a panacea for a model that is "seriously in
error". But it still seems to leave open the question of whether
always using robust standard errors would be a good idea if your
model is "nearly correct" or heteroskedasticity is an issue.
Well, you can look up econometric practice -- pretty much everybody
reports the sandwich standard errors these days. I am personally
rather supportive of this.
Several additional issues are worth mentioning, however.
1. The finite-sample performance might be disappointing. The usual
sandwich estimator in the linear regression model, although consistent
under heteroskedasticity, is biased downwards even under
homoskedasticity; and its variance depends on the design/structure of
the regressors, unlike that of s^2 (X'X)^{-1}. (See references below.)
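A small Monte Carlo sketch (mine, not from the post) of that downward bias: with homoskedastic errors and n = 20, the average HC0 sandwich standard error falls short of the average classical one, because e_i^2 has expectation sigma^2 (1 - h_ii) and HC0 applies no degrees-of-freedom correction.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 20, 2000
x = rng.uniform(0, 1, n)               # small fixed design across replications
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

hc0, classical = [], []
for _ in range(reps):
    y = 1.0 + 2.0 * x + rng.normal(size=n)   # homoskedastic errors, sigma = 1
    beta = XtX_inv @ (X.T @ y)
    e = y - X @ beta
    # Classical slope SE with the usual n-k correction (unbiased for the variance).
    classical.append(np.sqrt(XtX_inv[1, 1] * (e @ e) / (n - 2)))
    # HC0 sandwich slope SE, no small-sample correction.
    meat = X.T @ (X * (e**2)[:, None])
    V = XtX_inv @ meat @ XtX_inv
    hc0.append(np.sqrt(V[1, 1]))
```

This is exactly why the HC1/HC2/HC3 small-sample corrections exist.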
2. The term ``robust'' is somewhat misleading, and this is one of the
very few places in Stata where the choice of term is not so great.
The formal definition of robustness requires bounded influence
functions (which are essentially the estimating equations): for OLS,
those are unbounded in the residuals, while for median regression
they are bounded, so median regression is truly robust while OLS is
not. The sandwich has terms like e^2 that may blow up to infinity in
some unhappy situations, so... no robustness there.
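The bounded/unbounded influence distinction is easiest to see in the simplest location problem (my toy illustration, not from the post): one wild observation drags the mean (unbounded influence, like OLS) arbitrarily far, while the median (bounded influence, like median regression) barely moves.

```python
import numpy as np

clean = np.arange(1.0, 21.0)                 # 1, 2, ..., 20
contaminated = np.append(clean[:-1], 1e6)    # replace one point with a wild outlier

# Influence of the single contaminated observation on each estimator:
mean_shift = abs(np.mean(contaminated) - np.mean(clean))
median_shift = abs(np.median(contaminated) - np.median(clean))
```

The mean shifts by tens of thousands; the median shift is essentially zero. That is what "bounded influence" buys you, and what the sandwich, with its e^2 terms, does not.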
3. The concerns about the bias of the basic model are very true,
although there is an abuse of coefficient interpretation there. If
you look back at Freedman's article, he substitutes the linear term
he is concerned with from his quadratic model into his linear model.
The parameter that the sandwich standard errors have in mind is the
one that minimizes the objective function for the population; White
(1982), whom Freedman did not care to mention, demonstrated that
these are the parameter values that minimize the Kullback-Leibler
distance between the true distribution and the one implied by the
parametric model.
Let's step back from Freedman's logistic model to OLS: there is
always a line of best linear fit for a well-defined population (say,
X uniform over [0,10], errors normal, trend quadratic). True, the
residuals will have a pattern around that line that you don't want
to see; but the sandwich standard errors still provide valid
inference for the coefficients of that best population line. What
Freedman is discussing is different, and of course he won't get the
same answer -- his complaint is essentially that the linear
coefficient from the quadratic model is not the same as the slope of
the best-fitting population straight line, which of course is correct.
Hope I made this last point clear. People in population sampling would
know what I mean there :)).
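A numerical check of this last point (my sketch, with assumed parameter values): for X ~ U[0,10] with trend a + b x + c x^2, the best linear projection has slope b + c Cov(x, x^2)/Var(x) = b + 10c, which differs from the linear coefficient b of the quadratic model. OLS with sandwich standard errors targets the former.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.uniform(0, 10, n)
b, c = 1.0, 0.2                       # assumed linear and quadratic coefficients
y = 2.0 + b * x + c * x**2 + rng.normal(size=n)

# Population best linear slope for X ~ U[0,10]:
# Cov(x, x^2)/Var(x) = (250/3) / (25/3) = 10, so slope* = b + 10 c.
target = b + 10 * c

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta                      # residuals show a clear quadratic pattern

# Sandwich SEs are still valid for the best-population-line coefficients.
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (e**2)[:, None])
se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

The fitted slope lands on b + 10c = 3, not on b = 1 -- Freedman's complaint in one line, and the sandwich inference is about the former.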
I gave a talk about a week ago locally in our econometrics reading
group; see the slides off my webpage below. I can walk you through
each and every slide if you want me to :)). They also include about 4
pages of references that I would highly recommend checking if you
want to figure this out :))
--
Stas Kolenikov
http://stas.kolenikov.name
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/