Statalist



Re: st: xtmixed with vce robust or cluster robust


From   Stas Kolenikov <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: xtmixed with vce robust or cluster robust
Date   Mon, 31 Aug 2009 13:27:51 -0500

On Mon, Aug 31, 2009 at 11:49 AM, Schaffer, Mark E<M.E.Schaffer@hw.ac.uk> wrote:
> If I'm estimating using -xtmixed- or -xtreg,mle-, it seems natural to me
> that I'd want to be covered in the usual "robust" way: the equation is
> misspecified enough to mess up the VCE but not enough to make the
> coefficient estimates inconsistent, and by using a robust or
> cluster-robust VCE I can fix the former.
>
> Can you explain what's odd about this in terms that an applied
> econometrician can understand?

Mark,

your intuition is right: you have an M-estimation problem, and there
is nothing in the general theory of these that precludes the sandwich
estimator from working in this instance. What I am GUESSING about the
mechanics of -_robust- (and you probably know it better than I do
after writing the -ivreg2- stuff) is that it might be complicated to
force -_robust- to think about cluster-level scores only, as it is
used to operating on observation-level scores. This is kind of a
wide-vs-long thing: the wide format with a single line per panel would
be exactly what -_robust- looks for, but that's not how Stata wants to
think about every other panel data estimation task -- those are way
easier done with long data. Maybe I am totally mistaken here; I never
tried to dig into the -_robust- ado-file, so maybe you could create a subsetting
variable -bysort cluster (id) : gen byte first = (_n==1)- and subset
the sample -if first- to force -_robust- to only use those first
observations.
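
A minimal sketch of the subsetting idea, written in Python rather than Stata and not based on the actual -_robust- source. It tags the first row of each cluster, puts the cluster total of the scores on that row, and keeps only the tagged rows. The data and the names `rows` and `tag_first_with_total` are hypothetical, made up for illustration.

```python
# Hypothetical long-format data, already sorted by cluster: (cluster, score).
rows = [(1, -2.0), (1, -1.0), (2, 1.0), (2, 3.0), (2, 2.0), (3, -3.0)]

def tag_first_with_total(rows):
    """Flag the first row of each cluster and attach the cluster's score total."""
    # First pass: total of the observation-level scores within each cluster.
    totals = {}
    for g, s in rows:
        totals[g] = totals.get(g, 0.0) + s
    # Second pass: mark the first row seen for each cluster.
    seen = set()
    out = []
    for g, s in rows:
        first = g not in seen
        seen.add(g)
        out.append({"cluster": g, "score": s, "first": first, "total": totals[g]})
    return out

tagged = tag_first_with_total(rows)

# Keep one row per cluster, which is what subsetting -if first- would do.
cluster_level = [(r["cluster"], r["total"]) for r in tagged if r["first"]]
```

The result has one line per cluster, i.e. the "wide" view of the scores, even though the input stays in long format.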

A quick -viewsource xtmixed.ado- shows that it is written as an -ml
model d0- estimator. The likelihood is evaluated for the model+data as
a whole, and numeric derivatives are taken by computing the likelihood
at several points and taking the required differences of those single
numbers. See the [ML] book (Stata Corp. might consider distributing
this as a part of documentation... maybe?). So no scores of any kind
are produced by -xtmixed- at all. I believe -xtreg- works by direct data
matrix manipulations, so it does not necessarily produce scores,
either.
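
To illustrate why a d0-style evaluator leaves no scores behind, here is a small Python sketch (not Stata's -ml- code). The toy model y ~ N(mu, 1), the data, and the function names are assumptions chosen for illustration. The numeric derivative is built from whole-sample likelihood values only, so it recovers the sum of the scores but never the individual terms.

```python
import math

# Hypothetical data for a toy model: y ~ N(mu, 1).
y = [1.2, 0.7, 2.9, 1.8, 2.4]

def loglik(mu):
    """Log likelihood of the whole sample, returned as a single number."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (yi - mu) ** 2 for yi in y)

def numeric_gradient(f, x, h=1e-5):
    """Central difference of a scalar function of one parameter."""
    return (f(x + h) - f(x - h)) / (2 * h)

mu0 = 1.0

# All the optimizer sees: one gradient for the sample as a whole.
g_numeric = numeric_gradient(loglik, mu0)

# Observation-level scores exist only if the likelihood is coded
# observation by observation; for this model the score is (y_i - mu).
scores = [yi - mu0 for yi in y]
g_analytic = sum(scores)
```

Here `g_numeric` agrees with `sum(scores)` up to rounding, but the list `scores` itself cannot be reconstructed from `loglik` alone, and the sandwich estimator needs that list (or its cluster totals).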

So the bottom line for the inner applied econometrician sitting inside
Mark is, "No theoretical obstacles; somebody just needs to sit down
and try to (1) get the appropriate scores/estimating equations out of
-xtreg- or -xtmixed-, and (2) code the sandwich estimator, using or
not using the official -_robust-". Given that Stata Corp. did not have
the resources to do this kind of coding when these commands were
released and updated, it may not be as easy as it sounds.

John mentions -gllamm-, where the -cluster- and -robust- options are
available despite the data being in the long format. -gllamm- is also
implemented as an -ml model d0- estimator. As far as I know, the
sandwich estimator is hard coded in -gllamm-; that is, Sophia R-H just
re-wrote all the -_robust- formulas... which might have been a
relatively small programming expense compared to numeric integration.
So she has done exactly what I said in the previous paragraph "may not
be as easy as it sounds".
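
As a rough idea of what hard coding a sandwich estimator involves, here is a Python sketch for the simplest possible case, the sample mean under a working N(mu, 1) likelihood. This is not -gllamm- code; the data and the function name `cluster_sandwich_mean` are hypothetical, and no small-sample correction is applied.

```python
from collections import defaultdict

# Hypothetical long-format data: (cluster, y).
data = [(1, 2.0), (1, 3.0), (2, 5.0), (2, 7.0), (2, 6.0), (3, 1.0)]

def cluster_sandwich_mean(data):
    """Cluster-robust variance of the sample mean, A^{-1} B A^{-1}."""
    n = len(data)
    mu = sum(y for _, y in data) / n
    # Sum the observation-level scores (y - mu) within each cluster.
    s = defaultdict(float)
    for g, y in data:
        s[g] += y - mu
    # "Bread": minus the second derivative of the log likelihood, here just n.
    A = float(n)
    # "Meat": sum over clusters of squared cluster-level scores.
    B = sum(v * v for v in s.values())
    return mu, B / (A * A)

mu_hat, v_cluster = cluster_sandwich_mean(data)
```

For a real mixed model the hard part is getting the cluster-level scores in the first place; once they are available, the remaining algebra is as short as above.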

Now, the meaning of -robust- standard errors after -xtmixed- might be
somewhat of a mystery. With -regress-, the -robust- option is
correcting for heteroskedasticity: you believe you modeled the first
moments right, but are not sure about the higher order moments (the
second moments, in this case). That's what Mark said: the model is
bad, but not so bad as to kill the point estimates. If you have
heteroskedasticity, your -xtmixed- model is likely wrong in its
variance part, and the variance parameters may not necessarily
correspond to well-defined population parameters. If so, what does
inference on these point estimates tell you? Sandwich standard errors
might have a role if you have correctly modeled the first two moments
in your -xtmixed-, but are unsure about the higher moments, which I
believe to be a relatively peculiar situation for mixed models
(although pretty common in the SEM world with Satorra-Bentler
corrections).

The same interpretation comment applies to -gllamm-; frankly I don't
think I've ever tried to run it with the -robust- option, so I never
had to bother explaining the meaning of sandwich standard errors for
-gllamm- :)).

I am just thinking aloud here; you are welcome to join me if you
like, but I cannot put my finger on anything other than Huber's (1967)
article (http://www.citeulike.org/user/ctacmo/article/553268).

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.