# Re: st: hausman and xthausman after panel fe, re - DROPPED MEAN/DIFF

- From: Joana Quina
- To: statalist@hsphsun2.harvard.edu
- Subject: Re: st: hausman and xthausman after panel fe, re - DROPPED MEAN/DIFF
- Date: Thu, 25 Aug 2005 09:14:00 +0100

```
Dear Vince,

I used your code for the augmented Hausman test, but whenever I
include time-invariant variables (or time dummies interacted with
time-invariant variables), it does not work - Stata drops the mean
for the time-invariant variable (or the diff for the interacted
terms). I notice from Eric's varlist that there seem to be
time-invariant variables (latin/ssa) - is that correct? Any help would
be much appreciated.

Thanks,
Joana

On 23/08/05, Vince Wiggins, StataCorp <vwiggins@stata.com> wrote:
> Carl Nelson <chnelson3@insightbb.com> asks why he gets different results from
> the -hausman- command and the deprecated -xthausman- command.
>
> > This question concerns problem 10.9 in Jeff Wooldridge's book
> > Econometric Analysis of Cross Section and Panel Data. In this
> > exercise, which I gave to some students in a course this summer,
> > using Cornwell.dat students are asked to estimate xtreg, fe and
> > xtreg, re and perform the hausman test.  Using the old xthausman
> > syntax the result is a significant test statistic (approximately 121
> > for a chisquared(11) rv). Using the newer hausman syntax the result
> > is a negative chisquared statistic and warning about violation of
> > assumptions.  I constructed the statistic from the saved results
> > e(b) and e(V) and I got the same result as the newer hausman syntax.
> > [...]
>
> It is rare that -hausman- and -xthausman- produce different statistics, but I
> recommend that Carl believe the results from -hausman- and not -xthausman-.
> The main reason -xthausman- was undocumented (and now works only under version
> control) was that it could be fooled by non positive definite (PD)
> differenced covariance matrices or by variables with degenerate panel
> behavior.
>
> I posted a rather lengthy discussion of the issues back in March of 2002.
> This post predates some of the statalist archives, so at the risk of being
> long-winded yet again, let me quote from that posting.
>
> ---------------------------------- Begin excerpts --------------------------
>
> Eric Neumayer <E.Neumayer@lse.ac.uk> asks why he is getting different results
> from -xthaus- and -hausman- when testing for fixed vs. random effects after
> estimation with -xtreg-. [...]
>
> I believe there are open questions about Hausman tests in situations like
> Eric's, see the explanation that follows.
>
>
> Preliminaries
> -------------
>
> It is hard to discuss the Hausman test without being specific about how the
> test is performed.  Let B be the parameter estimates from a fully efficient
> estimator (random-effects regression in this case) and b be the estimates from
> a less efficient estimator (fixed-effects regression), but one that is
> consistent in the face of one or more violated assumptions, in this case that
> the effects are correlated with one or more of the regressors.  If the
> assumption is violated then we expect that the estimates from the two
> estimators will not be the same, b~=B.
>
> The Hausman test is essentially a Wald test that (b-B)==0 for all coefficients
> where the covariance matrix for b-B is taken as the difference of the
> covariance matrices (VCEs) for b and B.  What is amazing about the test is
> that we can just subtract these two covariance matrices to get an estimate of
> the covariance matrix of (b-B) without even considering that the VCEs of the
> two estimators might be correlated -- they are after all estimated on the same
> data.  We can just subtract, but only because the VCE of the fully
> efficient estimator is uncorrelated with the VCEs of all other estimators (see
> Hausman and Taylor (1981), "Panel data and unobservable individual effects",
> Econometrica, 49, 1377-1398).  The VCE of the efficient estimator will also be
> smaller than the less efficient estimator.  Taken together, these results
> imply that the subtraction of the two VCE (V_b-V_B) will be positive definite
> (PD) and that we need not consider the covariance between the two VCEs.
>
> These results, however, hold only asymptotically.  For any given finite sample
> we have no reason to believe that (V_b-V_B) will be PD.  So, it is amazing
> that we can just subtract these two matrices, but the price we pay is that we
> can only do so safely if we have an infinite amount of data.  The Hausman
> test, unlike most tests, relies on asymptotic arguments not only for its
> distribution, but for its ability to be computed!  Let's discuss what we do
> when (V_b-V_B) is not PD in the context of Eric's results.
>
> Aside:  If anyone is interested in a Hausman-like test that drops the
> assumption that either estimator is fully efficient, actually estimates the
> covariance between the VCEs, and can always be computed, see Weesie (2000)
> "Seemingly unrelated est. and cluster-adjusted sandwich estimator", STB
> Reprints Vol 9, pp 231-248.  The test unfortunately requires the scores from
> the estimator, and -xtreg, fe- does not directly produce these.
>
> <Note, a version of the -suest- command is now official, but is still unavailable
>  after -xtreg->
>
>
> Of Inverses and Hausman Statistics
> ----------------------------------
>
> The reason that -xthaus- and -hausman- produce different statistics on Eric's
> models is that they take different inverses of this non-PD matrix.  -xthaus-
> uses Stata's -syminv()- which zeros out columns and rows to form a sub-matrix
> that is PD and inverts that matrix, whereas -hausman- uses a Moore-Penrose
> generalized inverse.  Most of the literature on Hausman tests suggests that a
> generalized inverse such as Moore-Penrose be used when the matrix is not PD;
> however, I have not seen a foundation for this suggestion (and would
> appreciate a reference if anyone knows of one).
>
> Two of us at Stata have independently run some informal simulations, where
> non-PD matrices are common, to determine if either of these inverses has
> nominal coverage for a true null.  While these simulations are not complete
> enough to share or publish, we both found that neither inverse performs well.
> This doesn't seem too surprising to me: if the information in our sample is
> insufficient to produce a PD "VCE" then the basis of the test would seem to be
> in question.
>
> -xthaus- does not make it clear when the matrix is not PD.  I recall having
> read, though I cannot now find the reference, that in the case of random vs.
> fixed effects the matrix was always PD.  This may have been the
> thinking in excluding this check from -xthausman-.  Regardless, it is clearly
> not impossible and is not even unlikely.  Simulations show that non-PD
> matrices are quite common.
>
>
> An Alternative
> --------------
>
> Even in their early work, Hausman and Taylor (1981) discuss an asymptotically
> equivalent test for random vs. fixed effects using an augmented regression.
> There are actually several forms of the augmented regression, all of which are
> asymptotically equivalent to the Hausman test.  All of these augmented
> regression tests are based on estimating an augmented regression that nests
> both the random- and fixed-effects models.  They are parameterized in such a
> way that we can perform a simple Wald test of a set of the jointly estimated
> coefficients.  They have fewer of the mechanical and interpretation problems
> associated with the Hausman test.  Their results will differ numerically from
> the Hausman test in finite samples because they are only asymptotically
> equivalent.
>
> I have included below a block of code that will perform an augmented regression
> test for Eric's model (it also performs the Hausman test using -xthaus- and
> -hausman-).  It can easily be adapted to any model by changing the depvar and
> varlist macros.
>
> If I have given the impression that I don't much care for the Hausman test,
> good.  I don't.  In ad hoc simulations I have found that in addition to its
> proclivity to be uncomputable, the test has low power for the current problem,
> for tests of endogeneity in instrumental variables regression, and for tests
> of independence of irrelevant alternatives (IIA) in choice models.
>
> Regardless, the test is a staple in econometrics and it will stay in Stata.
>
>
> <Note:  Carl should be able to easily adapt this code by specifying the id
>  variable, dependent variable, and varlist.>
>
> ---------------------------------- BEGIN --- foreric.do --- CUT HERE -------
> local id myid
> local depvar lnuncs
> local varlist lngdp ecrise ecfall urban lnhouse femalepa male1544        /*
>        */ lndiscr lnfree lnpts latin ssa deathp rulelaw protest cathol  /*
>        */ muslim transiti lnethv oecd war year89 year92 year95
>
> xtreg `depvar' `varlist', re
> hausman, save
> version 7: xthausman
>
> xtreg `depvar' `varlist', fe
> hausman, less
>
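> tokenize `varlist'
> sort `id'                               /* -by- requires sorted data */
> local i 1
> while "``i''" != "" {
>        /* panel mean of variable i, then its deviation from that mean */
>        qui by `id':  gen double mean`i' = sum(``i'') / _n
>        qui by `id':  replace mean`i' = mean`i'[_N]
>        qui by `id':  gen double diff`i' = ``i'' - mean`i'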
>        local newlist `newlist' mean`i' diff`i'
>
>        local i = `i' + 1
> }
>
> xtreg `depvar' `newlist' , re
> tempname b
> matrix `b' = e(b)
>
> qui test mean1 = diff1 , notest         /* clear test */
> local i 2
> while "``i''" != "" {
>        if `b'[1,colnumb(`b', "mean`i'")] != 0 &        /*
>        */ `b'[1,colnumb(`b', "diff`i'")]  != 0 {
>                qui test mean`i' = diff`i' , accum notest
>        }
>        local i = `i' + 1
> }
> test
>
> ----------------------------------   END --- foreric.do --- CUT HERE -------
>
> ---------------------------------- End   excerpts --------------------------
>
> As noted in the excerpt, when -xthausman- was written we were swayed by
> published "proofs" that the difference matrix was required mathematically to
> be positive definite when comparing FE and RE linear regression.  As Eric's
> and Carl's examples show, this is not true.  I would like to thank Mark
> Schaffer <M.E.Schaffer@hw.ac.uk> for reminding me of one of the "proofs",
>
>
>        "This appendix proves that the Avar(q_hat) in (5.2.21) is
>        positive definite and the Hausman statistic (5.2.22) is
>        guaranteed to be nonnegative in any finite samples."
>
>        Hayashi, Econometrics (2000), Appendix 5.A, pp. 346-349 and 334-335.
>
> To avoid breaking users' do-files, we were reluctant to remove -xthausman-
> when -hausman- was first introduced.  Sufficient time has passed, and as of
> version 9 of Stata, -xthausman- works only when your version is set to 8 or
> lower.
>
>
> -- Vince
>   vwiggins@stata.com
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
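On the dropped mean/diff terms: a time-invariant variable has a within-panel deviation that is identically zero, so its mean and diff columns cannot both be estimated and no mean = diff restriction exists for it. The following is a minimal sketch of one way to adapt the augmented regression, not code from the thread. It assumes the time-varying and time-invariant regressors are listed separately, and every name in it (`id`, `y`, `x1`, `x2`, `z1`, and the `m_` prefix) is hypothetical. It also uses the levels-plus-means form of the regression rather than the mean/diff form quoted above.

```stata
* Hypothetical names: replace with your own panel id, outcome, and regressors.
local id     id          // panel identifier (assumed)
local depvar y           // dependent variable (assumed)
local tv     x1 x2       // time-varying regressors (assumed)
local ti     z1          // time-invariant regressors (assumed)

* Panel means of the time-varying regressors only.
local means
foreach v of local tv {
    bysort `id': egen double m_`v' = mean(`v')
    local means `means' m_`v'
}

* Random-effects regression augmented with the panel means.
* Assumes the panel variable is already declared (e.g. with -xtset `id'-).
xtreg `depvar' `tv' `ti' `means', re

* Joint Wald test that the coefficients on the panel means are all zero.
testparm `means'
```

Because each time-varying regressor equals its diff plus its mean, a zero coefficient on `m_x1` here is the same restriction as mean = diff for that variable in the quoted do-file. The time-invariant regressors enter in levels only and take no part in the test.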