Stata The Stata listserver

st: Re: Degree of freedom in Hausman test and more

From   Mark Schaffer <>
To   Zhehui Luo <>
Subject   st: Re: Degree of freedom in Hausman test and more
Date   Wed, 11 Dec 2002 23:19:28 +0000 (GMT)


Quoting Zhehui Luo <>:

> Dear Listers,
> Question 1:
> I performed the following steps on two data sets with the same variables
> in each case:
> . ivreg y1 x1 (y2=x2)
> . hausman, save
> . reg y1 y2 x1
> . hausman, sigmamore
> Stata gave me different degrees of freedom for each test. In one case the
> dof = number of endogenous variables (y2); in the other case the
> dof = number of regressors (endogenous and exogenous, y2 and x1).
> Can anyone tell me what went wrong? I thought that with the sigmamore
> option the Hausman test would give dof = dim(y2).

It's hard to tell without seeing the two examples.  Can you post these so 
we can have a look?

> Question 2:
> My understanding of the above test is that it is comparing all regressors 
> (y2 and x1) between OLS and IV, even though the dof of the test is the 
> dimension of the instrumented variables.

I don't think that's quite right.  If you look at Bowden and Turkington, 
Instrumental Variables (Cambridge University Press, 1984), pp. 50-51, 
you'll see that you can get an identical Hausman test statistic from a 
vector of contrasts involving just the endogenous regressors.
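
To make the point concrete, here is a sketch of the two forms of the statistic. The notation is mine, not Bowden and Turkington's: the subscript e picks out the coefficients on the endogenous regressors, a superscript minus sign denotes a generalized inverse, and both covariance matrices are assumed to be built from the same estimate of the error variance (which is what the sigmamore option does).

```latex
% Full contrast: all regressors (y2 and x1)
H = (\hat\beta_{IV} - \hat\beta_{OLS})'
    \,[\hat V_{IV} - \hat V_{OLS}]^{-}\,
    (\hat\beta_{IV} - \hat\beta_{OLS})

% Contrast restricted to the endogenous regressors only (y2)
H_e = (\hat\beta_{e,IV} - \hat\beta_{e,OLS})'
      \,[\hat V_{e,IV} - \hat V_{e,OLS}]^{-1}\,
      (\hat\beta_{e,IV} - \hat\beta_{e,OLS})

% The claim is H = H_e, with degrees of freedom equal to dim(y2)
```

The generalized inverse is needed in the first form because, with a common error-variance estimate, the difference of the two covariance matrices has rank equal to the number of endogenous regressors rather than the total number of regressors.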

> If I want to compare only the 
> endogenous variable (y2, a scalar) between OLS and IV, under 
> homoskedasticity I can follow Wooldridge (2002), page 120, but under
> heteroskedasticity, what should I do?

ivreg2 will do this.  Using your example above,

ivreg2 y1 x1 y2 (=x2), robust

will give you a C-test statistic that is identical to a heteroskedasticity-
robust Hausman statistic.

> Question 3:
> whitetst and ivhettest seem to require a lot of memory and space. I
> constantly run into errors, even in Stata SE with 600M of memory 
> specified. Is there a trick here?

My guess is that you are using the default choice of heteroskedasticity 
indicators, which is the full set of exogenous variables, their squares, 
and their *cross-products*.  This can add up to LOTS of variables very 
quickly.  You can cut down on the use of memory (and degrees of freedom) by 
choosing some other indicator variables.  One common choice is to use the 
fitted value of the dependent variable and its square (see the help files 
for whitetst and ivhettest for how to choose this option).
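
To see how quickly the default set grows, here is a rough count written in Python purely as an illustration (it is not Stata code, and the function name is made up for this example). It assumes k exogenous variables and counts levels, squares, and pairwise cross-products, ignoring the constant and any terms that would be dropped for collinearity (for example, squares of dummy variables), so treat the numbers as an upper bound.

```python
from itertools import combinations


def white_indicator_count(k: int) -> int:
    """Upper bound on the number of indicators built from k variables:
    levels + squares + pairwise cross-products."""
    levels = k
    squares = k
    # One cross-product per unordered pair of distinct variables: k*(k-1)/2
    cross_products = sum(1 for _ in combinations(range(k), 2))
    return levels + squares + cross_products


# Compare with the fitted value and its square, which is always 2 indicators.
for k in (5, 20, 50):
    print(f"k = {k:2d}: {white_indicator_count(k):4d} indicators (vs. 2)")
```

With 50 exogenous variables the default set is already well over a thousand indicators, which is why switching to the fitted value and its square saves so much memory.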

Hope this helps.


> Thanks in advance for any help.
> Zhehui

Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008

