The Stata listserver

Re: st: How to perform Hausman test for random effects specification with survey data


From   Mark Schaffer <M.E.Schaffer@hw.ac.uk>
To   statalist@hsphsun2.harvard.edu, "James W. Shaw" <shaw@pharmacy.arizona.edu>
Subject   Re: st: How to perform Hausman test for random effects specification with survey data
Date   Sat, 21 Aug 2004 21:54:07 +0100 (BST)

James,

I have a feeling there is something a little fishy going on here.  The 
artificial regression test described by Wooldridge is, if I'm not 
mistaken, *exactly* the traditional Hausman test.

More precisely, if you do the artificial regression version, and test all 
the testable coefficients using a standard (not robust) Wald test, you get 
a Hausman stat that is equivalent to the one you would get using -hausman- 
except that it's guaranteed to be positive-definite because it uses a 
single estimate of the error variance throughout.
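
As a rough illustration of the artificial regression version, here is a minimal sketch. The names y, x1, x2 and id are hypothetical placeholders, not taken from the posted code, and the panel is assumed to be balanced with 12 observations per respondent (the 12 that appears in the theta formula of the original post). It assumes a recent Stata in which -xtset- is available.

```stata
* Hypothetical names: y (outcome), x1 x2 (time-varying regressors),
* id (panel identifier).  Balanced panel assumed.
xtset id
xtreg y x1 x2, re

scalar T = 12     // assumed panel length; replace with the actual value
scalar theta = 1 - sqrt(e(sigma_e)^2 / (T*e(sigma_u)^2 + e(sigma_e)^2))

* Quasi-demeaned versions of the outcome and the regressors
foreach v of varlist y x1 x2 {
    egen double mean_`v' = mean(`v'), by(id)
    gen double qd_`v' = `v' - theta*mean_`v'
}

* Time-demeaned versions of the regressors only
foreach v of varlist x1 x2 {
    gen double td_`v' = `v' - mean_`v'
}

* Artificial regression: quasi-demeaned data plus time-demeaned regressors.
* The constant absorbs the quasi-demeaned intercept in a balanced panel.
regress qd_y qd_x1 qd_x2 td_x1 td_x2

* Standard (non-robust) Wald test of the time-demeaned terms
test td_x1 td_x2
```

For the robust variant described in the earlier message, the same regression would be run with probability weights and clustered standard errors (in recent Stata syntax, something like `[pw = wt], vce(cluster psu)`, where wt and psu are again hypothetical names) before the -test- command.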

I'm a little suspicious of the fact that you can test more coefficients 
using the version of the test you developed.  Of your 12 regressors, how 
many, if any, are time-invariant?
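
One quick way to check this is sketched below. It assumes a recent Stata in which -xtset- is available, that the original (untransformed) data are in memory, and that the 12 regressors carry the names used in the posted code.

```stata
* rti_id is the respondent identifier used in the posted code
xtset rti_id

* -xtsum- splits each variable's standard deviation into between and
* within components; a within standard deviation of zero indicates a
* regressor that does not vary over time for any respondent
xtsum m1 m2 s1 s2 u1 u2 p1 p2 a1 a2 c3 c32
```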

Cheers,
Mark

Quoting "James W. Shaw" <shaw@pharmacy.arizona.edu>:

> Mark,
> 
> I performed Wooldridge's test as specified on p. 291 of his text.
> Wooldridge's test converges on a certain set of results (F and p values)
> after four of the time-demeaned coefficients are simultaneously tested.
> That is, I may include up to four of the time-demeaned variables in the
> artificial regression, and the test results are always the same regardless
> of which four are included.  Including more than four time-demeaned
> variables results in variables (either time-demeaned or quasi-demeaned)
> being dropped from the regression due to multicollinearity.
> 
> With the test I developed, I directly compare the fixed effects and random
> effects parameter estimates.  This is akin to the traditional version of the
> Hausman test.  I am able to test for differences between the two
> specifications in up to eight coefficients simultaneously.  Regardless of
> which eight coefficients are tested, I get the same results.  The test I
> developed yields the same results as Wooldridge's test if differences in
> four of the 12 parameters being estimated are simultaneously tested but
> converges on a different set of results when eight coefficients are tested.
> The inference does not change, though (ie, neither the test I developed nor
> Wooldridge's test rejects the null).
> 
> This is very interesting, though I am not certain why I should be able to
> test more coefficients using the method I developed.  Based on the results
> of Wooldridge's test, I think one explanation for why I am able to test
> only a subset of the 12 parameters being estimated is collinearity between
> the quasi-demeaned variables and time-demeaned variables.  All of the
> variables in my model vary both with subject and time.  The artificial
> regression used to perform Wooldridge's test should include quasi-demeaned
> and time-demeaned versions of each variable; however, only a subset of the
> latter may be included.
> 
> I am not sure how I should discuss this in the paper.  Specifically, if it
> is a multicollinearity problem, what should I say the collinearity is
> between?
> 
> --
> Jim
> 
> 
> 
> 
> ----- Original Message -----
> From: "Mark Schaffer" <M.E.Schaffer@hw.ac.uk>
> To: <statalist@hsphsun2.harvard.edu>; "James W. Shaw" <shaw@pharmacy.arizona.edu>
> Cc: "Mark Schaffer" <M.E.Schaffer@hw.ac.uk>
> Sent: Wednesday, August 18, 2004 4:58 PM
> Subject: Re: st: How to perform Hausman test for random effects specification with survey data
> 
> 
> > James,
> >
> > This isn't a direct answer to your question, but might be helpful anyway.
> >
> > It's possible to implement a version of the Hausman test that is robust to
> > heteroskedasticity and hence (I think) clustered, probability-weighted
> > data.
> >
> > You can do this by carrying out the artificial regression version of the
> > test.  This seems particularly appropriate in your case since in step (3)
> > of your estimation below, you are estimating the GLS version via a
> > regression on the quasi-demeaned data.  To do the artificial regression
> > version of the Hausman test, you run the same regression but include the
> > time-varying regressors after you have time-demeaned them.  The Hausman
> > test is just a Wald test of the significance of the coeffs on these
> > time-demeaned additional regressors.
> >
> > The convenience of this for your application is that if you estimate this
> > artificial regression using -robust- and -cluster-, you should get a
> > Hausman test that is suitable for your clustered, probability-weighted
> > data.
> >
> > You can find a full description of this artificial regression test in
> > Wooldridge's (2002) book, Econometric Analysis of Cross Section and Panel
> > Data, pp. 290-91.  Note that when Wooldridge recommends on p. 291 that the
> > Wald test be robust to serial correlation as well as heteroskedasticity,
> > he is in effect recommending using -cluster- together with -robust-.
> >
> > Hope this helps.
> >
> > Cheers,
> > Mark
> >
> > Quoting "James W. Shaw" <shaw@pharmacy.arizona.edu>:
> >
> > > Dear Statalisters:
> > >
> > > I have a question about Stata's -suest- command that I hope someone may be
> > > able to answer for me.  I have seen it asked by others a few times before
> > > over the past year without any response.
> > >
> > > It is my understanding that the Hausman test, which is often used to
> > > evaluate the consistency of the estimates from random effects models, cannot
> > > be used with survey (ie, clustered, probability-weighted) data.  I was
> > > wondering if the -suest- command could be used to implement a valid version
> > > of the Hausman test (for comparing random and fixed effects specifications)
> > > for use with survey data.  I have done so using the code given at the end of
> > > this message.
> > >
> > > Some background first.  I have data from a multistage probability sample of
> > > the US population (n=3773) with oversamples of blacks and Hispanics.  I am
> > > interested in estimating a design-consistent model allowing for a
> > > respondent-level random effect.  I wish to compare the random effects
> > > specification against the corresponding fixed effects model using the
> > > Hausman test.  To estimate the random effects model, I do the following:
> > >
> > > (1) generate weighted estimates of the variance components
> > > (2) apply a GLS transform to the data
> > > (3) estimate the model from the transformed data using -regress-
> > >
> > > According to Korn and Graubard, the above procedure may not always work.  It
> > > does in my case because I have a large number of sufficiently large PSUs.
> > > The parameter estimates and standard errors I get are equivalent to those
> > > derived when using SUDAAN (which estimates the corresponding covariance
> > > pattern model).
> > >
> > > To perform the Hausman test, I do the following:
> > >
> > > (1) I concatenate the GLS-transformed and original data using -append-
> > > (2) Using -regress- with the score option, I estimate the random effects
> > > model from the GLS-transformed data and save the estimates
> > > (3) Using -regress- with the score option, I estimate the fixed effects
> > > model from the original data (including dummies for respondents) and save
> > > the estimates
> > > (4) I perform the simultaneous estimation using -suest- with the svy option
> > > (5) I perform Hausman's test for the consistency of the random effects model
> > > by testing the difference between the two coefficient vectors (excluding the
> > > constant and fixed effects)
> > >
> > > The above procedure seems to work.  -suest- gives me the correct parameter
> > > estimates and standard errors for the two models.  However, I notice that I
> > > am only able to test for differences in 8 coefficients simultaneously.
> > > There were 12 independent variables in each model (excluding the constant
> > > and respondent dummies in the fixed effects specification).  Interestingly,
> > > it does not seem to matter which 8 coefficients I test.  I always get the
> > > same statistical result (ie, F and p values).  My thought is that this must
> > > somehow be related to the fact that my data are clustered (ie, that I am
> > > allowing for clustering at the level of the PSU).  In other words, I think
> > > it may be a peculiarity of my data and that the code I present below is
> > > working correctly.  Does this sound plausible?
> > >
> > > Any feedback you could provide me with would be greatly appreciated.  Thank
> > > you very much.
> > >
> > > Regards,
> > >
> > > Jim
> > >
> > > James W. Shaw, PhD, PharmD, MPH
> > > Post-Doctoral Fellow
> > > Tobacco Control Research Branch
> > > Behavioral Research Program
> > > Division of Cancer Control and Population Sciences
> > > National Cancer Institute
> > >
> > >
> > > /* STATA CODE */
> > >
> > > /* GLS TRANSFORM DATA */
> > >
> > > collapse (mean) depvar m1-a2 d1 c3 c32 [pw = ttowgt], by(rti_id)
> > > ren depvar depvar2
> > > ren m1 m12
> > > ren m2 m22
> > > ren s1 s12
> > > ren s2 s22
> > > ren u1 u12
> > > ren u2 u22
> > > ren p1 p12
> > > ren p2 p22
> > > ren a1 a12
> > > ren a2 a22
> > > ren c3 c3n
> > > ren c32 c32n
> > > sort rti_id
> > > save "E:\Dissertation\Data\temp1", replace
> > > use "E:\Dissertation\Data\tempus.dta", clear
> > > drop _merge
> > > sort rti_id
> > > merge rti_id using "E:\Dissertation\Data\temp1"
> > >
> > > xtreg depvar m1-a2 c3 c32 [iw = ttowgt], i(rti_id) mle
> > >
> > > gen theta = 1 - sqrt(e(sigma_e)^2/(12*e(sigma_u)^2 + e(sigma_e)^2))
> > > gen depvar3 = depvar - theta*depvar2
> > > gen m13 = m1- theta*m12
> > > gen m23 = m2 - theta*m22
> > > gen s13 = s1 - theta*s12
> > > gen s23 = s2 - theta*s22
> > > gen u13 = u1- theta*u12
> > > gen u23 = u2 - theta*u22
> > > gen p13 = p1- theta*p12
> > > gen p23 = p2- theta*p22
> > > gen a13 = a1 - theta*a12
> > > gen a23 = a2- theta*a22
> > > gen c33 = c3- theta*c3n
> > > gen c323 = c32- theta*c32n
> > > gen one = 1
> > > summ one
> > > scalar omean = r(mean)
> > > gen one3 = one - theta*omean
> > >
> > > /* SAVE TRANSFORMED DATA FOR RANDOM EFFECTS ESTIMATION */
> > >
> > > gen res = 1
> > > sort psu rti_id time
> > > save "E:\Dissertation\Data\temp1", replace
> > >
> > > /* RENAME RAW (UNTRANSFORMED) VARIABLES FOR FIXED EFFECTS ESTIMATION */
> > >
> > > use "E:\Dissertation\Data\tempus.dta", clear
> > > ren depvar depvar3
> > > ren m1 m13
> > > ren m2 m23
> > > ren s1 s13
> > > ren s2 s23
> > > ren u1 u13
> > > ren u2 u23
> > > ren p1 p13
> > > ren p2 p23
> > > ren a1 a13
> > > ren a2 a23
> > > ren c3 c33
> > > ren c32 c323
> > > gen one3 = 1
> > > gen res = 0
> > >
> > > /* APPEND TRANSFORMED DATA TO RAW DATA */
> > >
> > > sort psu rti_id time
> > > append using "E:\Dissertation\Data\temp1"
> > >
> > > /* ESTIMATE RANDOM EFFECTS MODEL */
> > >
> > > svyset [pw = ttowgt], psu(psu)
> > > reg depvar3 one3 m13-a23 c33 c323 if res == 1 [iw = ttowgt], score(RE) nocons
> > > est store RE
> > >
> > > /* ESTIMATE FIXED EFFECTS MODEL */
> > >
> > > tab rti_id, gen(id)
> > > reg depvar3 one3 m13-a23 c33 c323 id2-id3773 if res == 0 [iw = ttowgt], score(FE) nocons
> > > est store FE
> > >
> > > /* USE -SUEST- TO PERFORM HAUSMAN TEST */
> > >
> > > suest RE FE, svy
> > > test [RE_mean = FE_mean]: m13 m23 s13 s23 u13 u23 p13 p23 a13 a23 c33 c323
> > >
> > > *
> > > *   For searches and help try:
> > > *   http://www.stata.com/support/faqs/res/findit.html
> > > *   http://www.stata.com/support/statalist/faq
> > > *   http://www.ats.ucla.edu/stat/stata/
> > >
> >
> >
> >
> > Prof. Mark Schaffer
> > Director, CERT
> > Department of Economics
> > School of Management & Languages
> > Heriot-Watt University, Edinburgh EH14 4AS
> > tel +44-131-451-3494 / fax +44-131-451-3008
> > email: m.e.schaffer@hw.ac.uk
> > web: http://www.sml.hw.ac.uk/ecomes
> > ________________________________________________________________
> >
> > DISCLAIMER:
> >
> > This e-mail and any files transmitted with it are confidential
> > and intended solely for the use of the individual or entity to
> > whom it is addressed.  If you are not the intended recipient
> > you are prohibited from using any of the information contained
> > in this e-mail.  In such a case, please destroy all copies in
> > your possession and notify the sender by reply e-mail.  Heriot
> > Watt University does not accept liability or responsibility
> > for changes made to this e-mail after it was sent, or for
> > viruses transmitted through this e-mail.  Opinions, comments,
> > conclusions and other information in this e-mail that do not
> > relate to the official business of Heriot Watt University are
> > not endorsed by it.
> > ________________________________________________________________



Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: m.e.schaffer@hw.ac.uk
web: http://www.sml.hw.ac.uk/ecomes