
Re: st: How to perform Hausman test for random effects specification with survey data

From	Mark Schaffer <[email protected]>
To	[email protected], "James W. Shaw" <[email protected]>
Subject	Re: st: How to perform Hausman test for random effects specification with survey data
Date	Wed, 18 Aug 2004 21:58:19 +0100 (BST)

James,

This isn't a direct answer to your question, but might be helpful anyway.

It's possible to implement a version of the Hausman test that is robust to 
heteroskedasticity and hence (I think) suitable for clustered, 
probability-weighted data.

You can do this by carrying out the artificial regression version of the 
test.  This seems particularly appropriate in your case since in step (3) 
of your estimation below, you are estimating the GLS version via a 
regression on the quasi-demeaned data.  To do the artificial regression 
version of the Hausman test, you run the same regression but add, as extra 
regressors, the time-demeaned versions of the time-varying regressors.  The 
Hausman test is then just a Wald test of the joint significance of the 
coefficients on these additional time-demeaned regressors.
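
For concreteness, here is a sketch of the regression just described.  The 
notation is my own shorthand rather than a quotation from any textbook: 
y_it is the dependent variable, x_it the full regressor vector, w_it the 
time-varying subset of x_it, bars denote respondent-level means, T is the 
number of observations per respondent, and theta is the same GLS weight 
computed in the quoted code below (where T = 12).

```latex
\theta = 1 - \sqrt{\frac{\sigma_e^2}{T\sigma_u^2 + \sigma_e^2}}, \qquad
\check{y}_{it} = y_{it} - \theta\,\bar{y}_i, \qquad
\check{x}_{it} = x_{it} - \theta\,\bar{x}_i, \qquad
\ddot{w}_{it} = w_{it} - \bar{w}_i

\check{y}_{it} = \check{x}_{it}\,\beta + \ddot{w}_{it}\,\xi + \text{error}_{it},
\qquad H_0:\ \xi = 0
```

Rejecting H_0 is evidence against the random effects specification.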

The convenience of this for your application is that if you estimate this 
artificial regression using -robust- and -cluster-, you should get a 
Hausman test that is suitable for your clustered, probability-weighted 
data.
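
In terms of the variable names in your code below, the artificial regression 
might look roughly like the following.  This is an untested sketch, not a 
checked implementation.  It assumes that temp1.dta still contains the raw 
regressors and their respondent-level means alongside the GLS-transformed 
variables, that all twelve regressors vary over time, and that the sampling 
weight is constant within respondent.  The *_dm variable names are made up 
for illustration.

```stata
* Sketch only: assumes temp1.dta holds the GLS-transformed variables
* (depvar3, one3, m13 ... c323) together with the raw regressors and their
* respondent-level means (m12 ... a22, c3n, c32n), as built in the quoted code.
use "E:\Dissertation\Data\temp1", clear

* Time-demean each regressor: raw value minus respondent-level mean.
* The *_dm names are hypothetical.
foreach v in m1 m2 s1 s2 u1 u2 p1 p2 a1 a2 {
    gen `v'_dm = `v' - `v'2
}
* c3 and c32 use a different naming pattern for their means.
gen c3_dm  = c3  - c3n
gen c32_dm = c32 - c32n

* Artificial regression: the quasi-demeaned (RE) regression augmented with
* the time-demeaned regressors.  pweights imply robust standard errors;
* cluster(psu) additionally allows arbitrary correlation within PSU.
regress depvar3 one3 m13 m23 s13 s23 u13 u23 p13 p23 a13 a23 c33 c323 ///
    m1_dm m2_dm s1_dm s2_dm u1_dm u2_dm p1_dm p2_dm a1_dm a2_dm       ///
    c3_dm c32_dm [pw = ttowgt], nocons cluster(psu)

* Robust Hausman test: joint Wald test on the added time-demeaned terms.
testparm *_dm
```

Any regressor that turns out to be constant within respondent will have a 
time-demeaned version that is identically zero; Stata will drop it, and it 
should be left out of the test.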

You can find a full description of this artificial regression test in 
Wooldridge's (2002) book, Econometric Analysis of Cross Section and Panel 
Data, pp. 290-91.  Note that when Wooldridge recommends on p. 291 making 
the Wald test robust to serial correlation as well as heteroskedasticity, 
he is in effect recommending using -cluster- together with -robust-.

Hope this helps.

Cheers,
Mark

Quoting "James W. Shaw" <[email protected]>:

> Dear Statalisters:
> 
> I have a question about Stata's -suest- command that I hope someone may be
> able to answer for me.  I have seen it asked by others a few times before
> over the past year without any response.
> 
> It is my understanding that the Hausman test, which is often used to
> evaluate the consistency of the estimates from random effects models, cannot
> be used with survey (ie, clustered, probability-weighted) data.  I was
> wondering if the -suest- command could be used to implement a valid version
> of the Hausman test (for comparing random and fixed effects specifications)
> for use with survey data.  I have done so using the code given at the end of
> this message.
> 
> Some background first.  I have data from a multistage probability sample of
> the US population (n=3773) with oversamples of blacks and Hispanics.  I am
> interested in estimating a design-consistent model allowing for a
> respondent-level random effect.  I wish to compare the random effects
> specification against the corresponding fixed effects model using the
> Hausman test.  To estimate the random effects model, I do the following:
> 
> (1) generate weighted estimates of the variance components
> (2) apply a GLS transform to the data
> (3) estimate the model from the transformed data using -regress-
> 
> According to Korn and Graubard, the above procedure may not always work.  It
> does in my case because I have a large number of sufficiently large PSUs.
> The parameter estimates and standard errors I get are equivalent to those
> derived when using SUDAAN (which estimates the corresponding covariance
> pattern model).
> 
> To perform the Hausman test, I do the following:
> 
> (1) I concatenate the GLS-transformed and original data using -append-
> (2) Using -regress- with the score option, I estimate the random effects
> model from the GLS-transformed data and save the estimates
> (3) Using -regress- with the score option, I estimate the fixed effects
> model from the original data (including dummies for respondents) and save
> the estimates
> (4) I perform the simultaneous estimation using -suest- with the svy option
> (5) I perform Hausman's test for the consistency of the random effects model
> by testing the difference between the two coefficient vectors (excluding the
> constant and fixed effects)
> 
> The above procedure seems to work.  -suest- gives me the correct parameter
> estimates and standard errors for the two models.  However, I notice that I
> am only able to test for differences in 8 coefficients simultaneously.
> There were 12 independent variables in each model (excluding the constant
> and respondent dummies in the fixed effects specification).  Interestingly,
> it does not seem to matter which 8 coefficients I test.  I always get the
> same statistical result (ie, F and p values).  My thought is that this must
> somehow be related to the fact that my data are clustered (ie, that I am
> allowing for clustering at the level of the PSU).  In other words, I think
> it may be a peculiarity of my data and that the code I present below is
> working correctly.  Does this sound plausible?
> 
> Any feedback you could provide me with would be greatly appreciated.  Thank
> you very much.
> 
> Regards,
> 
> Jim
> 
> James W. Shaw, PhD, PharmD, MPH
> Post-Doctoral Fellow
> Tobacco Control Research Branch
> Behavioral Research Program
> Division of Cancer Control and Population Sciences
> National Cancer Institute
> 
> 
> /* STATA CODE */
> 
> /* GLS TRANSFORM DATA */
> 
> collapse (mean) depvar m1-a2 d1 c3 c32 [pw = ttowgt], by(rti_id)
> ren depvar depvar2
> ren m1 m12
> ren m2 m22
> ren s1 s12
> ren s2 s22
> ren u1 u12
> ren u2 u22
> ren p1 p12
> ren p2 p22
> ren a1 a12
> ren a2 a22
> ren c3 c3n
> ren c32 c32n
> sort rti_id
> save "E:\Dissertation\Data\temp1", replace
> use "E:\Dissertation\Data\tempus.dta", clear
> drop _merge
> sort rti_id
> merge rti_id using "E:\Dissertation\Data\temp1"
> 
> xtreg depvar m1-a2 c3 c32 [iw = ttowgt], i(rti_id) mle
> 
> gen theta = 1 - sqrt(e(sigma_e)^2/(12*e(sigma_u)^2 + e(sigma_e)^2))
> gen depvar3 = depvar - theta*depvar2
> gen m13 = m1- theta*m12
> gen m23 = m2 - theta*m22
> gen s13 = s1 - theta*s12
> gen s23 = s2 - theta*s22
> gen u13 = u1- theta*u12
> gen u23 = u2 - theta*u22
> gen p13 = p1- theta*p12
> gen p23 = p2- theta*p22
> gen a13 = a1 - theta*a12
> gen a23 = a2- theta*a22
> gen c33 = c3- theta*c3n
> gen c323 = c32- theta*c32n
> gen one = 1
> summ one
> scalar omean = r(mean)
> gen one3 = one - theta*omean
> 
> /* SAVE TRANSFORMED DATA FOR RANDOM EFFECTS ESTIMATION */
> 
> gen res = 1
> sort psu rti_id time
> save "E:\Dissertation\Data\temp1", replace
> 
> /* RENAME RAW (UNTRANSFORMED) VARIABLES FOR FIXED EFFECTS ESTIMATION */
> 
> use "E:\Dissertation\Data\tempus.dta", clear
> ren depvar depvar3
> ren m1 m13
> ren m2 m23
> ren s1 s13
> ren s2 s23
> ren u1 u13
> ren u2 u23
> ren p1 p13
> ren p2 p23
> ren a1 a13
> ren a2 a23
> ren c3 c33
> ren c32 c323
> gen one3 = 1
> gen res = 0
> 
> /* APPEND TRANSFORMED DATA TO RAW DATA */
> 
> sort psu rti_id time
> append using "E:\Dissertation\Data\temp1"
> 
> /* ESTIMATE RANDOM EFFECTS MODEL */
> 
> svyset [pw = ttowgt], psu(psu)
> reg depvar3 one3 m13-a23 c33 c323 if res == 1 [iw = ttowgt], score(RE) nocons
> est store RE
> 
> /* ESTIMATE FIXED EFFECTS MODEL */
> 
> tab rti_id, gen(id)
> reg depvar3 one3 m13-a23 c33 c323 id2-id3773 if res == 0 [iw = ttowgt], score(FE) nocons
> est store FE
> 
> /* USE -SUEST- TO PERFORM HAUSMAN TEST */
> 
> suest RE FE, svy
> test [RE_mean = FE_mean]: m13 m23 s13 s23 u13 u23 p13 p23 a13 a23 c33 c323
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



Prof. Mark Schaffer
Director, CERT
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS
tel +44-131-451-3494 / fax +44-131-451-3008
email: [email protected]
web: http://www.sml.hw.ac.uk/ecomes
