
From: "James W. Shaw" <shaw@pharmacy.arizona.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: How to perform Hausman test for random effects specification with survey data
Date: Sat, 21 Aug 2004 16:34:07 -0400

Mark,

I performed Wooldridge's test as specified on p. 291 of his text. Wooldridge's test converges on a certain set of results (F and p values) after four of the time-demeaned coefficients are simultaneously tested. That is, I may include up to four of the time-demeaned variables in the artificial regression, and the test results are always the same regardless of which four are included. Including more than four time-demeaned variables results in variables (either time-demeaned or quasi-demeaned) being dropped from the regression due to multicollinearity.

With the test I developed, I directly compare the fixed effects and random effects parameter estimates. This is akin to the traditional version of the Hausman test. I am able to test for differences between the two specifications in up to eight coefficients simultaneously. Regardless of which eight coefficients are tested, I get the same results.

The test I developed yields the same results as Wooldridge's test if differences in four of the 12 parameters being estimated are simultaneously tested, but it converges on a different set of results when eight coefficients are tested. The inference does not change, though (i.e., neither the test I developed nor Wooldridge's test rejects the null). This is very interesting, though I am not certain why I should be able to test more coefficients using the method I developed.

Based on the results of Wooldridge's test, I think one explanation for why I am able to test only a subset of the 12 parameters being estimated is collinearity between the quasi-demeaned variables and the time-demeaned variables. All of the variables in my model vary both with subject and time. The artificial regression used to perform Wooldridge's test should include quasi-demeaned and time-demeaned versions of each variable; however, only a subset of the latter may be included. I am not sure how I should discuss this in the paper.
Specifically, if it is a multicollinearity problem, what should I say the collinearity is between?

-- Jim

----- Original Message -----
From: "Mark Schaffer" <M.E.Schaffer@hw.ac.uk>
To: <statalist@hsphsun2.harvard.edu>; "James W. Shaw" <shaw@pharmacy.arizona.edu>
Cc: "Mark Schaffer" <M.E.Schaffer@hw.ac.uk>
Sent: Wednesday, August 18, 2004 4:58 PM
Subject: Re: st: How to perform Hausman test for random effects specification with survey data

> James,
>
> This isn't a direct answer to your question, but might be helpful anyway.
>
> It's possible to implement a version of the Hausman test that is robust to
> heteroskedasticity and hence (I think) clustered, probability-weighted
> data.
>
> You can do this by carrying out the artificial regression version of the
> test. This seems particularly appropriate in your case since in step (3)
> of your estimation below, you are estimating the GLS version via a
> regression on the quasi-demeaned data. To do the artificial regression
> version of the Hausman test, you run the same regression but include the
> time-varying regressors after you have time-demeaned them. The Hausman
> test is just a Wald test of the significance of the coeffs on these
> time-demeaned additional regressors.
>
> The convenience of this for your application is that if you estimate this
> artificial regression using -robust- and -cluster-, you should get a
> Hausman test that is suitable for your clustered, probability weighted
> data.
>
> You can find a full description of this artificial regression test in
> Wooldridge's (2002) book, Econometric Analysis of Cross-section and Panel
> Data, pp. 290-91. Note that when Wooldridge recommends on p. 291 that the
> Wald test is robust to serial correlation as well as heteroskedasticity,
> he is in effect recommending using -cluster- together with -robust-.
>
> Hope this helps.
>
> Cheers,
> Mark
>
> Quoting "James W. Shaw" <shaw@pharmacy.arizona.edu>:
>
> > Dear Statalisters:
> >
> > I have a question about Stata's -suest- command that I hope someone may be
> > able to answer for me. I have seen it asked by others a few times before
> > over the past year without any response.
> >
> > It is my understanding that the Hausman test, which is often used to
> > evaluate the consistency of the estimates from random effects models, cannot
> > be used with survey (ie, clustered, probability-weighted) data. I was
> > wondering if the -suest- command could be used to implement a valid version
> > of the Hausman test (for comparing random and fixed effects specifications)
> > for use with survey data. I have done so using the code given at the end of
> > this message.
> >
> > Some background first. I have data from a multistage probability sample of
> > the US population (n=3773) with oversamples of blacks and Hispanics. I am
> > interested in estimating a design-consistent model allowing for a
> > respondent-level random effect. I wish to compare the random effects
> > specification against the corresponding fixed effects model using the
> > Hausman test. To estimate the random effects model, I do the following:
> >
> > (1) generate weighted estimates of the variance components
> > (2) apply a GLS transform to the data
> > (3) estimate the model from the transformed data using -regress-
> >
> > According to Korn and Graubard, the above procedure may not always work. It
> > does in my case because I have a large number of sufficiently large PSUs.
> > The parameter estimates and standard errors I get are equivalent to those
> > derived when using SUDAAN (which estimates the corresponding covariance
> > pattern model).
> >
> > To perform the Hausman test, I do the following:
> >
> > (1) I concatenate the GLS-transformed and original data using -append-
> > (2) Using -regress- with the score option, I estimate the random effects
> > model from the GLS-transformed data and save the estimates
> > (3) Using -regress- with the score option, I estimate the fixed effects
> > model from the original data (including dummies for respondents) and save
> > the estimates
> > (4) I perform the simultaneous estimation using -suest- with the svy option
> > (5) I perform Hausman's test for the consistency of the random effects model
> > by testing the difference between the two coefficient vectors (excluding the
> > constant and fixed effects)
> >
> > The above procedure seems to work. -suest- gives me the correct parameter
> > estimates and standard errors for the two models. However, I notice that I
> > am only able to test for differences in 8 coefficients simultaneously.
> > There were 12 independent variables in each model (excluding the constant
> > and respondent dummies in the fixed effects specification). Interestingly,
> > it does not seem to matter which 8 coefficients I test. I always get the
> > same statistical result (ie, F and p values). My thought is that this must
> > somehow be related to the fact that my data are clustered (ie, that I am
> > allowing for clustering at the level of the PSU). In other words, I think
> > it may be a peculiarity of my data and that the code I present below is
> > working correctly. Does this sound plausible?
> >
> > Any feedback you could provide me with would be greatly appreciated. Thank
> > you very much.
> >
> > Regards,
> >
> > Jim
> >
> > James W. Shaw, PhD, PharmD, MPH
> > Post-Doctoral Fellow
> > Tobacco Control Research Branch
> > Behavioral Research Program
> > Division of Cancer Control and Population Sciences
> > National Cancer Institute
> >
> >
> > /* STATA CODE */
> >
> > /* GLS TRANSFORM DATA */
> >
> > collapse (mean) depvar m1-a2 d1 c3 c32 [pw = ttowgt], by(rti_id)
> > ren depvar depvar2
> > ren m1 m12
> > ren m2 m22
> > ren s1 s12
> > ren s2 s22
> > ren u1 u12
> > ren u2 u22
> > ren p1 p12
> > ren p2 p22
> > ren a1 a12
> > ren a2 a22
> > ren c3 c3n
> > ren c32 c32n
> > sort rti_id
> > save "E:\Dissertation\Data\temp1", replace
> > use "E:\Dissertation\Data\tempus.dta", clear
> > drop _merge
> > sort rti_id
> > merge rti_id using "E:\Dissertation\Data\temp1"
> >
> > xtreg depvar m1-a2 c3 c32 [iw = ttowgt], i(rti_id) mle
> >
> > gen theta = 1 - sqrt(e(sigma_e)^2/(12*e(sigma_u)^2 + e(sigma_e)^2))
> > gen depvar3 = depvar - theta*depvar2
> > gen m13 = m1- theta*m12
> > gen m23 = m2 - theta*m22
> > gen s13 = s1 - theta*s12
> > gen s23 = s2 - theta*s22
> > gen u13 = u1- theta*u12
> > gen u23 = u2 - theta*u22
> > gen p13 = p1- theta*p12
> > gen p23 = p2- theta*p22
> > gen a13 = a1 - theta*a12
> > gen a23 = a2- theta*a22
> > gen c33 = c3- theta*c3n
> > gen c323 = c32- theta*c32n
> > gen one = 1
> > summ one
> > scalar omean = r(mean)
> > gen one3 = one - theta*omean
> >
> > /* SAVE TRANSFORMED DATA FOR RANDOM EFFECTS ESTIMATION */
> >
> > gen res = 1
> > sort psu rti_id time
> > save "E:\Dissertation\Data\temp1", replace
> >
> > /* RENAME RAW (UNTRANSFORMED) VARIABLES FOR FIXED EFFECTS ESTIMATION */
> >
> > use "E:\Dissertation\Data\tempus.dta", clear
> > ren depvar depvar3
> > ren m1 m13
> > ren m2 m23
> > ren s1 s13
> > ren s2 s23
> > ren u1 u13
> > ren u2 u23
> > ren p1 p13
> > ren p2 p23
> > ren a1 a13
> > ren a2 a23
> > ren c3 c33
> > ren c32 c323
> > gen one3 = 1
> > gen res = 0
> >
> > /* APPEND TRANSFORMED DATA TO RAW DATA */
> >
> > sort psu rti_id time
> > append using "E:\Dissertation\Data\temp1"
> >
> > /* ESTIMATE RANDOM EFFECTS MODEL */
> >
> > svyset [pw = ttowgt], psu(psu)
> > reg depvar3 one3 m13-a23 c33 c323 if res == 1 [iw = ttowgt], score(RE) nocons
> > est store RE
> >
> > /* ESTIMATE FIXED EFFECTS MODEL */
> >
> > tab rti_id, gen(id)
> > reg depvar3 one3 m13-a23 c33 c323 id2-id3773 if res == 0 [iw = ttowgt], score(FE) nocons
> > est store FE
> >
> > /* USE -SUEST- TO PERFORM HAUSMAN TEST */
> >
> > suest RE FE, svy
> > test [RE_mean = FE_mean]: m13 m23 s13 s23 u13 u23 p13 p23 a13 a23 c33 c323
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >
>
> Prof. Mark Schaffer
> Director, CERT
> Department of Economics
> School of Management & Languages
> Heriot-Watt University, Edinburgh EH14 4AS
> tel +44-131-451-3494 / fax +44-131-451-3008
> email: m.e.schaffer@hw.ac.uk
> web: http://www.sml.hw.ac.uk/ecomes
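
For reference, the two transformations and the artificial regression discussed in this thread can be written out as follows. This is only a sketch in notation that does not appear in the messages themselves: $x_{it}$ is taken to be the row vector of regressors for respondent $i$ at time $t$, $\bar{x}_i$ its (weighted) respondent mean, $w_{it}$ the subset of time-varying regressors added to the regression, and $\xi$ their coefficient vector. Reading the 12 in the posted formula for theta as the number of observations per respondent, $T$, is an assumption.

```latex
% GLS weight, as in the posted Stata line for theta (T = 12 is an assumption)
\theta = 1 - \sqrt{\frac{\sigma_e^2}{T\,\sigma_u^2 + \sigma_e^2}}

% Quasi-demeaned (random effects) and time-demeaned (fixed effects) data
\check{y}_{it} = y_{it} - \theta\,\bar{y}_i , \qquad
\check{x}_{it} = x_{it} - \theta\,\bar{x}_i , \qquad
\ddot{x}_{it} = x_{it} - \bar{x}_i

% Artificial regression form of the Hausman test: Wald test of H_0
\check{y}_{it} = \check{x}_{it}\,\beta + \ddot{w}_{it}\,\xi + \text{error} ,
\qquad H_0 : \xi = 0

% Algebraic link between the two sets of regressors
\check{x}_{it} = \ddot{x}_{it} + (1 - \theta)\,\bar{x}_i
```

The last line shows that each quasi-demeaned variable differs from its time-demeaned counterpart only by a multiple of the respondent mean. That may be a useful way to frame the collinearity question raised above, although it does not by itself explain why only four time-demeaned variables could be included.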

**Follow-Ups**:
- **Re: st: How to perform Hausman test for random effects specification with survey data** *From:* Mark Schaffer <M.E.Schaffer@hw.ac.uk>

**References**:
- **st: How to perform Hausman test for random effects specification with survey data** *From:* "James W. Shaw" <shaw@pharmacy.arizona.edu>
- **Re: st: How to perform Hausman test for random effects specification with survey data** *From:* Mark Schaffer <M.E.Schaffer@hw.ac.uk>
