
From: "James W. Shaw" <[email protected]>
To: <[email protected]>
Subject: st: How to perform Hausman test for random effects specification with survey data
Date: Wed, 18 Aug 2004 12:38:24 -0400

Dear Statalisters:

I have a question about Stata's -suest- command that I hope someone may be able to answer for me. I have seen it asked by others a few times before over the past year without any response. It is my understanding that the Hausman test, which is often used to evaluate the consistency of the estimates from random effects models, cannot be used with survey (ie, clustered, probability-weighted) data. I was wondering if the -suest- command could be used to implement a valid version of the Hausman test (for comparing random and fixed effects specifications) for use with survey data. I have done so using the code given at the end of this message.

Some background first. I have data from a multistage probability sample of the US population (n=3773) with oversamples of blacks and Hispanics. I am interested in estimating a design-consistent model allowing for a respondent-level random effect. I wish to compare the random effects specification against the corresponding fixed effects model using the Hausman test.

To estimate the random effects model, I do the following:

(1) generate weighted estimates of the variance components
(2) apply a GLS transform to the data
(3) estimate the model from the transformed data using -regress-

According to Korn and Graubard, the above procedure may not always work. It does in my case because I have a large number of sufficiently large PSUs. The parameter estimates and standard errors I get are equivalent to those derived when using SUDAAN (which estimates the corresponding covariance pattern model).
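For reference, step (2) appears to be the usual random-effects quasi-demeaning transform. The following is a sketch written from the code at the end of the message, not a statement of the author's derivation. It assumes T = 12 observations per respondent (inferred from the constant 12 in the -gen theta- line) and that the respondent means are the weighted means produced by -collapse-. The symbols are illustrative and do not appear in the original post.

```latex
% Sketch of the GLS (quasi-demeaning) transform implied by the posted code.
% sigma_u, sigma_e: variance-component estimates returned by -xtreg, mle-
% T = 12 is an assumption read off the -gen theta- line
\[
  \theta \;=\; 1 - \sqrt{\frac{\sigma_e^{2}}{T\,\sigma_u^{2} + \sigma_e^{2}}},
  \qquad T = 12
\]
\[
  y_{it}^{*} \;=\; y_{it} - \theta\,\bar{y}_{i},
  \qquad
  x_{it}^{*} \;=\; x_{it} - \theta\,\bar{x}_{i},
  \qquad
  1^{*} \;=\; 1 - \theta
\]
% \bar{y}_i and \bar{x}_i are the respondent-level (weighted) means;
% 1^{*} is the transformed constant (one3 in the code), which is why
% the transformed regression is fit with -nocons-.
```

In the code, depvar3, m13 through c323, and one3 play the roles of the starred quantities.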
To perform the Hausman test, I do the following:

(1) I concatenate the GLS-transformed and original data using -append-
(2) Using -regress- with the score option, I estimate the random effects model from the GLS-transformed data and save the estimates
(3) Using -regress- with the score option, I estimate the fixed effects model from the original data (including dummies for respondents) and save the estimates
(4) I perform the simultaneous estimation using -suest- with the svy option
(5) I perform Hausman's test for the consistency of the random effects model by testing the difference between the two coefficient vectors (excluding the constant and fixed effects)

The above procedure seems to work. -suest- gives me the correct parameter estimates and standard errors for the two models. However, I notice that I am only able to test for differences in 8 coefficients simultaneously. There were 12 independent variables in each model (excluding the constant and respondent dummies in the fixed effects specification). Interestingly, it does not seem to matter which 8 coefficients I test. I always get the same statistical result (ie, F and p values). My thought is that this must somehow be related to the fact that my data are clustered (ie, that I am allowing for clustering at the level of the PSU). In other words, I think it may be a peculiarity of my data and that the code I present below is working correctly. Does this sound plausible?

Any feedback you could provide me with would be greatly appreciated. Thank you very much.

Regards,

Jim

James W. Shaw, PhD, PharmD, MPH
Post-Doctoral Fellow
Tobacco Control Research Branch
Behavioral Research Program
Division of Cancer Control and Population Sciences
National Cancer Institute

/* STATA CODE */

/* GLS TRANSFORM DATA */
collapse (mean) depvar m1-a2 d1 c3 c32 [pw = ttowgt], by(rti_id)
ren depvar depvar2
ren m1 m12
ren m2 m22
ren s1 s12
ren s2 s22
ren u1 u12
ren u2 u22
ren p1 p12
ren p2 p22
ren a1 a12
ren a2 a22
ren c3 c3n
ren c32 c32n
sort rti_id
save "E:\Dissertation\Data\temp1", replace
use "E:\Dissertation\Data\tempus.dta", clear
drop _merge
sort rti_id
merge rti_id using "E:\Dissertation\Data\temp1"
xtreg depvar m1-a2 c3 c32 [iw = ttowgt], i(rti_id) mle
gen theta = 1 - sqrt(e(sigma_e)^2/(12*e(sigma_u)^2 + e(sigma_e)^2))
gen depvar3 = depvar - theta*depvar2
gen m13 = m1 - theta*m12
gen m23 = m2 - theta*m22
gen s13 = s1 - theta*s12
gen s23 = s2 - theta*s22
gen u13 = u1 - theta*u12
gen u23 = u2 - theta*u22
gen p13 = p1 - theta*p12
gen p23 = p2 - theta*p22
gen a13 = a1 - theta*a12
gen a23 = a2 - theta*a22
gen c33 = c3 - theta*c3n
gen c323 = c32 - theta*c32n
gen one = 1
summ one
scalar omean = r(mean)
gen one3 = one - theta*omean

/* SAVE TRANSFORMED DATA FOR RANDOM EFFECTS ESTIMATION */
gen res = 1
sort psu rti_id time
save "E:\Dissertation\Data\temp1", replace

/* RENAME RAW (UNTRANSFORMED) VARIABLES FOR FIXED EFFECTS ESTIMATION */
use "E:\Dissertation\Data\tempus.dta", clear
ren depvar depvar3
ren m1 m13
ren m2 m23
ren s1 s13
ren s2 s23
ren u1 u13
ren u2 u23
ren p1 p13
ren p2 p23
ren a1 a13
ren a2 a23
ren c3 c33
ren c32 c323
gen one3 = 1
gen res = 0

/* APPEND TRANSFORMED DATA TO RAW DATA */
sort psu rti_id time
append using "E:\Dissertation\Data\temp1"

/* ESTIMATE RANDOM EFFECTS MODEL */
svyset [pw = ttowgt], psu(psu)
reg depvar3 one3 m13-a23 c33 c323 if res == 1 [iw = ttowgt], score(RE) nocons
est store RE

/* ESTIMATE FIXED EFFECTS MODEL */
tab rti_id, gen(id)
reg depvar3 one3 m13-a23 c33 c323 id2-id3773 if res == 0 [iw = ttowgt], score(FE) nocons
est store FE

/* USE -SUEST- TO PERFORM HAUSMAN TEST */
suest RE FE, svy
test [RE_mean = FE_mean]: m13 m23 s13 s23 u13 u23 p13 p23 a13 a23 c33 c323

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
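The comparison in step (5) of the message is a Wald-type form of the Hausman statistic. The following sketch uses illustrative notation that is not in the original post. It assumes \(\hat\beta_{RE}\) and \(\hat\beta_{FE}\) are the slope coefficients common to both models (12 here), and that the variance and covariance blocks come from the joint design-based covariance matrix that -suest- reports.

```latex
% Sketch of a generalized (robust) Hausman statistic; notation is illustrative.
% k = number of coefficients compared (12 slopes in the posted models)
\[
  d \;=\; \hat\beta_{FE} - \hat\beta_{RE},
  \qquad
  \widehat{V}(d) \;=\; \widehat{V}_{FE} + \widehat{V}_{RE}
      - \widehat{C}_{FE,RE} - \widehat{C}_{RE,FE}
\]
\[
  H \;=\; d^{\top}\,\widehat{V}(d)^{-1}\,d
\]
% The classical Hausman test replaces V(d) by V_FE - V_RE, which is valid
% only when the random effects estimator is fully efficient. That condition
% is not expected to hold with probability weights and clustering, so the
% cross-covariance blocks C from the joint estimation are needed.
```

The statistic requires \(\widehat{V}(d)\) to be invertible, so it can only test as many coefficients jointly as the rank of that matrix allows.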
