Re: st: Comparing weighted and unweighted distributions via Chi2 test

From   Steven Samuels
Subject   Re: st: Comparing weighted and unweighted distributions via Chi2 test
Date   Thu, 9 Jun 2011 10:47:42 -0400


Response weights are intended to reduce bias. There are a number of circumstances in which I would not apply them, at least not unaltered. A p-value > 0.05 from a test comparing the original and response-weighted estimates is not one of those circumstances.

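One such circumstance is when the estimated root mean squared error (RMSE), sqrt(bias^2 + se^2), of the originally weighted estimate is less than the standard error of the response-weighted estimate. (The assumption is that the response-weighted estimate is the less biased of the two, so the bias of the original estimate is estimated by the difference between the two estimates.)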

The code below estimates the RMSE of a mean under the original weights and compares it with the standard error under the response weights. In this artificial example, the response-weighted SE is itself smaller than the original SE, which will not always be the case. Following that is a calculation of the CI for the difference using -svy: reg-. There I stack two copies of the data, one per weight version, which doubles the number of observations; because the PSUs remain the same, the standard errors are still valid. You can set up chi-square tests in the same way: include wt_version as a category.

Note that a CI from a two-sample t-test based on the separate estimates and SEs would be wrong, because both estimates are calculated from the same data and so are not independent.


scalar drop _all
sysuse auto, clear
gen orig_wt = turn
/* Original Weights */
svyset rep78 [pw= orig_wt]
svy: mean mpg

// svy: mean is e-class, so take the estimate and SE from _b[] and _se[]
scalar m1 = _b[mpg]
scalar se1 = _se[mpg]
gen wt_version =1
tempfile t1
save `t1'
/* Response Weighting to reduce Bias */
gen resp_wt = length  //non-response weight
svyset rep78 [pw=resp_wt]
svy: mean mpg
// svy: mean is e-class, so take the estimate and SE from _b[] and _se[]
scalar m2 = _b[mpg]
scalar se2 = _se[mpg]

// RMSE1  for original weighting
scalar rmse1 = sqrt((m1 - m2)^2 + se1^2)

scalar list  m1 m2 se1 rmse1 se2  //compare last two

/* Assess difference in the two means */
replace wt_version =2
append using `t1'
gen wt_new = orig_wt if wt_version==1
replace wt_new = resp_wt if wt_version==2
svyset rep78 [pw=wt_new]
xi: svy:  reg mpg i.wt_version  //coefficient on wt_version = difference in means; its CI and t-test apply to the difference
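
/* Chi-square analogue: a minimal sketch, under the assumptions noted here */
// Assumes the stacked data and the -svyset rep78 [pw=wt_new]- set up above.
// -foreign- is only an illustrative categorical outcome; substitute your own.
// The design-based test of independence between foreign and wt_version
// assesses whether the distribution differs between the two weight versions.
svy: tab foreign wt_version, column pearson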

On Jun 8, 2011, at 5:05 PM, Duru wrote:

Dear all,

In order to test if my nonresponse weights change my survey outcomes
substantially, I want to conduct Chi2 tests or t-tests between
weighted and unweighted distributions/means for a number of variables.
Since I don't know a practical way to do this in Stata, I have to
insert weighted and unweighted frequency tables or calculate t-values
manually from weighted and unweighted mean and variance estimates. Any
ideas on how to do it more easily? (using Stata 10.1)


