The Stata listserver

Re: st: Using survey weights in regression models

From   Richard Williams
Subject   Re: st: Using survey weights in regression models
Date   Thu, 27 Nov 2003 09:11:42 -0500

At 09:19 AM 11/24/2003 -0600, Dick Campbell wrote:
Regarding the recent discussion of how to handle
weights in regression, I have found the following
paper to be very informative.

TI:     Sampling Weights and Regression Analysis
AU:     Winship, Christopher; Radbill, Larry
SO:     Sociological Methods and Research, 1994, 23, 2, Nov, 230-257

I just read this and it is very good. Thanks for recommending it.

One question I still have: a central argument of the paper is that most programs get the standard errors wrong when weights are used. The article was written in 1994; is that still true of Stata in 2003? In particular, do the svy commands (which I have never used) take care of the concerns raised by the authors?

To paraphrase their key arguments: where sampling weights are solely a function of independent variables (IVs) included in the model (e.g. minorities are oversampled and race is an IV), unweighted OLS estimates are preferred because they are unbiased, consistent, and have smaller standard errors than weighted OLS estimates (because weighting induces heteroskedasticity in the error terms).
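
For anyone who wants to see this argument in numbers, here is a minimal simulation sketch in plain Python (not Stata, and not taken from the paper). Everything in it is an assumption made for illustration: the model y = 1 + 2x + e with homoskedastic errors, the rule that cases with x < 0.5 are sampled at one tenth the rate of the others, the sample size, the number of replications, and the seed. The helper names (wls_slope, draw_sample, monte_carlo) are hypothetical. If the argument holds, both estimators should center on the true slope of 2, and the weighted slopes should vary more from sample to sample.

```python
import random
import statistics


def wls_slope(xs, ys, ws):
    """Slope b from a weighted least-squares fit of y = a + b*x.

    With all weights equal to 1 this is ordinary (unweighted) OLS.
    """
    total = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / total
    ybar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return sxy / sxx


def draw_sample(rng, n):
    """Draw n cases where selection depends only on the regressor x.

    Assumed design: cases with x < 0.5 are kept with probability 0.1,
    all others with probability 1.0. The weight is 1 / probability.
    """
    xs, ys, ws = [], [], []
    while len(xs) < n:
        x = rng.random()
        p = 0.1 if x < 0.5 else 1.0
        if rng.random() < p:
            xs.append(x)
            ys.append(1.0 + 2.0 * x + rng.gauss(0.0, 1.0))  # true slope is 2
            ws.append(1.0 / p)
    return xs, ys, ws


def monte_carlo(reps=400, n=200, seed=12345):
    """Return the unweighted and weighted slope estimates over many samples."""
    rng = random.Random(seed)
    unweighted, weighted = [], []
    for _ in range(reps):
        xs, ys, ws = draw_sample(rng, n)
        unweighted.append(wls_slope(xs, ys, [1.0] * n))
        weighted.append(wls_slope(xs, ys, ws))
    return unweighted, weighted


if __name__ == "__main__":
    u, w = monte_carlo()
    print("mean slope, unweighted:", statistics.mean(u))
    print("mean slope, weighted:  ", statistics.mean(w))
    print("variance,   unweighted:", statistics.variance(u))
    print("variance,   weighted:  ", statistics.variance(w))
```

The comparison of the two variances is the point of the exercise; the particular ratio you get depends entirely on the assumed selection rule.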

Where sampling weights are a function of the dependent variable (DV), and thus of the error term (oversampling low-income groups and using income as the DV would be an example), unweighted estimates will be biased and inconsistent. The authors recommend trying a couple of things before you resort to weighting. But in some cases weighting will be appropriate, and you should then use the White heteroskedasticity-consistent estimator for the standard errors. In this case, weighting will produce consistent estimates of the true regression slopes, but it will also induce heteroskedasticity in the error terms, which is why you use White's estimator.
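
The same kind of sketch can illustrate the second case. Again this is plain Python rather than Stata, and every specific is an assumption of mine: the model y = 1 + 2x + e, the rule that cases with y >= 1 are kept with probability 0.1 while cases with y < 1 are always kept, the population size, and the seed. The function names are hypothetical. The standard error computed here is the basic White sandwich form (often called HC0, with no small-sample correction) for the slope of a weighted simple regression; it is meant to show the idea, not to reproduce what any particular package reports.

```python
import math
import random


def wls_fit(xs, ys, ws):
    """Weighted fit of y = a + b*x; returns (slope, White HC0 standard error).

    The slope is a linear combination of the y values with coefficients
    c_i = w_i * (x_i - xbar) / sxx, so the sandwich variance of the slope
    is the sum of c_i**2 * residual_i**2.
    """
    total = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / total
    ybar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    a = ybar - b * xbar
    meat = sum((w * (x - xbar) * (y - a - b * x)) ** 2
               for w, x, y in zip(ws, xs, ys))
    return b, math.sqrt(meat) / sxx


def sample_on_outcome(n_pop=40000, seed=2003):
    """Sample from a simulated population with selection depending on y.

    Assumed design: low outcomes (y < 1) are always kept, high outcomes
    are kept with probability 0.1. The weight is 1 / probability.
    """
    rng = random.Random(seed)
    xs, ys, ws = [], [], []
    for _ in range(n_pop):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true slope is 2
        p = 1.0 if y < 1.0 else 0.1
        if rng.random() < p:
            xs.append(x)
            ys.append(y)
            ws.append(1.0 / p)
    return xs, ys, ws


if __name__ == "__main__":
    xs, ys, ws = sample_on_outcome()
    b_u, se_u = wls_fit(xs, ys, [1.0] * len(xs))
    b_w, se_w = wls_fit(xs, ys, ws)
    print("unweighted slope:", b_u, "White SE:", se_u)
    print("weighted slope:  ", b_w, "White SE:", se_w)
```

Under these assumptions the unweighted slope should come out noticeably below 2, while the weighted slope should land close to it.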

There is other good stuff in the article too. For example, as was mentioned on the list before, they suggest comparing weighted and unweighted results as a check on model specification. Unless I missed it, they don't mention that standardized coefficients and R-squared will be wrong if you don't weight, so be aware of that if you like to use such things.
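
To illustrate why comparing weighted and unweighted results can flag a specification problem, here is one more hedged sketch in plain Python. The setup is entirely assumed: the true relation is quadratic (y = x^2 + e) but a straight line is fitted, and cases with x < 0.5 are sampled at one tenth the rate of the others. The names are hypothetical. Because the weights depend only on x, a correctly specified model would give similar weighted and unweighted slopes; with the misspecified line, the two fits approximate the curve over different mixes of x values and should disagree.

```python
import random


def wls_slope(xs, ys, ws):
    """Slope b from a weighted least-squares fit of y = a + b*x."""
    total = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / total
    ybar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return sxy / sxx


def slopes_under_misspecification(n_pop=60000, seed=1994):
    """Fit a straight line to quadratic data, with and without weights.

    Assumed design: x is uniform on (0, 1); cases with x < 0.5 are kept
    with probability 0.1, others with probability 1.0.
    Returns (unweighted slope, weighted slope).
    """
    rng = random.Random(seed)
    xs, ys, ws = [], [], []
    for _ in range(n_pop):
        x = rng.random()
        y = x * x + rng.gauss(0.0, 0.2)  # true relation is not linear
        p = 0.1 if x < 0.5 else 1.0
        if rng.random() < p:
            xs.append(x)
            ys.append(y)
            ws.append(1.0 / p)
    unweighted = wls_slope(xs, ys, [1.0] * len(xs))
    weighted = wls_slope(xs, ys, ws)
    return unweighted, weighted


if __name__ == "__main__":
    b_u, b_w = slopes_under_misspecification()
    print("unweighted slope:", b_u)
    print("weighted slope:  ", b_w)
    print("difference:      ", b_u - b_w)
```

A gap between the two slopes in a sketch like this does not say what is wrong with the model, only that the linear specification is not describing the same relationship in the sample and in the population.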

Richard Williams, Associate Professor
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu