Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Adjusting Post-Stratification weights for Sample Truncation

From   Steve Samuels <>
Subject   Re: st: Adjusting Post-Stratification weights for Sample Truncation
Date   Sun, 12 May 2013 11:10:53 -0400


• You misunderstand the goal of reweighting. The goal is not to
reweight to the original population but to the part of the population
that meets your selection criteria. It's unlikely enough information
is available to do this.

• Any solution must start with multiple imputation. Excluding over 22%
(400/1800) of your analysis sample because of missing data risks serious
bias. With imputation, the analysis data sets could cover about 90% of
your total data (1800/2000).
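
The fractions above follow from the counts in the question quoted below. A minimal Python sketch of that arithmetic, assuming the 200 selection exclusions are removed first and the 400 incomplete cases come out of the remaining 1800 (variable names are illustrative only):

```python
# Counts taken from the question quoted at the end of this message.
full_sample = 2000       # all respondents
fails_selection = 200    # excluded by the analyst's selection criteria
missing_data = 400       # eligible respondents with missing values

eligible = full_sample - fails_selection    # cases that meet the criteria
complete_cases = eligible - missing_data    # usable without imputation

share_lost_to_missing = missing_data / eligible       # 400/1800
share_kept_with_imputation = eligible / full_sample   # 1800/2000

print(f"{share_lost_to_missing:.1%} of eligible cases have missing data")
print(f"{share_kept_with_imputation:.0%} of the full sample is usable after imputation")
```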

• With such a large percentage, differences in demographic variables
between the included and full data sets will be small. The exceptions
will be variables used in your selection criteria, e.g. age
restrictions. You can test differences by comparing included and
excluded observations.
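
One way to make that comparison concrete is to compute a weighted mean of each demographic variable separately for included and excluded observations. The Python sketch below does this for a single 0/1 variable; the records, the variable, and the weights are invented for illustration and are not from the survey discussed here.

```python
def weighted_mean(values, weights):
    """Weighted mean of a numeric (or 0/1) variable."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical records: (female indicator, weight, included in the analysis?)
records = [
    (1, 1.2, True), (0, 0.8, True), (1, 1.0, True), (0, 1.5, True),
    (1, 0.9, False), (0, 1.1, False), (0, 1.3, False), (1, 0.7, False),
]

def group_mean(flag):
    """Weighted share of women among included (True) or excluded (False) rows."""
    rows = [(x, w) for x, w, included in records if included == flag]
    return weighted_mean([x for x, _ in rows], [w for _, w in rows])

included_mean = group_mean(True)
excluded_mean = group_mean(False)
difference = included_mean - excluded_mean

print(f"included: {included_mean:.3f}  excluded: {excluded_mean:.3f}  "
      f"difference: {difference:.3f}")
```

This only describes the gap between the two groups. A formal test would also need standard errors that respect the survey weights.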

• Possible weighting strategies:

1. Use the post-stratified weights as is. This strategy risks bias for
analysis of small subgroups (a general problem for post-stratified
weights).

2. Weight by the base weight, and include demographic variables as
predictors.

3. Do not weight; include demographic variables as predictors.

I'd probably use 1 and 3 and report both.
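
To show what reporting strategies 1 and 3 side by side could look like, here is a rough Python sketch that fits the same outcome twice: once with post-stratified weights and no demographic controls, and once unweighted with a demographic predictor. The data, the variable names, and the small `wls` and `solve` helpers are all hypothetical. The sketch gives point estimates only; design-based standard errors are not covered.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def wls(rows, y, w):
    """Weighted least squares: solve (X'WX) b = X'Wy.

    Each row of `rows` must already contain the constant term.
    """
    k = len(rows[0])
    xtwx = [[sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w)) for i in range(k)]
    return solve(xtwx, xtwy)

# Hypothetical data: predictor x, demographic indicator d, outcome y.
x = [1, 2, 3, 4, 5, 6]
d = [0, 1, 0, 1, 1, 0]
y = [1 + 2 * xi + 3 * di for xi, di in zip(x, d)]
ps_weight = [1.5, 0.5, 1.0, 2.0, 0.8, 1.2]  # hypothetical post-stratified weights

# Strategy 1: post-stratified weights, no demographic controls.
strategy1 = wls([[1.0, xi] for xi in x], y, ps_weight)

# Strategy 3: no weights, demographic variable as a predictor.
strategy3 = wls([[1.0, xi, di] for xi, di in zip(x, d)], y, [1.0] * len(y))

print("strategy 1 (constant, x):   ", strategy1)
print("strategy 3 (constant, x, d):", strategy3)
```

If the two sets of estimates tell the same story, the choice of strategy matters less. If they diverge, that is worth reporting.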

• To avoid misunderstandings, learn standard terminology.
Post-stratification weights are not "probability weights". Probability
weights are the inverse of the design selection probabilities. See pp.
347-354 of Groves et al. (2009), a book I highly recommend.
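
The distinction can be shown with a small Python sketch. It builds probability (base) weights as inverse selection probabilities, then rescales them within cells so the weighted totals match known population counts. All numbers and cell labels are hypothetical. For simplicity it post-stratifies on one variable, whereas the raking described in the question adjusts several margins iteratively.

```python
# Hypothetical respondents: (post-stratum, design selection probability)
sample = [("18-44", 0.01), ("18-44", 0.02), ("45+", 0.01), ("45+", 0.005)]

# Hypothetical known population totals for each post-stratum
population = {"18-44": 300.0, "45+": 450.0}

# Probability (base) weight: inverse of the design selection probability
base = [1.0 / p for _, p in sample]

# Sum of base weights within each post-stratum
cell_totals = {}
for (cell, _), w in zip(sample, base):
    cell_totals[cell] = cell_totals.get(cell, 0.0) + w

# Post-stratification weight: base weight scaled so that each cell's
# weighted total equals the known population total
post = [w * population[cell] / cell_totals[cell]
        for (cell, _), w in zip(sample, base)]

for (cell, p), b, q in zip(sample, base, post):
    print(f"{cell:6s} p={p:<6} base={b:7.1f} post-stratified={q:7.1f}")
```

The base weights depend only on the sampling design. The post-stratified weights also depend on the realized sample and on external population totals, which is why the two should not share a name.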



Groves, Robert M., Floyd J. Fowler, Mick P. Couper, James M. Lepkowski,
Eleanor Singer, and Roger Tourangeau. 2009. Survey methodology. Hoboken,
N.J.: Wiley.

> On May 9, 2013, at 9:10 AM, Afif Naeem wrote:
> Hello everyone,
> I sent out this message a couple of days ago, but I am not sure it actually went through. I am sending it again hoping someone is able to help me out here.
> I am working with a data set comprising about 2000 observations on Florida residents. The data set contains information on probability weights to be used with the full sample. The probability weights are calculated through a raking procedure so the marginal distributions of the full final sample are optimally fitted to those of the population. Demographic information on eight variables is used as the benchmark for the raking procedure. Base weights controlling for survey design are used as the starting weights in the raking procedure.
> I am pretty sure no clustering was used for sampling the individuals for the survey. And I am not worried about stratification of the sample either. The base weight primarily corrects for several sources of deviation from an equal probability of selection design due to changes in the recruitment method for the panel maintained by the firm conducting the survey over the years.
> For my regression analysis, I lose about 600 observations: 400 of them due to missing values of various variables used in the analysis, while I lose the remaining 200 observations because I impose a selection criterion for which individuals are to be included in the regression analysis.
> The original post-stratification weights are to be used with the full sample of 2000 observations. My main concern is how to adjust these post-stratification weights so they can be used with my final sample of 1400 observations. The distributions of half of the demographic variables used in the raking procedure change between the full sample of 2000 observations and my final sample of 1400 observations. I do not have information on some of the demographic variables used in the raking procedure in my data set either. And contacting the firm responsible for conducting the survey is not very feasible, as I received the data set through another academic institution which holds the rights to the data. I am wondering if someone can suggest any solution as to how the weights can be adjusted so that they can be used with my final truncated sample of 1400 observations.
> Thanks a lot!
> Afif