# st: Poststratification weighting, subpop, and missing values

Subject: st: Poststratification weighting, subpop, and missing values

Date: Wed, 26 Sep 2012 09:25:55 -0400

Hi everyone,

I'm analyzing the results of a survey and have run into some strange results when using poststratification weights together with the subpop() option. An example is shown below, where we're simply totaling 2011 sales. The flag variable indicates the subpopulation we're interested in.

When the subpopulation is limited by flag alone, the command computes the total over 2,624 PSUs. When we further limit it to observations with flag equal to one and nonmissing total sales, it computes the total over 2,639 PSUs. In the second command, Stata seems to be including the 15 missing values in its calculations. The total for the more limited subpopulation is also lower, which is not what we expected from removing the missing values, given how that should affect the background calculation of the adjusted weights.

Could someone shed some light on why this is happening?

Thank you,
Ricky Ubee

```
. svyset uniqueID [pweight=weight_prop], strata(strata2) singleunit(scaled) poststrata(type2) postweight(postwt4) fpc(N)

      pweight: weight_prop
          VCE: linearized
   Poststrata: type2
   Postweight: postwt4
  Single unit: scaled
     Strata 1: strata2
         SU 1: uniqueID
        FPC 1: N

. svy, subpop(if flag==1): total TOT_SALES_11
(running total on estimation sample)

Survey: Total estimation

Number of strata =      26          Number of obs    =    2624
Number of PSUs   =    2624          Population size  =   23794
N. of poststrata =      16          Subpop. no. obs  =     652
                                    Subpop. size     = 5245.94
                                    Design df        =    2598

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
TOT_SALES_11 |   2.20e+12   2.77e+11      1.65e+12    2.74e+12
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.

. svy, subpop(if flag==1 & TOT_SALES_11~=.): total TOT_SALES_11
(running total on estimation sample)

Survey: Total estimation

Number of strata =      26          Number of obs    =    2639
Number of PSUs   =    2639          Population size  =   23794
N. of poststrata =      16          Subpop. no. obs  =     652
                                    Subpop. size     = 5222.38
                                    Design df        =    2613

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
TOT_SALES_11 |   2.18e+12   2.76e+11      1.64e+12    2.72e+12
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.

. count if flag==1 & TOT_SALES_11==.
15

```
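
One way to reason about the direction of the difference is a toy calculation. The sketch below assumes that poststratification rescales each sampling weight by the known poststratum count divided by the sum of the weights in that poststratum over the estimation sample; under that assumption, keeping more observations in the sample shrinks every adjusted weight. The data, the single poststratum, the count `M`, and the variable names (`w`, `y`, `w_all`, `w_drop`, `wy_all`, `wy_drop`) are all hypothetical and not taken from the survey above.

```stata
* Toy illustration with made-up numbers (hypothetical; not the survey data).
* One poststratum, four sampled units, one unit with a missing outcome.
clear
input id w y
1 10 100
2 10 200
3 10 .
4 10 400
end

* Hypothetical known population count for the single poststratum
local M = 100

* Case 1: all four observations stay in the estimation sample,
* so the adjustment divides by the sum of all four weights.
quietly summarize w
generate double w_all = w * `M' / r(sum)

* Case 2: the observation with missing y is dropped first,
* so the adjustment divides by the sum of only three weights.
quietly summarize w if !missing(y)
generate double w_drop = w * `M' / r(sum) if !missing(y)

* Weighted totals of y under each set of adjusted weights
generate double wy_all  = w_all  * y
generate double wy_drop = w_drop * y

quietly summarize wy_all
display "total with 4 obs in the sample: " r(sum)
quietly summarize wy_drop
display "total with 3 obs in the sample: " r(sum)
```

With these made-up numbers the first total is 17,500 and the second is about 23,333: the run that keeps the extra observation in the sample produces the smaller total, which is the same direction as the 2,639-observation run above. Whether this is what happens in the real data could be checked by saving `e(sample)` after each `svy` command and tabulating it against `missing(TOT_SALES_11)`.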