# Re: st: Weights in survey design

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Weights in survey design
Date: Sun, 18 Mar 2007 22:27:57 -0400

population, the survey design and what were the primary and (possibly) second stage and later sampling units?
In a household (HH) or telephone survey, the PSUs would ordinarily be some kind of geographic area, so the sampling strata for the PSUs cannot be sex, which is what your setup implies.

Other questions: were the weights post-stratified or raked in any way to reflect the population totals? How did you "reset" the weights?

Steven
On Mar 18, 2007, at 6:46 PM, Jason Ferris wrote:

I have a large dataset with weights calculated as PPS based on household
size, stratified by sex. Respondents are aged 16-64.

I am interested in looking at data only from those aged 16-24. I can
use the subpop option, "subpop(if age>=16 & age<=24)", for all the
commands. But I am wondering if I can drop all other cases (keep if
age>=16 & age<=24) and then 'reset' my weights based only on those aged
16-24.
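The subpop approach described above can be sketched as follows. This is a minimal illustration, not the poster's actual code: it assumes the variable names `pps`, `sex`, and `age` used in this thread, and the indicator `young` is a hypothetical name introduced here.

```stata
* Sketch only: assumes variables pps, sex, and age as named in this thread.
* Declare the design as in the svyset summary: pweights, strata = sex,
* individual observations as the sampling units.
svyset _n [pweight = pps], strata(sex)

* Hypothetical 0/1 indicator for the subpopulation of interest.
generate byte young = inrange(age, 16, 24)

* Estimate within the subpopulation while keeping every observation
* in the dataset, so the variance calculation still uses the full design.
svy, subpop(young): tabulate sex
```

Using an indicator variable rather than `keep if` leaves the other age groups in memory, which is what lets `svy` account for the full sample when computing standard errors.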

In the original form (with all data) I have the following summary data
(note the survey design is quite a simple one):

    . svyset

          pweight: pps
              VCE: linearized
         Strata 1: sex
             SU 1: <observations>
            FPC 1: <zero>

    . svy: tab sex
    (running tabulate on estimation sample)

    Number of strata =     2        Number of obs   =  8664
    Number of PSUs   =  8664        Population size =  8664
                                    Design df       =  8662

    -----------------------
          sex | proportions
    ----------+------------
       female |       .5046
         male |       .4954
              |
        Total |           1
    -----------------------
      Key: proportions = cell proportions

If I select the subgroup (age 16-24):

    . svy,subpop(if age<=24): tab sex
    (running tabulate on estimation sample)

    Number of strata =     2        Number of obs      =       8664
    Number of PSUs   =  8664        Population size    =       8664
                                    Subpop. no. of obs =        999
                                    Subpop. size       =  1438.7586
                                    Design df          =       8662

    -----------------------
          sex | proportions
    ----------+------------
       female |       .4599
         male |       .5401
              |
        Total |           1
    -----------------------
      Key: proportions = cell proportions

When I reset my weights with data only representing those 16-24 years of
age (i.e., as if this was the way I originally designed my study) I get the
following results:

    . svy: tab sex
    (running tabulate on estimation sample)

    Number of strata =     2        Number of obs   =   999
    Number of PSUs   =   999        Population size =   999
                                    Design df       =   997

    -----------------------
          sex | proportions
    ----------+------------
       female |       .4655
         male |       .5345
              |
        Total |           1
    -----------------------
      Key: proportions = cell proportions

As can be seen, the proportions now differ between using subpop and
resetting my weights. Is this a problem?
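One way to read the discrepancy is through the weighted proportion itself. The notation below is introduced here for illustration and is not from the thread: S is the set of respondents aged 16-24, w_i is the weight of respondent i, and y_i is 1 if respondent i is female and 0 otherwise.

```latex
\hat{p}_{\text{female}}
  = \frac{\sum_{i \in S} w_i \, y_i}{\sum_{i \in S} w_i},
\qquad
\frac{\sum_{i \in S} (c \, w_i) \, y_i}{\sum_{i \in S} c \, w_i}
  = \hat{p}_{\text{female}}
\quad \text{for any constant } c > 0 .
```

Multiplying every weight in S by the same constant therefore leaves the estimated proportion unchanged. Since the two runs give .4599 and .4655, the "reset" presumably did more than rescale the weights uniformly: it must have changed the weights of some 16-24 year olds relative to others.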

Jason

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/