Statalist The Stata Listserver



Re: st: Weights in survey design


From   Steven Samuels <ssamuels@albany.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Weights in survey design
Date   Sun, 18 Mar 2007 22:27:57 -0400

population, the survey design, and what were the primary and (possibly) second-stage and later sampling units?
In a household (HH) or telephone survey, the PSUs would ordinarily be geographic areas of some kind, and the sampling strata for PSUs cannot be sex, as your setup implies.

Other questions: were the weights post-stratified or raked in any way to reflect the population totals? How did you "reset" the weights?

Steven
On Mar 18, 2007, at 6:46 PM, Jason Ferris wrote:


I have a large dataset with weights calculated as PPS (probability
proportional to size) based on household size, stratified by sex.
Respondents are aged 16-64.



I am interested in looking at data only from those aged 16-24. I can
use the subpop() option, "subpop(if age>=16 & age<=24)", for all the
commands. But I am wondering whether I can instead drop all other cases
(keep if age>=16 & age<=24) and then 'reset' my weights based only on
those aged 16-24.
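
The two approaches described above can be sketched roughly as follows. This is an illustration only: the data are simulated toy values, not the survey in this post; the variable names age, sex, and pps follow the post; and the indicator name young is made up for the example.

```stata
* Toy data only: simulated values standing in for the real survey.
clear
set seed 12345
set obs 1000
generate byte sex = runiform() < 0.5            // 0/1 stratum indicator
generate age = 16 + floor(49 * runiform())      // integer ages 16-64
generate double pps = 0.5 + runiform()          // hypothetical weights

* Declare the design as in the post: observations as PSUs, strata = sex.
svyset _n [pweight = pps], strata(sex)

* Approach 1: keep every observation and restrict estimation with subpop().
* All observations still contribute to the variance calculation.
generate byte young = inrange(age, 16, 24)      // hypothetical indicator
svy, subpop(young): tabulate sex, se

* Approach 2: drop the other cases and treat the remainder as the sample.
* preserve/restore leaves the full data in memory afterwards.
preserve
keep if inrange(age, 16, 24)
svy: tabulate sex, se
restore
```

With the weights left unchanged, both approaches give the same weighted proportions; what can differ is the standard errors and the design degrees of freedom, because the second approach discards the out-of-subpopulation cases from the design.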



In the original form (with all data) I have the following summary data
(note the survey design is quite a simple one):

. svyset

      pweight: pps
          VCE: linearized
     Strata 1: sex
         SU 1: <observations>
        FPC 1: <zero>



. svy: tab sex
(running tabulate on estimation sample)

Number of strata =    2        Number of obs   = 8664
Number of PSUs   = 8664        Population size = 8664
                               Design df       = 8662

-----------------------
      sex | proportions
----------+------------
   female |       .5046
     male |       .4954
          |
    Total |           1
-----------------------
  Key: proportions = cell proportions



If I select the subgroup (age 16-24):

. svy,subpop(if age<=24): tab sex
(running tabulate on estimation sample)

Number of strata =    2        Number of obs      =      8664
Number of PSUs   = 8664        Population size    =      8664
                               Subpop. no. of obs =       999
                               Subpop. size       = 1438.7586
                               Design df          =      8662

-----------------------
      sex | proportions
----------+------------
   female |       .4599
     male |       .5401
          |
    Total |           1
-----------------------
  Key: proportions = cell proportions





When I reset my weights with data only representing those 16-24 years of
age (i.e., as if this were the way I originally designed my study) I get
the following results:



. svy: tab sex
(running tabulate on estimation sample)

Number of strata =   2        Number of obs   = 999
Number of PSUs   = 999        Population size = 999
                              Design df       = 997

-----------------------
      sex | proportions
----------+------------
   female |       .4655
     male |       .5345
          |
    Total |           1
-----------------------
  Key: proportions = cell proportions



As can be seen, there is now a difference in the proportions between
using subpop() and resetting my weights. Is this a problem?
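
Whether "resetting" changes the point estimates depends on how the weights are rescaled. Dividing every weight by one common constant cannot change a weighted proportion, so a shift such as .4599 to .4655 suggests the reset did something else. The post does not say what was done; the sketch below simply contrasts a single-constant rescaling with a per-stratum rescaling, which is one hypothetical way such a shift could arise. The data are simulated, and the names w_const, w_strat, and strat_mean are invented for the example.

```stata
* Toy data only: simulated values, not the survey in this post.
clear
set seed 2007
set obs 1000
generate byte sex = runiform() < 0.5          // 0/1 stratum indicator
generate double pps = 0.5 + runiform()        // hypothetical weights
replace pps = pps * 1.5 if sex == 1           // make weights differ by stratum

* (a) One constant for everyone: weights now sum to the number of observations.
summarize pps, meanonly
generate double w_const = pps / r(mean)

* (b) A separate constant per stratum: each stratum's weights sum to its own n.
bysort sex: egen double strat_mean = mean(pps)
generate double w_strat = pps / strat_mean

* Weighted proportions under the original weights and under each rescaling.
foreach w in pps w_const w_strat {
    svyset _n [pweight = `w'], strata(sex)
    svy: tabulate sex
}
```

In this sketch pps and w_const give identical proportions, while w_strat reproduces the unweighted sample proportions, since each stratum's rescaled weights add up to its own number of observations.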



Jason


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*


