Statalist The Stata Listserver



st: Survey analysis: Why does PSU affect proportions?


From   Craig Anderson <[email protected]>
To   [email protected]
Subject   st: Survey analysis: Why does PSU affect proportions?
Date   Mon, 10 Jul 2006 17:13:22 -0700

I am using a published dataset (from CDC) and want to verify that I can reproduce the published results before beginning my analysis. I found that including the clustering and stratification variables changed the proportions, not only the standard errors. In the output that follows, I first svyset only the weights; this gives proportions that agree with the published estimates (produced using SUDAAN). Next I added PSU and STRATA, and got different proportions but confidence intervals that agree with the published results (with minor variations in other runs). I would expect stratification and clustering by PSU to affect only the variance of the proportions, not the proportions themselves.

The dataset documentation says that a three-stage cluster sample was used, but only WEIGHT, PSU, and STRATA are included in the public dataset. The FAQ (Survey-data analysis, first item) says that this sort of svyset statement will produce appropriate variance estimates for multistage designs. Am I doing something wrong?

Craig L. Anderson, PhD (epidemiology)
Department of Emergency Medicine
University of California, Irvine



Stata output:

. clear

. set memory 16M
(16384k)

.
. use yrbs05

. svyset[pw=weight]

pweight: weight
VCE: linearized
Strata 1: <one>
SU 1: <observations>
FPC 1: <zero>

. svy: tab q2 qn9, row se ci
(running tabulate on estimation sample)

Number of strata   =         1          Number of obs      =     13837
Number of PSUs     =     13837          Population size    = 13852.064
                                        Design df          =     13836

-------------------------------------------------------
what is   |     never/rarely wore seat belt
your sex  |             1              2          Total
----------+--------------------------------------------
   female |         .0778          .9222              1
          |        (.004)         (.004)
          | [.0703,.0861]  [.9139,.9297]
          |
     male |          .125           .875              1
          |       (.0051)        (.0051)
          | [.1153,.1354]  [.8646,.8847]
          |
    Total |         .1017          .8983              1
          |       (.0033)        (.0033)
          | [.0954,.1083]  [.8917,.9046]
-------------------------------------------------------
  Key:  row proportions
        (linearized standard errors of row proportions)
        [95% confidence intervals for row proportions]

  Pearson:
    Uncorrected   chi2(1)         =   84.4318
    Design-based  F(1, 13836)     =   52.6547     P = 0.0000

. svyset psu [pw=weight], strata(stratum)

pweight: weight
VCE: linearized
Strata 1: stratum
SU 1: psu
FPC 1: <zero>




. svy: tab q2 qn9, row se ci
(running tabulate on estimation sample)

Number of strata   =         4          Number of obs      =     13704
Number of PSUs     =        51          Population size    = 13730.762
                                        Design df          =        47

-------------------------------------------------------
what is   |     never/rarely wore seat belt
your sex  |             1              2          Total
----------+--------------------------------------------
   female |         .0778          .9222              1
          |       (.0076)        (.0076)
          | [.0638,.0944]  [.9056,.9362]
          |
     male |         .1234          .8766              1
          |       (.0109)        (.0109)
          |  [.103,.1471]   [.8529,.897]
          |
    Total |         .1006          .8994              1
          |       (.0087)        (.0087)
          | [.0844,.1195]  [.8805,.9156]
-------------------------------------------------------
  Key:  row proportions
        (linearized standard errors of row proportions)
        [95% confidence intervals for row proportions]

  Pearson:
    Uncorrected   chi2(1)         =   78.7114
    Design-based  F(1, 47)        =   49.8319     P = 0.0000
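
The two runs above do not use the same observations: the weights-only run reports 13837 observations, while the run with PSU and strata reports 13704. One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that some observations have missing values of psu or stratum and drop out of the estimation sample once those variables are svyset. The sketch below checks this. It uses only the dataset and variable names that appear in the log; design_ok is a hypothetical helper variable introduced for illustration.

```stata
* Sketch only: assumes the dataset and variable names shown in the log
* (yrbs05, weight, psu, stratum, q2, qn9).
use yrbs05, clear

* How many observations lack a PSU or stratum identifier?
count if missing(psu, stratum)

* Hypothetical helper: 1 if both design variables are present
generate byte design_ok = !missing(psu, stratum)

* Weights-only design, restricted to observations with complete
* design information, to compare the proportions with the second run
svyset [pweight=weight]
svy: tabulate q2 qn9 if design_ok, row se ci
```

If the count is nonzero and the restricted weights-only table reproduces the .1234 and .1006 from the second run, the change in proportions would come from the change in sample rather than from the clustering and stratification themselves.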
