help svyset dialog: svyset
-------------------------------------------------------------------------------
Title
[SVY] svyset -- Declare survey design for dataset
Syntax
Single-stage syntax
svyset [psu] [weight] [, design_options options]
Multiple-stage syntax
svyset psu [weight] [, design_options] [|| ssu , design_options] ...
[options]
Clear the current settings
svyset, clear
Report the current settings
svyset
design_options description
-------------------------------------------------------------------------
Main
strata(varname) variable identifying strata
fpc(varname) finite population correction
-------------------------------------------------------------------------
options description
-------------------------------------------------------------------------
Weights
brrweight(varlist) balanced repeated replicate (BRR) weights
fay(#) Fay's adjustment
jkrweight(varlist, ...) jackknife replicate weights
SE
vce(linearized) Taylor linearized variance estimation
vce(brr) balanced repeated replication (BRR) variance
estimation
vce(jackknife) jackknife variance estimation
mse use the MSE formula with vce(brr) or
vce(jackknife)
singleunit(method) strata with a single sampling unit; method
may be missing, certainty, scaled, or
centered
Poststratification
poststrata(varname) variable identifying poststrata
postweight(varname) poststratum population sizes
+ clear clear all settings from the data
+ noclear change some of the settings without clearing
the others
+ clear(opnames) clear the specified settings without clearing
all others; opnames may be one or more of
weight, vce, mse, brrweight, jkrweight, or
poststrata
-------------------------------------------------------------------------
+ clear, noclear, and clear() are not shown in the dialog box.
pweights and iweights are allowed; see weights.
The full specification for jkrweight() is
jkrweight(varlist [, stratum(# [# ...]) fpc(# [# ...]) multiplier(#
[# ...]) reset ])
Menu
Statistics > Survey data analysis > Setup and utilities > Declare survey
design for dataset
Description
svyset declares the data to be complex survey data, designates variables
that contain information about the survey design, and specifies the
default method for variance estimation. You must svyset your data before
using any svy command; see [SVY] svy estimation.
psu is _n or the name of a variable (numeric or string) that contains
identifiers for the primary sampling units (clusters). Use _n to
indicate that individuals (instead of clusters) were randomly sampled if
the design does not involve clustered sampling. In the single-stage
syntax, psu is optional and defaults to _n.
ssu is _n or the name of a variable (numeric or string) that contains
identifiers for sampling units (clusters) at the subsequent stages of the
survey design. Use _n to indicate that individuals were randomly sampled
within the last sampling stage.
Settings made by svyset are saved with the dataset. If a dataset is saved
after it has been svyset, the survey design does not have to be declared
again the next time the dataset is used.
The current settings are reported when svyset is called without
arguments:
. svyset
Use the clear option to remove the current settings:
. svyset, clear
Options
+------+
----+ Main +-------------------------------------------------------------
strata(varname) specifies the name of a variable (numeric or string) that
contains stratum identifiers.
fpc(varname) requests a finite population correction for the variance
estimates. If varname has values less than or equal to 1, it is
interpreted as a stratum sampling rate f_h = n_h/N_h, where n_h =
number of units sampled from stratum h and N_h = total number of
units in the population belonging to stratum h. If varname has
values greater than or equal to n_h, it is interpreted as containing
N_h. It is an error for varname to have values between 1 and n_h or
to have a mixture of sampling rates and stratum sizes.
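As a hedged illustration of the two interpretations, the sketch below declares the FPC first from stratum sizes and then from sampling rates. It assumes that the fpc example dataset used in the Examples section contains the variables nh (number of units sampled per stratum) and Nh (stratum size); the variable name rate is introduced here only for illustration. Both declarations should lead to the same standard errors.

```stata
. webuse fpc
. * FPC given as stratum population sizes N_h
. svyset psuid [pweight=weight], strata(stratid) fpc(Nh)
. svy: mean x
. * FPC given as sampling rates f_h = n_h/N_h (all values <= 1)
. generate double rate = nh/Nh
. svyset psuid [pweight=weight], strata(stratid) fpc(rate)
. svy: mean x
```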
+---------+
----+ Weights +----------------------------------------------------------
brrweight(varlist) specifies the replicate-weight variables to be used
with vce(brr).
fay(#) specifies Fay's adjustment. The value specified in fay(#) is used
to adjust the BRR weights and is present in the BRR variance
formulas.
The sampling weight of the selected PSUs for a given replicate is
multiplied by 2-#, and the sampling weight of the unselected PSUs
is multiplied by #. When brrweight(varlist) is specified, the
replicate-weight variables in varlist are assumed to be adjusted
using #.
fay(0) is the default and is equivalent to the original BRR method.
fay(1) is not allowed because this results in unadjusted weights.
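The multipliers described above can be sketched on toy data. This is only an illustration of the arithmetic, not how svyset or svy brr builds replicate weights internally. The variables selected, pw, and rw and the value 0.5 are assumptions made up for the example.

```stata
. clear
. set obs 4
. * hypothetical indicator: 1 if the PSU is selected for this replicate
. generate byte selected = mod(_n, 2)
. * hypothetical sampling weight
. generate double pw = 10
. local fay = 0.5
. * selected PSUs get (2 - #) * weight; unselected PSUs get # * weight
. generate double rw = cond(selected, (2 - `fay') * pw, `fay' * pw)
. list selected pw rw
```

With # = 0.5, the selected PSUs receive a replicate weight of 15 and the unselected PSUs a weight of 5. With # = 1, both groups would keep their original weight of 10, which is why fay(1) is not allowed.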
jkrweight(varlist, ...) specifies the replicate-weight variables to be
used with vce(jackknife).
The following options set characteristics on the jackknife
replicate-weight variables. If one value is specified, all the
specified jackknife replicate-weight variables will be supplied with
the same characteristic. If multiple values are specified, each
replicate-weight variable will be supplied with the corresponding
value according to the order specified. These options are not shown
in the dialog box.
stratum(# [# ...]) specifies an identifier for the stratum in which
the sampling weights have been adjusted.
fpc(# [# ...]) specifies the FPC value to be added as a
characteristic of the jackknife replicate-weight variables. The
values set by this suboption have the same interpretation as the
fpc(varname) option.
multiplier(# [# ...]) specifies the value of a jackknife multiplier
to be added as a characteristic of the jackknife replicate-weight
variables.
reset indicates that the characteristics for the replicate-weight
variables may be overwritten or reset to the default, if they
exist.
+----+
----+ SE +---------------------------------------------------------------
vce(vcetype) specifies the default method for variance estimation; see
[SVY] variance estimation.
vce(linearized) sets the default to Taylor linearization.
vce(brr) sets the default to balanced repeated replication; also see
[SVY] svy brr.
vce(jackknife) sets the default to the jackknife; see [SVY] svy
jackknife.
mse specifies that the MSE formula be used when vce(brr) or
vce(jackknife) is specified. This option requires vce(brr) or
vce(jackknife).
singleunit(method) specifies how to handle strata with one sampling unit.
singleunit(missing) results in missing values for the standard errors
and is the default.
singleunit(certainty) causes strata with single sampling units to be
treated as certainty units. Certainty units contribute nothing
to the standard error.
singleunit(scaled) results in a scaled version of
singleunit(certainty). The scaling factor comes from using the
average of the variances from the strata with multiple sampling
units as the variance for each stratum with one sampling unit.
singleunit(centered) specifies that strata with one sampling unit are
centered at the grand mean instead of the stratum mean.
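A minimal sketch of declaring a singleunit() method follows. It reuses the stage5a design from the Examples section. That dataset is not assumed to contain single-unit strata, so the sketch only shows how the setting is declared and then reported.

```stata
. webuse stage5a
. * declare the design and choose how single-unit strata are handled
. svyset su1 [pweight=pw], strata(strata) singleunit(centered)
. * report the current settings, including the singleunit() method
. svyset
```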
+--------------------+
----+ Poststratification +-----------------------------------------------
poststrata(varname) specifies the name of the variable (numeric or
string) that contains poststratum identifiers.
postweight(varname) specifies the name of the numeric variable that
contains poststratum population totals (or sizes), i.e., the number
of elementary sampling units in the population within each
poststratum.
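The two poststratification options are specified together, as in the sketch below. It assumes the poststrata example dataset is available through webuse and contains the variables type (poststratum identifier), postwgt (poststratum population size), and totexp (an analysis variable). These names are assumptions and should be checked against the data in memory.

```stata
. webuse poststrata
. * type identifies the poststrata; postwgt holds their population sizes
. svyset, poststrata(type) postweight(postwgt)
. svy: mean totexp
```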
The following options are available with svyset but are not shown in the
dialog box:
clear clears all the settings from the data. Typing
. svyset, clear
clears the survey design characteristics from the data in memory.
Although this option may be specified with some of the other svyset
options, it is redundant because svyset automatically clears the
previous settings before setting new survey design characteristics.
noclear allows some of the options in options to be changed without
clearing all the other settings. This option is not allowed with
psu, ssu, design_options, or clear.
clear(opnames) allows some of the options in options to be cleared
without clearing all the other settings. opnames refers to an option
name and may be one or more of the following:
weight vce mse brrweight jkrweight poststrata
This option implies the noclear option.
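The difference between clear(opnames) and noclear can be sketched with the replicate-weight dataset from the Examples section. This is a hedged illustration of the syntax described above, not a recommended workflow.

```stata
. webuse stage5a_jkw
. svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife) mse
. * drop only the mse setting; all other settings are kept
. svyset, clear(mse)
. * change only the default vcetype; all other settings are kept
. svyset, vce(linearized) noclear
. * report the resulting settings
. svyset
```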
Examples
Setup
. webuse stage5a
Simple random sampling with replacement
. svyset _n
One-stage clustered design with stratification
. svyset su1 [pweight=pw], strata(strata)
Two-stage designs
. svyset su1 [pweight=pw], fpc(fpc1) || _n, fpc(fpc2)
. svyset su1 [pweight=pw], fpc(fpc1) || su2, fpc(fpc2)
. svyset su1 [pweight=pw], fpc(fpc1) || su2, fpc(fpc2) strata(strata)
Multiple-stage designs
. svyset su1 [pweight=pw], fpc(fpc1) strata(strata) || su2, fpc(fpc2) || su3, fpc(fpc3)
. svyset su1 [pweight=pw], fpc(fpc1) strata(strata) || su2, fpc(fpc2) || su3, fpc(fpc3) || _n
Finite population correction (FPC)
. webuse fpc
. list
. svyset psuid [pweight=weight], strata(stratid) fpc(Nh)
. svy: mean x
. svyset psuid [pweight=weight], strata(stratid)
. svy: mean x
Multiple-stage designs and with-replacement sampling
. webuse stage5a
. svyset su1 || _n, fpc(fpc2)
Replication weight variables
. webuse stage5a_jkw
. svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife)
. svyset [pweight=pw], jkrweight(jkw_*) vce(jackknife) mse
Setup
. copy http://www.cdc.gov/nchs/about/major/nhanes/nhanes2001-2002/bpx_b.xpt bpx_b.xpt
. copy http://www.cdc.gov/nchs/about/major/nhanes/nhanes2001-2002/demo_b.xpt demo_b.xpt
Combining datasets from multiple surveys
. fdause bpx_b.xpt
. sort seqn
. save bpx01_02
. fdause demo_b.xpt
. drop wtint?yr
. sort seqn
. merge 1:1 seqn using bpx01_02, nogenerate
. svyset sdmvpsu [pw=wtmec2yr], strata(sdmvstra)
. save bpx01_02, replace
. use bpx99_00
. drop wt?rep*
. append using bpx01_02
. drop wtmec2yr
. svyset sdmvpsu [pw=wtmec4yr], strata(sdmvstra)
. save bpx99_02
. svy jackknife: mean bpxsar
Saved results
svyset saves the following in r():
Scalars
r(stages) number of sampling stages
Macros
r(wtype) weight type
r(wexp) weight expression
r(wvar) weight variable name
r(su#) variable identifying sampling units for stage #
r(strata#) variable identifying strata for stage #
r(fpc#) FPC for stage #
r(brrweight) brrweight() variable list
r(fay) Fay's adjustment
r(jkrweight) jkrweight() variable list
r(vce) vcetype specified in vce()
r(mse) mse, if specified
r(poststrata) poststrata() variable
r(postweight) postweight() variable
r(settings) svyset arguments to reproduce the current settings
r(singleunit) singleunit() setting
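Because r(settings) holds the svyset arguments that reproduce the current settings, it can be used to save and restore a design. The sketch below is a minimal illustration; the macro name saved is an assumption introduced for the example.

```stata
. webuse stage5a
. svyset su1 [pweight=pw], strata(strata)
. * report the settings; this fills r()
. svyset
. display r(stages)
. * store the arguments before clearing
. local saved `"`r(settings)'"'
. svyset, clear
. * restore the design from the stored arguments
. svyset `saved'
```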
Also see
Manual: [SVY] svyset
Help: [SVY] survey, [SVY] svy, [SVY] svydescribe