.-
help for ^svymean^, ^svyratio^, ^svytotal^
.-

Estimation of means, totals, and ratios for sample-survey data
--------------------------------------------------------------

    ^svymean^  varlist [weight] [^if^ exp] [^in^ range] [^,^ options ]

    ^svytotal^ varlist [weight] [^if^ exp] [^in^ range] [^,^ options ]

    ^svyratio^ varname[/]varname [varname[/]varname ... ] [weight]
                      [^if^ exp] [^in^ range] [^,^ options ]

where options are

    ^str^ata^(^varname^)^  ^psu(^varname^)^  ^fpc(^varname^)^
    ^by(^varlist^)^  ^sub^pop^(^expression^)^  ^srs^subpop  ^nolab^el
    { ^com^plete | ^av^ailable }  ^float^
    ^l^evel^(^#^)^  ^ci^  ^deff^  ^deft^  ^meff^  ^meft^  ^obs^  ^size^

^svymean^, ^svyratio^, and ^svytotal^ typed without arguments redisplay
previous results. The following options can be given when redisplaying
results:

    ^svy^cmd [^,^ ^l^evel^(^#^)^ ^ci^ ^deff^ ^deft^ ^meff^ ^meft^ ^obs^ ^size^ ]

^pweight^s and ^iweight^s are allowed; see help @weights@.

Warning:  Use of ^if^ or ^in^ restrictions will not produce correct variance
          estimates for subpopulations in many cases. To compute estimates
          for subpopulations, use the ^by()^ or ^subpop()^ options.


Description
-----------

These commands produce estimates of finite-population means, ratios, and
totals for complex survey data with any or all of the following: probability
sampling weights, stratification, and clustering. Associated variance
estimates, design effects (^deff^ and ^deft^), and misspecification effects
(^meff^ and ^meft^) are also computed.

The commands can produce estimates for subpopulations using either the ^by()^
option (for multiple subpopulations defined by the ^by^ varlist) or the
^subpop()^ option (for a single subpopulation defined by an expression).

Proportions can be estimated either using ^svymean^ with 0/1 indicator
variables or with the ^svyprop^ command; see help @svyprop@.

For linear regression, logistic regression, and probit estimation, see help
@svyreg@.
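
As a minimal sketch of the indicator-variable approach to proportions
mentioned above, assume a hypothetical categorical variable ^smoke^ (coded 1
for current smokers; the variable names here are illustrative only) and
design variables that have already been set:

    . ^generate byte smoker = (smoke==1) if smoke~=.^
    . ^svymean smoker^

The estimated mean of the 0/1 variable ^smoker^ is the estimated proportion
of smokers. The ^if^ on the ^generate^ command leaves ^smoker^ missing where
^smoke^ is missing, so those observations are not counted as 0.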

To display numbers of primary sampling units (PSUs) per stratum and identify
strata that cause the error message "stratum with only one PSU detected", see
help @svydes@.

To estimate linear combinations (e.g., differences of subpopulation means)
after running an ^svy^ command, see help @svylc@.


Options
-------

^strata(^varname^)^ specifies the name of the variable (numeric or string) that
    contains stratum identifiers. ^strata()^ can also be specified with the
    ^varset^ command; see examples below and help @varset@.

^psu(^varname^)^ specifies the name of the variable (numeric or string) that
    contains identifiers for the primary sampling unit (i.e., the cluster).
    ^psu()^ can also be specified with the ^varset^ command; see examples
    below and help @varset@.

^fpc(^varname^)^ requests a finite population correction for the variance
    estimates. If the variable specified has values less than or equal to 1,
    it is interpreted as a stratum sampling rate f_h = n_h/N_h, where
    n_h = number of PSUs sampled from stratum h and N_h = total number of
    PSUs in the population belonging to stratum h. If the variable specified
    has values greater than or equal to n_h, it is interpreted as containing
    N_h. ^fpc()^ can also be specified with the ^varset^ command; see
    examples below and help @varset@.

^by(^varlist^)^ specifies that estimates be computed for the subpopulations
    defined by different values of the variable(s) in the specified varlist.

^subpop(^expression^)^ specifies that estimates be computed for the single
    subpopulation defined by the observations for which the specified
    expression is true. Note that observations with missing values for the
    variable(s) in this expression may have to be omitted explicitly using an
    ^if^ statement. See examples below.

^srssubpop^ can only be specified if ^by()^ or ^subpop()^ is specified.
    ^srssubpop^ specifies that ^deff^ and ^deft^ be computed using an
    estimate of simple-random-sampling variance for sampling within a
    subpopulation.
    If ^srssubpop^ is not specified, ^deff^ and ^deft^ are computed using an
    estimate of simple-random-sampling variance for sampling from the entire
    population. Typically, ^srssubpop^ would be given when computing
    subpopulation estimates by strata or by groups of strata.

^nolabel^ can only be specified if ^by()^ is specified. ^nolabel^ requests
    that numeric values rather than value labels be used to label output for
    subpopulations. By default, value labels are used.

{ ^complete^ | ^available^ } specifies how missing values are to be handled.
    ^complete^ specifies that only observations with complete data should be
    used; i.e., any observation that has a missing value for any of the
    variables in the varlist is omitted from the computation. ^available^
    specifies that all available nonmissing values be used for each
    estimation.

    If neither ^complete^ nor ^available^ is specified, ^available^ is the
    default when there are missing values and there are two or more variables
    in the varlist (or four or more for ^svyratio^). If there are missing
    values and two or more variables (or four or more for ^svyratio^),
    ^complete^ must be specified to compute the covariance or to use ^test^
    (for hypothesis tests) or ^svylc^ (estimates for linear combinations)
    after running the command; see help @svylc@.

    The ^complete^ computation requires room for k variables, where k is the
    number of variables in the varlist times the number of subpopulations.
    The ^available^ computation has no such requirements. So the ^available^
    option can be used to reduce memory requirements.

^float^ specifies that computations with the ^complete^ option be done in
    float precision rather than double (the default). Using double precision
    with the ^complete^ option requires room for k variables of type double,
    where k is the number of variables in the varlist times the number of
    subpopulations. Using the ^float^ option requires room for k variables of
    type float.
    Computations with the ^available^ option are always done in double
    precision and do not require room for a large number of variables.

The following options can be specified initially or when redisplaying results:

^level(^#^)^ specifies the confidence level (i.e., nominal coverage rate), in
    percent, for confidence intervals; see help @level@.

^ci^ requests that confidence intervals be displayed. If no display options
    are specified, then, by default, confidence intervals are displayed.

^deff^ requests that the design-effect measure deff be displayed. If no
    display options are specified, then, by default, deff is displayed.

^deft^ requests that the design-effect measure deft be displayed. deft is the
    square root of deff when there is no finite population correction (FPC).
    When the ^fpc()^ option has been specified, deft differs slightly from
    the square root of deff (since Kish's formula for deft always uses a
    simple-random-sampling variance estimate without an FPC regardless of
    whether an FPC was used for the survey design-based variance estimate).

^meff^ requests that the meff measure of misspecification effects be
    displayed.

^meft^ requests that the meft measure of misspecification effects be
    displayed. In all cases, meft is the square root of meff.

^obs^ requests that the number of nonmissing observations used for the
    computation of the estimate be displayed for each row of estimates.

^size^ requests that the estimate of the (sub)population size be displayed
    for each row of estimates. The (sub)population size estimate equals the
    sum of the weights for those nonmissing observations used for the
    mean/total/ratio estimate.


Examples
--------

Specifying ^strata()^, ^psu()^, ^fpc()^, and ^pweight^ variables
--------------------------------------------------------

The ^varset^ command can be used to set the ^strata()^, ^psu()^, ^fpc()^, and
^pweight^ variables:

    . ^varset strata strn^
    . ^varset psu clustid^
    . ^varset fpc pop^
    . ^varset pweight wgt^

Once these are set, ^strata()^, ^psu()^, ^fpc()^, and weights
^[pweight=^...^]^ do not have to be specified when issuing a ^svy^ command:

    . ^svymean tcresult tgresult^
    . ^svytotal female^
    . ^svyratio weight/height^

Alternatively, without using ^varset^, we could have typed

    . ^svymean tcresult tgresult [pweight=wgt], strata(strn) psu(clustid) fpc(pop)^

Note that no matter which of these methods is used initially to set
^strata()^, ^psu()^, ^fpc()^, and ^pweight^, the settings are remembered and
do not have to be specified in subsequent use of any of the ^svy^ commands.
For more information, see help @varset@.


Basic command use
-----------------

Assuming that any or all of ^strata()^, ^psu()^, ^fpc()^, and ^pweight^ have
been specified:

    . ^svymean x^
    . ^svymean x1 x2 x3^
    . ^svytotal x^
    . ^svytotal x1 x2 x3^

Note that ^/^ is optional with ^svyratio^. Thus, the following are equivalent:

    . ^svyratio y1/y2^
    . ^svyratio y1 y2^

Similarly,

    . ^svyratio y1/y2 y3/y4^
    . ^svyratio y1 y2 y3 y4^
    . ^svyratio y1/y2 y3 y4^

are all equivalent.

The ^complete^ option is used when you wish to estimate several parameters
based on complete cases; i.e., only observations with nonmissing values for
all relevant variables are used. This option is necessary when you later want
to get estimates for linear combinations:

    . ^svymean x1 x2, complete^
    . ^svylc x1 - x2^

See help @svylc@ for more information.


Subpopulations
--------------

    . ^svymean x, by(gender)^
    . ^svytotal x1 x2 x3, by(gender race)^
    . ^svyratio y/x, by(agegrp) nolabel^

    . ^svymean x1 x2 x3, subpop(gender==1)^
    . ^svyratio y/x, subpop(age > 50)^

Note that ^subpop()^ and ^by()^ can be combined:

    . ^svymean x1 x2 x3, subpop(gender==1) by(agegrp race)^

The ^srssubpop^ option tells the command to compute deff and deft using simple
random sampling within a subpopulation as the comparison. It is typically
used when the subpopulations are strata (or groups of strata) and is
occasionally used in other circumstances as well.

    . ^svymean x, by(strn) srssubpop^

When the ^subpop()^ option is used, observations with missing values for the
subpopulation variable(s) must be explicitly omitted using an ^if^ statement:

    . ^svymean bp if age~=., subpop(age < 40)^
    . ^svyratio y/x if age~=. & region~=., subpop(age < 40 & region==1)^


Display options
---------------

Display options can be given when first issuing a command:

    . ^svymean x1 x2 x3, by(gender) ci deff meff^

Or they can be given when redisplaying results:

    . ^svymean, ci deff meff obs size^
    . ^svyratio, ci obs^
    . ^svytotal^


Note on ^iweight^s
----------------

^iweight^s can be used when there are negative weights:

    . ^svymean x [iweight=wgt]^

^iweight^s must be specified with each use; they are not remembered.


Also see
--------

 On-line:  help for @svydes@, @svylc@, @svyprop@, @svyreg@, @svytest@, @varset@