.-
help for ^svyreg^, ^svylogit^, ^svyprobt^
.-

Linear, logistic, and probit regressions for sample-survey data
---------------------------------------------------------------

    ^svyreg^   varlist [weight] [^if^ exp] [^in^ range] [^,^ common_options ]

    ^svylogit^ varlist [weight] [^if^ exp] [^in^ range]
                   [^,^ ^or^ maximize_options common_options ]

    ^svyprobt^ varlist [weight] [^if^ exp] [^in^ range]
                   [^,^ maximize_options common_options ]

where common_options are

    ^nocon^stant ^str^ata^(^varname^)^ ^psu(^varname^)^ ^fpc(^varname^)^
    ^sub^pop^(^expression^)^ ^srs^subpop ^noadj^ust ^float^
    ^l^evel^(^#^)^ ^p^rob ^ci^ ^deff^ ^deft^ ^meff^ ^meft^

These commands typed without arguments redisplay previous results. The
following options can be given when redisplaying results:

    ^or^ ^l^evel^(^#^)^ ^p^rob ^ci^ ^deff^ ^deft^ ^meff^ ^meft^

^svyreg^ allows ^pweight^s and ^iweight^s. ^svylogit^ and ^svyprobt^ allow only
^pweight^s. See help ^weights^.

These commands share the features of all estimation commands; see help @est@.

Warning: Use of ^if^ or ^in^ restrictions will not produce correct variance
estimates for subpopulations in many cases. To compute estimates for
subpopulations, use the ^subpop()^ option.


Description
-----------

These commands estimate regression models for complex survey data. ^svyreg^
estimates linear regression. ^svylogit^ estimates maximum-likelihood logistic
regression. ^svyprobt^ estimates a maximum-likelihood probit model. The
commands allow any or all of the following: probability sampling weights,
stratification, and clustering.

Associated variance estimates, design effects (^deff^ and ^deft^), and
misspecification effects (^meff^ and ^meft^) are computed. Models for
subpopulations can be estimated using the ^subpop()^ option.

For commands to estimate means, totals, ratios, and proportions, see help for
@svymean@ and @svyprop@.

To display numbers of primary sampling units (PSUs) per stratum and identify
strata that cause the error message "stratum with only one PSU detected", see
help @svydes@.

To estimate linear combinations of coefficients after estimating a model, see
help @svylc@. To test multidimensional hypotheses, see help @svytest@.


Options
-------

^noconstant^ estimates a model without the constant term (intercept).

^strata(^varname^)^ specifies the name of the variable (numeric or string) that
    contains stratum identifiers. ^strata()^ can also be specified with the
    ^varset^ command; see examples below and help @varset@.

^psu(^varname^)^ specifies the name of the variable (numeric or string) that
    contains identifiers for the primary sampling unit (i.e., the cluster).
    ^psu()^ can also be specified with the ^varset^ command; see examples below
    and help @varset@.

^fpc(^varname^)^ requests a finite population correction for the variance
    estimates. If the variable specified has values less than or equal to 1,
    it is interpreted as a stratum sampling rate f_h = n_h/N_h, where
    n_h = number of PSUs sampled from stratum h and N_h = total number of
    PSUs in the population belonging to stratum h. If the variable specified
    has values greater than or equal to n_h, it is interpreted as containing
    N_h. ^fpc()^ can also be specified with the ^varset^ command; see examples
    below and help @varset@.

^subpop(^expression^)^ specifies that estimates be computed for the single
    subpopulation defined by the observations for which the specified
    expression is true. Note that observations with missing values for the
    variable(s) in this expression may have to be omitted explicitly using an
    ^if^ statement. See examples below.

^srssubpop^ can only be specified if ^subpop()^ is specified. ^srssubpop^
    specifies that ^deff^ and ^deft^ be computed using an estimate of
    simple-random-sampling variance for sampling within a subpopulation.
    If ^srssubpop^ is not specified, ^deff^ and ^deft^ are computed using an
    estimate of simple-random-sampling variance for sampling from the entire
    population. Typically, ^srssubpop^ would be given when computing
    subpopulation estimates in a stratum or in a group of strata.

^noadjust^ specifies that the model Wald test be computed as

        W/k ~ F(k, d)

    where W is the Wald test statistic, k is the number of terms in the model
    excluding the constant term, d = total number of sampled PSUs minus the
    total number of strata, and F(k, d) is an F distribution with k numerator
    d.f. and d denominator d.f. By default, an adjusted Wald test is
    computed:

        (d - k + 1)*W/(k*d) ~ F(k, d - k + 1)

^float^ specifies that covariance computations be done in float precision
    rather than double (the default). Using double precision requires room
    for k + 1 variables of type double, where k is the number of variables in
    the model. Using the ^float^ option requires room for k + 1 variables of
    type float. Coefficient estimates are always computed in double
    precision.

maximize_options (^svylogit^ and ^svyprobt^ only) control the maximization
    process; see [7] maximize. You should never have to specify them.

The following options can be specified initially or when redisplaying results:

^or^ (^svylogit^ only) reports the estimated coefficients transformed to odds
    ratios, i.e., exp(b) rather than b. Standard errors and confidence
    intervals are similarly transformed.

^level(^#^)^ specifies the confidence level (i.e., nominal coverage rate), in
    percent, for confidence intervals; see help @level@.

^prob^ requests that the t statistic and p-value be displayed. The degrees of
    freedom for the t statistic are d = total number of sampled PSUs minus
    the total number of strata (regardless of the number of terms in the
    model). If no display options are specified, then, by default, the
    t statistic and p-value are displayed.

^ci^ requests that confidence intervals be displayed.
    If no display options are specified, then, by default, confidence
    intervals are displayed.

^deff^ requests that the deff measure of design effects be displayed.

^deft^ requests that the deft measure of design effects be displayed. deft is
    the square root of deff when there is no finite population correction
    (FPC). When the ^fpc()^ option has been specified, deft differs slightly
    from the square root of deff (since Kish's formula for deft always uses a
    simple-random-sampling variance estimate without an FPC regardless of
    whether an FPC was used for the survey design-based variance estimate).

^meff^ requests that the meff measure of misspecification effects be
    displayed.

^meft^ requests that the meft measure of misspecification effects be
    displayed. In all cases, meft is the square root of meff.


Examples
--------

Specifying ^strata()^, ^psu()^, ^fpc()^, and ^pweight^ variables
--------------------------------------------------------

The ^varset^ command can be used to set the ^strata()^, ^psu()^, ^fpc()^, and
^pweight^ variables:

 . ^varset strata strn^
 . ^varset psu clustid^
 . ^varset fpc pop^
 . ^varset pweight wgt^

Once these are set, ^strata()^, ^psu()^, and weights ^[pweight=^...^]^ do not
have to be specified when issuing a ^svy^ command:

 . ^svyreg y x1 x2 x3^
 . ^svylogit z x1 x2 x3^
 . ^svyprobt z x1 x2 x3^

Alternatively, without using ^varset^, we could have typed

 . ^svyreg y x1 x2 x3 [pweight=wgt], strata(strn) psu(clustid) fpc(pop)^
 . ^svylogit z x1 x2 x3 [pweight=wgt], strata(strn) psu(clustid) fpc(pop)^
 . ^svyprobt z x1 x2 x3 [pweight=wgt], strata(strn) psu(clustid) fpc(pop)^

Note that no matter which of these methods is used initially to set
^strata()^, ^psu()^, ^fpc()^, and ^pweight^, the settings are remembered and do
not have to be specified in subsequent use of any of the ^svy^ commands. For
more information, see help @varset@.

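
As an illustration of the two forms that ^fpc()^ accepts (see Options above),
suppose the data also contain two hypothetical variables that do not appear in
the examples above: nh, the number of PSUs sampled in each stratum, and Nh,
the total number of PSUs in that stratum. Under the rule described for
^fpc()^, either the sampling rate or the stratum total could be supplied, so
the two calls in this sketch would be expected to request the same correction:

```stata
* Hypothetical variables: nh = PSUs sampled per stratum, Nh = PSUs in the stratum.
* rate never exceeds 1, so fpc() would read it as the sampling rate f_h = n_h/N_h.
generate rate = nh/Nh
svyreg y x1 x2 x3 [pweight=wgt], strata(strn) psu(clustid) fpc(rate)

* Nh is at least n_h, so fpc() would read it as the stratum total N_h.
svyreg y x1 x2 x3 [pweight=wgt], strata(strn) psu(clustid) fpc(Nh)
```

Because the settings are remembered, the ^fpc()^ variable given last is the
one that later ^svy^ commands would use.
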
Basic command use
-----------------

Assuming that any or all of ^strata()^, ^psu()^, ^fpc()^, and ^pweight^ have
been specified:

 . ^svyreg y x1 x2 x3^
 . ^svylogit z x1 x2 x3^
 . ^svyprobt z x1 x2 x3^

 . ^svyreg y x1 x2 x3, nocons^
 . ^svylogit y x1 x2 x3, nocons^
 . ^svyprobt y x1 x2 x3, nocons^


Subpopulation models
--------------------

Models for a subpopulation can be estimated using the ^subpop()^ option:

 . ^svyreg y x1 x2 x3, subpop(gender==1)^
 . ^svyreg y x1 x2 x3, subpop(age > 50)^
 . ^svylogit z x1 x2 x3, subpop(gender==1)^
 . ^svyprobt z x1 x2 x3, subpop(age > 50)^

The ^srssubpop^ option is typically used to compute deff and deft when the
subpopulation is a stratum or a group of strata:

 . ^svyreg y x1 x2, subpop(region==2) srssubpop^

When the ^subpop()^ option is used, observations with missing values for the
subpopulation variable(s) must be explicitly omitted using an ^if^ statement:

 . ^svyreg y x1 x2 x3 if age~=., subpop(age < 40)^
 . ^svylogit z x1 x2 if age~=. & region~=., subpop(age < 40 & region==1)^


Display options
---------------

Display options can be given when first issuing a command:

 . ^svyreg y x1 x2 x3, prob deff meff^

Or they can be given when redisplaying results:

 . ^svyreg, prob ci deff meff^
 . ^svyreg, prob deft meft^
 . ^svyreg^

The ^or^ option can be used with ^svylogit^ to see results transformed to odds
ratios:

 . ^svylogit disease gender age1 age2 age3, or^
 . ^svylogit, or^


Linear combinations of coefficients
-----------------------------------

The ^svylc^ (lc = linear combination) command can be used to estimate linear
combinations of coefficients:

 . ^svyreg y x1 x2 x3^
 . ^svylc x1 - x2^

After ^svylogit^, the ^or^ option can be used to see results transformed to
odds ratios:

 . ^svylogit disease gender age1 age2 age3^
 . ^svylc age2 - age1, or^

See help @svylc@ for more information.


Hypothesis tests
----------------

The ^svytest^ command can be used to test multidimensional hypotheses after
any ^svy^ estimation command:

 . ^svyreg y x1 x2 x3 x4^
 . ^svytest x3 x4^
 . ^svytest x3 = x4^

By default, ^svytest^ uses an adjusted Wald test. An unadjusted Wald test can
also be computed:

 . ^svytest x3 x4, noadjust^

Bonferroni adjustments can be computed for hypotheses of the form x1=0, x2=0,
etc.

 . ^svytest x3 x4, bonferroni^

See help @svytest@ for more information.


Note on ^iweight^s
----------------

^iweight^s can be used with ^svyreg^ when there are negative weights:

 . ^svyreg y x [iweight=wgt]^

^iweight^s must be specified with each use; they are not remembered.


Also see
--------

On-line:  help for @svydes@, @svylc@, @svymean@, @svyprop@, @svytest@, @varset@