This page contains only historical information and is not about the current
release of Stata.
Please see our Stata 10 page
for information on the current version of Stata.

Survey statistics
With the addition of balanced repeated replication (BRR) and the survey
jackknife, Stata is now the only full-featured statistical package
to directly support all three major variance estimators for survey and correlated
data: BRR, jackknife, and cluster-based linearization.
Complete support is now included for multistage designs and for
poststratification. To handle multistage designs, you specify the design when
you svyset your data. For example, to declare a two-stage cluster design with
cities as the first-stage units and schools within cities as the second-stage
units, stratifying on state, you type
. svyset city [pw=wgt], strata(state) fpc(ncities) || school, fpc(nschools)
If you also wanted to poststratify on gender, you would add
poststrata(gender) postweight(ngender).
After this svyset, all of Stata’s survey estimators will properly
account for your design.
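As a minimal sketch of what this looks like in practice, once the design has been declared, an estimation command only needs the svy: prefix. The analysis variables score and income below are hypothetical and are not part of the original example.

```stata
* Declare the two-stage design from the example above
svyset city [pw=wgt], strata(state) fpc(ncities) || school, fpc(nschools)

* Hypothetical analysis variables: score, income
* Each command picks up the declared design automatically
svy: mean score
svy: regress score income
```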
Here are all the details.
A new, unified syntax is used for declaring the design of survey data and for
fitting models. For an overview of all survey facilities, see [SVY] survey.
The old syntax continues to work under version control (the survey
estimation commands do not even require version control), but the new
features are not available when you use the old syntax.
- Existing command svyset for declaring the survey design has new
syntax that supports a host of new features in Stata’s survey-analysis
facilities:
  - BRR and jackknife variance estimators have been added to the
    previously available linearization variance estimator.
    Moreover, use of BRR or jackknife (or linearization) can now be
    specified when you svyset or at estimation time.
  - Multistage designs can now be declared, and they may have primary,
    secondary, and lower-stage sampling units. The linearization variance
    estimator takes complete advantage of the information in multistage
    designs.
  - Stratification is now allowed in all stages, making variance estimates
    more efficient wherever stratification can be exploited.
  - Poststratification is now available and, like stratification,
    also makes variance estimates more efficient. Poststratification
    adjusts weights, improves variance estimates, and accounts for biases
    when demographic or other groupings are known.
  - Finite-population corrections are now allowed in all stages.
  - Sampling weights are handled under all three variance estimators.
For details, see [SVY] svyset.
The previous svyset syntax continues to work under version control.
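The choices described above can be sketched as follows. This is an illustration under assumptions, not a recommended setup: the variables psu, stratum, wgt, and y are hypothetical, and gender and ngender are reused from the poststratification example earlier on this page.

```stata
* Hypothetical variables: psu, stratum, wgt, y

* Make the jackknife the default variance estimator when declaring the design
svyset psu [pweight=wgt], strata(stratum) vce(jackknife)
svy: mean y

* Or keep the default (linearization) and request the jackknife for one command
svyset psu [pweight=wgt], strata(stratum)
svy jackknife: mean y

* Single-stage design with poststratification on gender;
* ngender holds the known population count for each poststratum
svyset psu [pweight=wgt], strata(stratum) poststrata(gender) postweight(ngender)
svy: mean y
```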
- New prefix command svy: is how you tell estimators that you have survey
data. You no longer type svyregress; you type svy: regress.
This is not just a matter of style; svy really is a prefix command,
and in fact, you can even use it as a prefix on estimation commands you
write. In addition, svy: provides a standard, unified syntax for
accessing Stata’s survey features, and svy: is easy to use because
it automatically applies everything you have previously svyset,
including the design.
The following estimators can be used with the svy: prefix:
Descriptive statistics
| svy: mean | Population and subpopulation means |
| svy: proportion | Population and subpopulation proportions |
| svy: ratio | Population and subpopulation ratios |
| svy: total | Population and subpopulation totals |
| svy: tabulate oneway | One-way tables for survey data |
| svy: tabulate twoway | Two-way tables for survey data |
Regression models
| svy: regress | Linear regression |
| svy: ivreg | Instrumental variables regression |
| svy: intreg | Interval and censored regression |
| svy: logistic | Logistic regression, reporting odds ratios |
| svy: logit | Logistic regression, reporting coefficients |
| svy: probit | Probit regression |
| svy: mlogit | Multinomial logistic regression |
| svy: ologit | Ordered logistic regression |
| svy: oprobit | Ordered probit models |
| svy: poisson | Poisson regression |
| svy: nbreg | Negative binomial regression |
| svy: gnbreg | Generalized negative binomial regression |
| svy: heckman | Heckman selection model |
| svy: heckprob | Probit estimation with selection |
Previously existing survey-estimation commands, such as svyregress,
svymean, and svypoisson, continue to work as they did before,
but only if your survey design is declared using version 8: svyset
or if you are working with an old Stata 8 dataset. For a mapping from old
estimation commands to the new syntax, see svy8.
(The new prefix svy: works with datasets that were svyset
under an earlier release of Stata.)
In addition to the three variance estimators and support for multistage
sampling, the new svy: prefix provides other enhancements, including
  - Option subpop() allows more flexible selection of subpopulations,
    meaning that more general if conditions are now allowed.
  - Strata with only one sampling unit (sometimes called singleton PSUs)
    are now handled better: the coefficients are now reported, but with
    missing standard errors. svydes can now be used to find and
    describe these strata; see [SVY] svydes.
  - With BRR variance estimation, a Hadamard matrix can be used in
    place of BRR weights, and Fay’s adjustment may be specified;
    see [SVY] brr_options.
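A short sketch of the subpop() and BRR enhancements follows. The variables female, age, and income are hypothetical. The BRR line assumes a design with exactly two PSUs in each stratum and no more than four strata, so that an order-4 Hadamard matrix is large enough. The value 0.5 for Fay's adjustment is only an illustrative choice.

```stata
* Hypothetical variables: female (0/1 indicator), age, income

* Subpopulation defined by an indicator variable
svy, subpop(female): mean income

* Subpopulation defined by an if condition, alone or combined with an indicator;
* "age < ." excludes missing ages, which Stata otherwise treats as large values
svy, subpop(if age >= 65 & age < .): mean income
svy, subpop(female if age >= 65 & age < .): mean income

* Build an order-4 Hadamard matrix as the Kronecker product of two order-2 ones
matrix h2 = (1, 1 \ 1, -1)
matrix h4 = h2 # h2

* BRR using the Hadamard matrix in place of BRR weights, with Fay's adjustment
svy brr, hadamard(h4) fay(0.5): mean income
```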
- New command svy: proportion replaces svyprop. (By the way, new command
proportion can be used without the svy: prefix; see [R] proportion.)
Unlike svyprop, svy: proportion is an estimation command and
computes a full covariance matrix for all the estimated proportions,
allowing postestimation features, such as tests of linear and nonlinear
combinations of proportions (test and testnl) or creation of linear
and nonlinear combinations with confidence intervals (lincom and nlcom).
- New commands ratio, total, and mean, used with the svy: prefix,
use casewise deletion and estimate full covariance matrices for the
estimates.
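Because these commands store a full covariance matrix, the usual postestimation commands can be applied to their results. The sketch below uses svy: mean with two hypothetical variables, income2004 and income2005, and assumes the data have already been svyset.

```stata
* Hypothetical variables: income2004, income2005
svy: mean income2004 income2005

* Test equality of the two means using the estimated covariance matrix
test income2004 = income2005

* Difference and ratio of the two means, each with a confidence interval
lincom income2005 - income2004
nlcom _b[income2005] / _b[income2004]
```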
- New command svy: tabulate oneway fills a gap in earlier releases.
Previously, anyone wanting a one-way tabulation had to create a
constant and perform two-way survey tabulation with that constant.
- New command estat computes and reports additional statistics and
information after estimation with the svy: prefix:
  - estat svyset reports complete information on the survey
    design.
  - estat effects computes and reports the design effects (DEFF and
    DEFT) and the misspecification effects (MEFF and MEFT) in any
    combination for each estimated parameter.
  - estat effects can also compute DEFF and DEFT for subpopulations
    using simple random-sample estimates either from the overall
    population or from the subpopulation. estat effects
    replaces and extends the deff, deft, meff, and meft
    options previously available on survey estimators.
  - estat lceffects computes and reports the survey design
    effects and misspecification effects for any linear
    combination of estimated parameters.
  - estat size reports the sample and population sizes for
    each subpopulation after svy: mean, svy: proportion,
    svy: ratio, and svy: total.
For details on estat after survey estimation, see [SVY] estat.
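A typical sequence might look like the sketch below. The variables y, x1, and x2 are hypothetical, and the data are assumed to have been svyset already.

```stata
* Hypothetical variables: y, x1, x2

svy: mean y
* Report the survey design used for the estimates
estat svyset
* Design effects for the estimated mean
estat effects, deff deft
* Sample and population sizes
estat size

svy: regress y x1 x2
* Design and misspecification effects for each coefficient
estat effects, deff deft meff meft
* The same effects for a linear combination of coefficients
estat lceffects x1 - x2
```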
- Existing command svydes has several new features and options:
  - New option stage() lets you select the sampling stage
    for sample statistics to be reported.
  - New option generate() identifies strata with a single
    sampling unit.
  - New option finalstage replaces bypsu and reports
    observation sample statistics by sampling unit in the final stage.
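The options above might be used as in this sketch, which assumes a two-stage design has already been declared with svyset. The variable name single_psu is hypothetical.

```stata
* Describe the first-stage sampling units (the default)
svydes

* Describe the second-stage sampling units instead
svydes, stage(2)

* Create a variable that flags strata with a single sampling unit
* (single_psu is a hypothetical new variable name)
svydes, generate(single_psu)

* Report observation counts by sampling unit in the final stage
svydes, finalstage
```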
- New options stdize() and stdweight() for commands svy: mean,
svy: ratio, svy: proportion, svy: tabulate oneway, and
svy: tabulate twoway allow direct standardization of means, ratios,
proportions, and tabulations using any of the three survey variance
estimators.
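As an illustration of direct standardization, the sketch below standardizes a mean to a reference age distribution. All variable names are hypothetical: sbp is the outcome, agegrp identifies the standard strata, stdpop holds the standard-population weight for each age group, and region defines the groups to compare.

```stata
* Hypothetical variables: sbp, agegrp, stdpop, region

* Age-standardized overall mean
svy: mean sbp, stdize(agegrp) stdweight(stdpop)

* Age-standardized means for each region, so regions with different
* age distributions can be compared on a common footing
svy: mean sbp, over(region) stdize(agegrp) stdweight(stdpop)
```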
- Programmers of estimation commands can get full support for estimation with
survey and correlated data almost automatically. This support includes
correct treatment of multistage designs, weighting, stratification,
poststratification, and finite-population corrections, as well as access to
all three variance estimators. For a discussion, see
[P] program properties.
- The [SVY] manual now has a glossary that defines commonly used terms in
survey analysis and explains how these terms are used in the manual;
see [SVY] glossary.