help _robust
-------------------------------------------------------------------------------
Title
[P] _robust -- Robust variance estimates
Syntax
_robust varlist [if] [in] [weight] [, variance(matname) minus(#)
strata(varname) psu(varname) cluster(varname) fpc(varname)
subpop(varname) vsrs(matname) srssubpop zeroweight]
_robust works with models that have all types of varlists, including
those with factor variables and time-series operators; see fvvarlist
and tsvarlist.
pweights, aweights, fweights, and iweights are allowed; see weight.
Description
_robust is a programmer's command that computes a robust variance
estimator based on a varlist of equation-level scores and a covariance
matrix. It produces estimators for ordinary data (each observation
independent), clustered data (data not independent within groups, but
independent across groups), and complex survey data from one stage of
stratified cluster sampling.
See [P] _robust for a full description of this command.
Options
variance(matname) specifies a matrix containing the unadjusted
"covariance" matrix, i.e., the D in V=DMD. The matrix must have its
rows and columns labeled with the appropriate corresponding variable
names, i.e., the names of the xs in xb. If there are multiple
equations, the matrix must have equation names; see [P] matrix
rownames. The D is overwritten with the robust covariance matrix V.
If variance() is not specified, Stata assumes that D has been posted
using ereturn post; _robust will then automatically post the robust
covariance matrix V and replace D.
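As a minimal sketch of the posted form, assuming a regress fit with the
mse1 option and a residual-based score variable named e, as in the
Examples section (the matrix names b and D are arbitrary choices):
. regress mpg weight gear_ratio foreign, mse1
. matrix b = e(b)
. matrix D = e(V)
. predict double e, residual
. ereturn post b D
. _robust e, minus(4)
. ereturn display
Here variance() is omitted, so _robust is expected to work on the
posted matrix rather than on a named matrix.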
minus(#) specifies k=# for the multiplier n/(n-k) of the robust variance
estimators. Stata's maximum likelihood commands use k=1, and so does
the svy prefix. regress, vce(robust) uses, by default, this
multiplier with k equal to the number of explanatory variables in the
model, including the constant. The default is minus(1).
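For illustration only, with hypothetical values of n = 100 and k = 4,
the multiplier n/(n-k) is 100/96, or about 1.04. It can be checked
directly:
. display 100/(100-4)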
strata(varname) specifies the name of a variable (numeric or string) that
contains stratum identifiers.
psu(varname) specifies the name of a variable (numeric or string) that
contains identifiers for the primary sampling unit (PSU). psu() and
cluster() are synonyms.
cluster(varname) is a synonym for psu().
fpc(varname) requests a finite population correction for the variance
estimates. If the variable specified has values <= 1, it is
interpreted as a stratum sampling rate f_h = n_h/N_h, where n_h =
number of PSUs sampled from stratum h and N_h = total number of PSUs
in the population belonging to stratum h. If the variable specified
has values greater than 1, it is interpreted as containing N_h.
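As a hedged illustration, suppose a stratum has n_h = 10 sampled PSUs
out of N_h = 200 PSUs in the population (hypothetical numbers). Either
coding below describes that stratum; the variable names fpc_rate and
fpc_total are placeholders and not part of the command:
. generate double fpc_rate = 10/200
. generate double fpc_total = 200
Under the rule above, fpc(fpc_rate) would be read as the sampling rate
f_h = 0.05, and fpc(fpc_total) would be read as N_h = 200.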
subpop(varname) specifies that estimates be computed for the single
subpopulation defined by observations for which varname!=0 (and is
not missing). This option would typically be used only with survey
data; see [SVY] subpopulation estimation.
vsrs(matname) creates a matrix containing V_srswor, an estimate of the
variance that would have been observed had the data been collected
using simple random sampling without replacement. This is used to
compute design effects for survey data; see [SVY] estat.
srssubpop can only be specified if vsrs() and subpop() are specified.
srssubpop requests that the estimate of simple-random-sampling
variance, vsrs(), be computed assuming sampling within a
subpopulation. If srssubpop is not specified, it is computed
assuming sampling from the entire population.
zeroweight specifies whether observations with weights equal to zero
should be omitted from the computation. This option does not apply
to fweights; observations with 0 fweights are always omitted. If
zeroweight is specified, observations with zero weights are included
in the computation. If zeroweight is not specified (the default),
observations with zero weights are omitted. Including the
observations with zero weights affects the computation in that it may
change the counts of PSUs (clusters) per stratum. Stata's svy prefix
command includes observations with zero weights; all other commands
exclude them. This option is typically used only with survey data.
Examples
. webuse _robust
. regress mpg weight gear_ratio foreign, mse1
. matrix D = e(V)
. predict double e, residual
. _robust e, v(D) minus(4)
. matrix list D
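A sketch of the clustered case, assuming the data contain a cluster
identifier, here called groupid (a hypothetical name that is not
guaranteed to exist in the example dataset). The names Dc and ec are
also arbitrary. Observations with a missing identifier are excluded
from both the fit and the variance computation so that the two use the
same sample:
. regress mpg weight gear_ratio foreign if !missing(groupid), mse1
. matrix Dc = e(V)
. predict double ec if e(sample), residual
. _robust ec if !missing(groupid), v(Dc) minus(4) cluster(groupid)
. matrix list Dc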
Saved results
_robust saves the following in r():
Scalars
r(N) number of observations
r(N_strata) number of strata
r(N_clust) number of clusters (PSUs)
r(sum_w) sum of weights
r(N_subpop) number of observations for subpopulation (subpop()
only)
r(sum_wsub) sum of weights for subpopulation (subpop() only)
r(N_strata) and r(N_clust) are always set. If the strata() option is not
specified, then r(N_strata) = 1 (there truly is one stratum). If neither
the cluster() nor the psu() option is specified, then r(N_clust) equals
the number of observations (each observation is a PSU).
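As a quick check, the results can be inspected immediately after a
_robust call such as the one in the Examples section, where neither
strata() nor cluster() is specified:
. return list
. display r(N_strata)
. display r(N_clust)
Given the rules above, r(N_strata) should be 1 and r(N_clust) should
equal r(N) in that case.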
When _robust replaces the covariance matrix posted by ereturn post, it
also saves the following in e():
Macros
e(vcetype) Robust
e(clustvar) name of cluster (PSU) variable
Also see
Manual: [P] _robust
Help: [P] ereturn, [R] ml, [R] regress, [SVY] svy, [U] 20 Estimation
and postestimation commands (estimation)