-------------------------------------------------------------------------------
Title
[XT] xtmixed -- Multilevel mixed-effects linear regression
Syntax
xtmixed depvar [fe_equation] [|| re_equation] [|| re_equation ...] [,
options]
where the syntax of fe_equation is
[indepvars] [if] [in] [, fe_options]
and the syntax of re_equation is one of the following:
for random coefficients and intercepts
levelvar: [varlist] [, re_options]
for a random effect among the levels of a factor variable
levelvar: R.varname [, re_options]
levelvar is a variable identifying the group structure for the random
effects at that level or _all for the inclusive group comprising all
observations.
fe_options                description
-------------------------------------------------------------------------
Model
  noconstant              suppress constant term from the fixed-effects
                          equation
-------------------------------------------------------------------------
re_options                description
-------------------------------------------------------------------------
Model
  covariance(vartype)     variance-covariance structure of the random
                          effects
  noconstant              suppress constant term from the random-effects
                          equation
  collinear               keep collinear variables
-------------------------------------------------------------------------
vartype                   description
-------------------------------------------------------------------------
independent               one variance parameter per random effect, all
                          covariances zero; the default unless a factor
                          variable is specified
exchangeable              equal variances for random effects, and one
                          common pairwise covariance
identity                  equal variances for random effects, all
                          covariances zero; the default for factor
                          variables
unstructured              all variances and covariances distinctly
                          estimated
-------------------------------------------------------------------------
options                   description
-------------------------------------------------------------------------
Model
  residuals(rspec)        structure of residual errors
Estimation
  reml                    fit model via restricted maximum likelihood;
                          the default
  mle                     fit model via maximum likelihood
Reporting
  level(#)                set confidence level; default is level(95)
  variance                show random-effects parameter estimates as
                          variances-covariances
  noretable               suppress random-effects table
  nofetable               suppress fixed-effects table
  estmetric               show parameter estimates in the estimation
                          metric
  noheader                suppress output header
  nogroup                 suppress table summarizing groups
  nostderr                do not estimate standard errors of
                          random-effects parameters
  nolrtest                do not perform LR test comparing to linear
                          regression
  display_options         control spacing and display of omitted
                          variables and base and empty cells
EM options
  emiterate(#)            number of EM iterations; default is 20
  emtolerance(#)          EM convergence tolerance; default is 1e-10
  emonly                  fit model exclusively using EM
  emlog                   show EM iteration log
  emdots                  show EM iterations as dots
Maximization
  maximize_options        control the maximization process; seldom used
  matsqrt                 parameterize variance components using matrix
                          square roots; the default
  matlog                  parameterize variance components using matrix
                          logarithms
+ coeflegend              display coefficients' legend instead of
                          coefficient table
-------------------------------------------------------------------------
+ coeflegend does not appear in the dialog box.
indepvars may contain factor variables; see fvvarlist.
depvar, indepvars, and varlist may contain time-series operators; see
tsvarlist.
bootstrap, by, jackknife, rolling, and statsby are allowed; see prefix.
See [XT] xtmixed postestimation for features available after estimation.
Menu
Statistics > Longitudinal/panel data > Multilevel mixed-effects models >
Mixed-effects linear regression
Description
xtmixed fits linear mixed models. Mixed models are characterized as
containing both fixed effects and random effects. The fixed effects are
analogous to standard regression coefficients and are estimated directly.
The random effects are not directly estimated but are summarized
according to their estimated variances and covariances. Although random
effects are not directly estimated, you can form best linear unbiased
predictions (BLUPs) of them (and standard errors) by using predict after
xtmixed; see [XT] xtmixed postestimation. Random effects may take the
form of either random intercepts or random coefficients, and the grouping
structure of the data may consist of multiple levels of nested groups.
The overall error distribution of the linear mixed model is assumed to be
Gaussian, but heteroskedasticity and correlations within lowest-level
groups also may be modeled.
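As a minimal sketch of obtaining BLUPs, the commands below fit a
random-intercept model to the pig data used in the examples of this help
file and then predict the random intercepts and their standard errors.
The predict options reffects and reses are assumed from [XT] xtmixed
postestimation and are not described in this help file; u0 and u0_se are
illustrative variable names.
. webuse pig, clear
. xtmixed weight week || id:
. predict u0, reffects
. predict u0_se, reses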
Options
    +-------+
----+ Model +------------------------------------------------------------
noconstant suppresses the constant (intercept) term and may be specified
for the fixed-effects equation and for any or all of the
random-effects equations.
covariance(vartype), where vartype is
independent|exchangeable|identity|unstructured
specifies the structure of the (co)variance matrix for the random
effects and may be specified for each random-effects equation. An
independent covariance structure allows a distinct variance for each
random effect within a random-effects equation and assumes that all
covariances are zero. exchangeable covariances have common variances
and one common pairwise covariance. identity is short for "multiple
of the identity"; that is, all variances are equal and all
covariances are zero. unstructured covariances allow all variances
and covariances to be distinct. If an equation consists of p
random-effects terms, the unstructured covariance matrix will have
p(p+1)/2 unique parameters.
covariance(independent) is the default, except when the
random-effects equation consists of the factor-variable specification
R.varname, in which case covariance(identity) is the default, and
only covariance(identity) and covariance(exchangeable) are allowed.
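As a worked count under the definitions above, consider a random-effects
equation with p = 3 random-effects terms (for example, a random intercept
and two random coefficients). The number of variance-covariance
parameters estimated for that equation would then be
independent     3    (three variances)
exchangeable    2    (one common variance, one common covariance)
identity        1    (one common variance)
unstructured    6    (three variances, three covariances; 3(3+1)/2 = 6)
These counts exclude the residual-error parameters.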
collinear specifies that xtmixed not omit collinear variables from the
random-effects equation. Usually there is no reason to leave
collinear variables in place, and in fact doing so usually causes the
estimation to fail because of the matrix singularity caused by the
collinearity. However, with certain models (for example, a
random-effects model with a full set of contrasts), the variables may
be collinear, yet the model is fully identified because of
restrictions on the random-effects covariance structure. In such
cases, using the collinear option allows the estimation to take place
with the random-effects equation intact.
residuals(rspec), where rspec is
restype [, residual_options]
specifies the structure of the residual errors within the
lowest-level groups of the linear mixed model. For example, if you
are modeling random effects for classes nested within schools, then
residuals() refers to the residual variance-covariance structure of
the observations within classes, the lowest-level groups.
restype is
independent|exchangeable|ar #|ma #|unstructured
By default, restype is independent, which means that all
residuals are i.i.d. Gaussian with one common variance. When
combined with by(varname), independence is still assumed, but you
estimate a distinct variance for each level of varname. Unlike
with the structures described below, varname does not need to be
constant within groups.
restype exchangeable estimates two parameters, one common
within-group variance and one common pairwise covariance.
When combined with by(varname), these two parameters are
distinctly estimated for each level of varname. Because you
are modeling a within-group covariance, varname must be
constant within lowest-level groups.
restype ar # assumes that within-group errors have an
autoregressive (AR) structure of order #; ar 1 is the
default. The t(varname) option is required, where varname is
an integer-valued time variable used to order the
observations within groups and to determine the lags between
successive observations. Any nonconsecutive time values will
be treated as gaps. For this structure, # + 1 parameters are
estimated (# AR coefficients and one overall error variance).
restype ar may be combined with by(varname), but varname must
be constant within groups.
restype ma # assumes that within-group errors have a moving
average (MA) structure of order #; ma 1 is the default. The
t(varname) option is required, where varname is an
integer-valued time variable used to order the observations
within groups and to determine the lags between successive
observations. Any nonconsecutive time values will be treated
as gaps. For this structure, # + 1 parameters are estimated
(# MA coefficients and one overall error variance). restype
ma may be combined with by(varname), but varname must be
constant within groups.
restype unstructured is the most general structure; it estimates
distinct variances for each within-group error and distinct
covariances for each within-group error pair. The t(varname)
option is required, where varname is a
positive-integer-valued variable that identifies the
observations within each group. The groups may be unbalanced
in that not all levels of t() need to be observed within
every group, but you may not have repeated t() values within
any particular group. When you have p levels of t(), then
p(p+1)/2 parameters are estimated. restype unstructured may
be combined with by(varname), but varname must be constant
within groups.
residual_options are by(varname) and t(varname).
by(varname) is for use within the residuals() option and
specifies that a set of distinct residual-error parameters be
estimated for each level of varname. In other words, you use
by() to model heteroskedasticity.
t(varname) is for use within the residuals() option to specify a
time variable for the ar and ma structures, or to identify the
observations when restype is unstructured.
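As a sketch of how restype and residual_options combine, the command
below fits first-order autoregressive errors within mares, using the
ovary data from the examples of this help file. It is a variation on the
MA 2 example shown there, chosen for illustration. Under the description
above, ar 1 estimates 1 + 1 = 2 residual-error parameters (one AR
coefficient and one overall error variance), whereas an unstructured
restype with p = 3 levels of t() would estimate 3(3+1)/2 = 6.
. webuse ovary, clear
. xtmixed follicles sin1 cos1 || mare: sin1, residuals(ar 1, t(time))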
    +------------+
----+ Estimation +-------------------------------------------------------
reml and mle specify the statistical method for fitting the model.
reml, the default, specifies that the model be fit using restricted
maximum likelihood (REML), also referred to as residual maximum
likelihood.
mle specifies that the model be fit using maximum likelihood.
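For illustration, the two commands below fit the same random-intercept
model to the pig data used in the examples of this help file, first by
REML (the default, so no option is needed) and then by ML. After each
fit, e(method) records which method was used.
. webuse pig, clear
. xtmixed weight week || id:
. xtmixed weight week || id:, mle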
    +-----------+
----+ Reporting +--------------------------------------------------------
level(#); see [R] estimation options.
variance displays the random-effects and residual-error parameter
estimates as variances and covariances. The default is to display
them as standard deviations and correlations.
noretable suppresses the random-effects table from the output.
nofetable suppresses the fixed-effects table from the output.
estmetric displays all parameter estimates in the estimation metric.
Fixed-effects estimates are unchanged from those normally displayed,
but random-effects parameter estimates are displayed as log-standard
deviations and hyperbolic arctangents of correlations, with equation
names that organize them by model level. Residual-variance parameter
estimates are also displayed in their original estimation metric.
noheader suppresses the output header, either at estimation or upon
replay.
nogroup suppresses the display of group summary information (number of
groups, average group size, minimum, and maximum) from the output
header.
nostderr prevents xtmixed from calculating standard errors for the
estimated random-effects parameters, although standard errors are
still given for the fixed-effects parameters. Specifying this option
will speed up computation times. nostderr is available only when
residuals are modeled as independent with constant variance.
nolrtest prevents xtmixed from fitting a reference linear regression
model and using this model to calculate a likelihood-ratio test
comparing the mixed model to ordinary regression. This option may
also be specified upon replay to suppress this test from the output.
display_options: noomitted, vsquish, noemptycells, baselevels,
allbaselevels; see [R] estimation options.
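A brief sketch of the reporting options, using the pig data from the
examples of this help file: the first command reports variances rather
than standard deviations and omits the group summary; the second replays
the results without the header and without the LR test, both of which
are described above as allowed upon replay.
. webuse pig, clear
. xtmixed weight week || id:, variance nogroup
. xtmixed, noheader nolrtest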
    +------------+
----+ EM options +-------------------------------------------------------
These options control the EM (expectation-maximization) iterations that
take place before estimation switches to a gradient-based method. When
residuals are modeled as independent with constant variance, EM will
either converge to the solution or bring parameter estimates close to the
solution. For other residual structures, EM is used to obtain starting
values.
emiterate(#) specifies the number of EM (expectation-maximization)
iterations to perform. The default is emiterate(20).
emtolerance(#) specifies the convergence tolerance for the EM
algorithm. The default is emtolerance(1e-10). EM iterations
will be halted once the log (restricted) likelihood changes by a
relative amount less than #. At that point, optimization
switches to a gradient-based method, unless emonly is specified.
emonly specifies that the likelihood be maximized exclusively using
EM. The advantage of specifying emonly is that each EM iteration
is typically much faster than an iteration of a gradient-based
method. The disadvantages are that EM may need many iterations to
converge, if it converges at all, and that EM provides no facility
for estimating standard errors for the random-effects parameters.
emonly is available only when residuals are modeled as independent
with constant variance.
emlog specifies that the EM iteration log be shown. The EM iteration
log is, by default, not displayed unless the emonly option is
specified.
emdots specifies that the EM iterations be shown as dots. This
option can be convenient because the EM algorithm may require
many iterations to converge.
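As a sketch of the EM options, using the pig data from the examples of
this help file (the option values are arbitrary choices for
illustration, not recommendations): the first command allows up to 100
EM iterations with a looser tolerance than the default and shows the EM
log before switching to the gradient-based method; the second uses EM
only and shows the iterations as dots, so it reports no standard errors
for the random-effects parameters.
. webuse pig, clear
. xtmixed weight week || id:, emiterate(100) emtolerance(1e-8) emlog
. xtmixed weight week || id:, emonly emiterate(500) emdots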
    +--------------+
----+ Maximization +-----------------------------------------------------
maximize_options: difficult, technique(algorithm_spec), iterate(#),
[no]log, trace, gradient, showstep, hessian, showtolerance,
tolerance(#), ltolerance(#), nrtolerance(#), nonrtolerance; see [R]
maximize. Those that require special mention for xtmixed are listed
below.
For the technique() option, the default is technique(nr). The bhhh
algorithm may not be specified.
matsqrt (the default), during optimization, parameterizes variance
components by using the matrix square roots of the
variance-covariance matrices formed by these components at each model
level.
matlog, during optimization, parameterizes variance components by using
the matrix logarithms of the variance-covariance matrices formed by
these components at each model level.
Both the matsqrt and matlog parameterizations ensure that
variance-covariance matrices are positive semidefinite. For most
problems, the matrix square root is more stable near the boundary of
the parameter space. However, if convergence is problematic, one
option is to try the alternative matlog parameterization. When
convergence is not an issue, both parameterizations yield equivalent
results.
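If a fit has trouble converging, one might try the following, shown
here as a sketch on the pig data from the examples of this help file
(the model itself is chosen only for illustration). The first command
switches to the matrix-logarithm parameterization; the second keeps the
default parameterization but adds the difficult maximize option.
. webuse pig, clear
. xtmixed weight week || id: week, covariance(unstructured) matlog
. xtmixed weight week || id: week, covariance(unstructured) difficult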
The following option is available with xtmixed but is not shown in the
dialog box:
coeflegend; see [R] estimation options.
Remarks on specifying random-effects equations
Mixed models consist of fixed effects and random effects. The fixed
effects are specified as regression parameters in a manner similar to
most other Stata estimation commands, that is, as a dependent variable
followed by a set of regressors. The random-effects portion of the model
is specified by first considering the grouping structure of the data.
For example, if random effects are to vary according to variable school,
then the call to xtmixed would be of the form
. xtmixed fixed_portion || school: ... , options
The variable lists that make up each equation describe how the random
effects enter into the model, either as random intercepts (constant term)
or as random coefficients on regressors in the data. One may also
specify the variance-covariance structure of the within-equation random
effects, according to the four available structures described above. For
example,
. xtmixed f_p || school: z1, covariance(unstructured) options
will fit a model with a random intercept and random slope for variable z1
and treat the variance-covariance structure of these two random effects
as unstructured.
If the data are organized by a series of nested groups, for example,
classes within schools, then the random-effects structure is specified by
a series of equations, each separated by ||. The order of nesting
proceeds from left to right. For our example, this would mean that an
equation for schools would be specified first, followed by an equation
for classes. As an example, consider
. xtmixed f_p || school: z1, cov(un) || class: z1 z2 z3, nocons
        cov(ex) options
where variables school and class identify the schools and classes within
schools, respectively. This model contains a random intercept and random
coefficient on z1 at the school level and has random coefficients on
variables z1, z2, and z3 at the class level. The covariance structure
for the random effects at the class level is exchangeable, meaning that
the random effects share a common variance and common pairwise
covariance.
Group variables may be repeated, allowing for more general covariance
structures to be constructed as block-diagonal matrices based on the four
original structures. Consider
. xtmixed f_p || school: z1 z2, nocons cov(id) || school: z3 z4,
        nocons cov(un) options
which specifies four random coefficients at the school level. The
variance-covariance matrix of the random effects is the 4 x 4 matrix
where the upper 2 x 2 diagonal block is a multiple of the identity matrix
and the lower 2 x 2 diagonal block is unstructured. In effect, the
coefficients on z1 and z2 are constrained to be independent and share a
common variance. The coefficients on z3 and z4 each have their own
variance, distinct from each other and from the common variance of the
coefficients on z1 and z2. They are also allowed to be correlated with
each other, yet they are independent of the coefficients on z1 and z2.
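Written out, with rows and columns ordered z1, z2, z3, z4, the implied
covariance matrix of the four school-level random coefficients has the
form below. The symbols are illustrative labels, not xtmixed output
names: s is the common variance from the cov(id) equation; v3, v4, and
c34 are the two variances and one covariance from the cov(un) equation.
         z1     z2     z3     z4
  z1      s      0      0      0
  z2      0      s      0      0
  z3      0      0     v3    c34
  z4      0      0    c34     v4
This gives 1 + 2(2+1)/2 = 4 distinct random-effects parameters.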
For mixed models with no nested grouping structure, thinking of the
entire estimation data as one group is convenient. Toward this end,
xtmixed allows the special group designation _all. xtmixed also allows
the factor variable notation R.varname, which is shorthand for describing
the levels of varname as a series of indicator variables. See [XT]
xtmixed for more details.
Examples
---------------------------------------------------------------------------
Setup
. webuse nlswork
Random-intercept model, analogous to xtreg
. xtmixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure
        || id:
Random-intercept and random-slope (coefficient) model
. xtmixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure
        || id: grade
Random-intercept and random-slope (coefficient) model, correlated random
effects
. xtmixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure
        || id: grade, cov(unstruct)
---------------------------------------------------------------------------
Setup
. webuse pig, clear
One-level random-effects model
. xtmixed weight week || id:
---------------------------------------------------------------------------
Setup
. webuse productivity, clear
Two-level nested model, state nested within region, fit by maximum
likelihood
. xtmixed gsp private emp hwy water other unemp || region: || state:,
        mle
---------------------------------------------------------------------------
Setup
. webuse pig, clear
Two-way crossed random effects
. xtmixed weight week || _all: R.id || _all: R.week
---------------------------------------------------------------------------
Setup
. webuse ovary, clear
Linear mixed model with MA 2 errors
. xtmixed follicles sin1 cos1 || mare: sin1, residuals(ma 2, t(time))
---------------------------------------------------------------------------
Setup
. webuse childweight, clear
Linear mixed model with heteroskedastic error variances
. xtmixed weight age || id: age, residuals(independent, by(girl))
---------------------------------------------------------------------------
Saved results
xtmixed saves the following in e():
Scalars
  e(N)              number of observations
  e(k)              number of parameters
  e(k_f)            number of FE parameters
  e(k_r)            number of RE parameters
  e(k_rs)           number of standard deviations
  e(k_rc)           number of correlations
  e(k_res)          number of residual-error parameters
  e(nrgroups)       number of residual-error by() groups
  e(ar_p)           AR order of residual errors, if specified
  e(ma_q)           MA order of residual errors, if specified
  e(df_m)           model degrees of freedom
  e(ll)             log (restricted) likelihood
  e(chi2)           chi-squared statistic
  e(p)              p-value for chi-squared
  e(ll_c)           log likelihood, comparison model
  e(chi2_c)         chi-squared, comparison model
  e(df_c)           degrees of freedom, comparison model
  e(p_c)            p-value, comparison model
  e(rank)           rank of e(V)
  e(rc)             return code
  e(converged)      1 if converged, 0 otherwise
Macros
  e(cmd)            xtmixed
  e(cmdline)        command as typed
  e(depvar)         name of dependent variable
  e(ivars)          grouping variables
  e(title)          title in estimation output
  e(redim)          random-effects dimensions
  e(vartypes)       variance-structure types
  e(revars)         random-effects covariates
  e(resopt)         residuals() specification, as typed
  e(rstructure)     residual-error structure
  e(rstructlab)     residual-error structure output label
  e(rbyvar)         residual-error by() variable, if specified
  e(rglabels)       residual-error by() group labels
  e(timevar)        residual-error t() variable, if specified
  e(chi2type)       Wald; type of model chi-squared test
  e(vce)            bootstrap or jackknife if defined
  e(vcetype)        title used to label Std. Err.
  e(method)         ML or REML
  e(opt)            type of optimization
  e(optmetric)      matsqrt or matlog; random-effects matrix
                    parameterization
  e(ml_method)      type of ml method
  e(technique)      maximization technique
  e(crittype)       optimization criterion
  e(properties)     b V
  e(estat_cmd)      program used to implement estat
  e(predict)        program used to implement predict
  e(asbalanced)     factor variables fvset as asbalanced
  e(asobserved)     factor variables fvset as asobserved
Matrices
  e(b)              coefficient vector
  e(N_g)            group counts
  e(g_min)          group-size minimums
  e(g_avg)          group-size averages
  e(g_max)          group-size maximums
  e(tmap)           ID mapping for unstructured residual errors
  e(V)              variance-covariance matrix of the estimators
Functions
  e(sample)         marks estimation sample
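As a sketch of how these results can be inspected after estimation, the
commands below use the pig data from the examples of this help file
together with standard Stata commands for displaying returned results.
The particular results displayed are chosen only for illustration.
. webuse pig, clear
. xtmixed weight week || id:
. display e(N)
. display e(ll)
. display "`e(method)'"
. matrix list e(N_g)
. ereturn list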
Also see
Manual: [XT] xtmixed
Help: [XT] xtmixed postestimation;
[XT] xtmelogit, [XT] xtmepoisson, [XT] xtreg, [XT] xtrc, [XT]
xtgee