Stata 15 help for mixed

[ME] mixed -- Multilevel mixed-effects linear regression

Syntax

mixed depvar fe_equation [|| re_equation] [|| re_equation ...] [, options]

where the syntax of fe_equation is

[indepvars] [if] [in] [weight] [, fe_options]

and the syntax of re_equation is one of the following:

for random coefficients and intercepts

levelvar: [varlist] [, re_options]

for random effects among the values of a factor variable

levelvar: R.varname [, re_options]

levelvar is a variable identifying the group structure for the random effects at that level or is _all representing one group comprising all observations.

fe_options                Description
-------------------------------------------------------------------------
Model
  noconstant              suppress constant term from the fixed-effects
                            equation
-------------------------------------------------------------------------

re_options                Description
-------------------------------------------------------------------------
Model
  covariance(vartype)     variance-covariance structure of the random
                            effects
  noconstant              suppress constant term from the random-effects
                            equation
  collinear               keep collinear variables
  fweight(exp)            frequency weights at higher levels
  pweight(exp)            sampling weights at higher levels
-------------------------------------------------------------------------

options                   Description
-------------------------------------------------------------------------
Model
  mle                     fit model via maximum likelihood (ML); the
                            default
  reml                    fit model via restricted maximum likelihood
                            (REML)
  dfmethod(df_method)     specify method for computing degrees of freedom
                            (DF) of a t distribution
  pwscale(scale_method)   control scaling of sampling weights in two-level
                            models
  residuals(rspec)        structure of residual errors

SE/Robust
  vce(vcetype)            vcetype may be oim, robust, or cluster clustvar;
                            types other than oim may not be combined with
                            dfmethod()

Reporting
  level(#)                set confidence level; default is level(95)
  variance                show random-effects and residual-error parameter
                            estimates as variances and covariances; the
                            default
  stddeviations           show random-effects and residual-error parameter
                            estimates as standard deviations and
                            correlations
  dftable(dftable)        specify contents of fixed-effects table; requires
                            dfmethod() at estimation
  noretable               suppress random-effects table
  nofetable               suppress fixed-effects table
  estmetric               show parameter estimates as stored in e(b)
  noheader                suppress output header
  nogroup                 suppress table summarizing groups
  nostderr                do not estimate standard errors of random-effects
                            parameters
  display_options         control columns and column formats, row spacing,
                            line width, display of omitted variables and
                            base and empty cells, and factor-variable
                            labeling

EM options
  emiterate(#)            number of EM iterations; default is emiterate(20)
  emtolerance(#)          EM convergence tolerance; default is
                            emtolerance(1e-10)
  emonly                  fit model exclusively using EM
  emlog                   show EM iteration log
  emdots                  show EM iterations as dots

Maximization
  maximize_options        control the maximization process; seldom used
  matsqrt                 parameterize variance components using matrix
                            square roots; the default
  matlog                  parameterize variance components using matrix
                            logarithms

  small                   replay small-sample inference results
  coeflegend              display legend instead of statistics
-------------------------------------------------------------------------

vartype                   Description
-------------------------------------------------------------------------
independent               one unique variance parameter per random effect,
                            all covariances 0; the default unless the R.
                            notation is used
exchangeable              equal variances for random effects, and one
                            common pairwise covariance
identity                  equal variances for random effects, all
                            covariances 0; the default if the R. notation
                            is used
unstructured              all variances and covariances to be distinctly
                            estimated
-------------------------------------------------------------------------

df_method                 Description
-------------------------------------------------------------------------
residual                  residual degrees of freedom, n - rank(X)
repeated                  repeated-measures ANOVA
anova                     ANOVA
satterthwaite[, dfopts]   generalized Satterthwaite approximation; REML
                            estimation only
kroger[, dfopts]          Kenward-Roger; REML estimation only
-------------------------------------------------------------------------

dftable                   Description
-------------------------------------------------------------------------
default                   test statistics, p-values, and confidence
                            intervals; the default
ci                        DFs and confidence intervals
pvalue                    DFs, test statistics, and p-values
-------------------------------------------------------------------------

indepvars may contain factor variables; see fvvarlist.
depvar, indepvars, and varlist may contain time-series operators; see tsvarlist.
bayes, bootstrap, by, jackknife, mi estimate, rolling, and statsby are allowed; see prefix. For more details, see [BAYES] bayes: mixed.
mi estimate is not allowed if dfmethod() is specified.
Weights are not allowed with the bootstrap prefix.
pweights and fweights are allowed; see weight. However, no weights are allowed if either option reml or option dfmethod() is specified.
small and coeflegend do not appear in the dialog box.
See [ME] mixed postestimation for features available after estimation.

Menu

Statistics > Multilevel mixed-effects models > Linear regression

Description

mixed fits linear mixed-effects models. These models are also known as multilevel models or hierarchical linear models. The overall error distribution of the linear mixed-effects model is assumed to be Gaussian, and heteroskedasticity and correlations within lowest-level groups also may be modeled.

Options

    +-------+
----+ Model +------------------------------------------------------------

noconstant suppresses the constant (intercept) term and may be specified for the fixed-effects equation and for any of or all the random-effects equations.

covariance(vartype) specifies the structure of the covariance matrix for the random effects and may be specified for each random-effects equation. vartype is one of the following: independent, exchangeable, identity, or unstructured.

independent allows for a distinct variance for each random effect within a random-effects equation and assumes that all covariances are 0.

exchangeable specifies one common variance for all random effects and one common pairwise covariance.

identity is short for "multiple of the identity"; that is, all variances are equal and all covariances are 0.

unstructured allows for all variances and covariances to be distinct. If an equation consists of p random-effects terms, the unstructured covariance matrix will have p(p+1)/2 unique parameters.

covariance(independent) is the default, except when the R. notation is used, in which case covariance(identity) is the default and only covariance(identity) and covariance(exchangeable) are allowed.
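The four structures can be made concrete with a small sketch. The Python below is only an illustration of the definitions above, not mixed's implementation; the function names covariance_matrix and n_parameters are hypothetical, and the single-parameter count for exchangeable with one random effect is an assumption (with p = 1 there is no pairwise covariance to estimate).

```python
def covariance_matrix(vartype, variances, covariances=None):
    """Build the p x p random-effects covariance matrix for one equation.

    variances   : list of p variances; structures with one common variance
                  use only variances[0]
    covariances : a single number for "exchangeable"; a dict mapping (i, j)
                  with i < j to a covariance for "unstructured"
    """
    p = len(variances)
    m = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if vartype == "independent":
                m[i][j] = variances[i] if i == j else 0.0
            elif vartype == "identity":
                m[i][j] = variances[0] if i == j else 0.0
            elif vartype == "exchangeable":
                m[i][j] = variances[0] if i == j else covariances
            elif vartype == "unstructured":
                key = (min(i, j), max(i, j))
                m[i][j] = variances[i] if i == j else covariances[key]
            else:
                raise ValueError("unknown vartype: " + vartype)
    return m


def n_parameters(vartype, p):
    """Number of unique covariance parameters for p random-effects terms."""
    if vartype == "independent":
        return p
    if vartype == "identity":
        return 1
    if vartype == "exchangeable":
        # Assumption: with a single random effect there is no covariance.
        return 2 if p > 1 else 1
    if vartype == "unstructured":
        return p * (p + 1) // 2
    raise ValueError("unknown vartype: " + vartype)


# Three random effects: 3, 2, 1, and 6 parameters, respectively.
counts = [n_parameters(v, 3)
          for v in ("independent", "exchangeable", "identity", "unstructured")]
```

For an equation with a random intercept and two random slopes (p = 3), the unstructured choice therefore estimates six parameters, compared with three under the default independent structure.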

collinear specifies that mixed not omit collinear variables from the random-effects equation. Usually, there is no reason to leave collinear variables in place; in fact, doing so usually causes the estimation to fail because of the matrix singularity caused by the collinearity. However, with certain models (for example, a random-effects model with a full set of contrasts), the variables may be collinear, yet the model is fully identified because of restrictions on the random-effects covariance structure. In such cases, using the collinear option allows the estimation to take place with the random-effects equation intact.

fweight(exp) specifies frequency weights at higher levels in a multilevel model, whereas frequency weights at the first level (the observation level) are specified in the usual manner, for example, [fw=fwtvar1]. exp can be any valid Stata variable, and you can specify fweight() at levels two and higher of a multilevel model. For example, in the two-level model

. mixed fixed_portion [fw = wt1] || school: ... , fweight(wt2) ...

the variable wt1 would hold the first-level (the observation-level) frequency weights, and wt2 would hold the second-level (the school-level) frequency weights.

pweight(exp) specifies sampling weights at higher levels in a multilevel model, whereas sampling weights at the first level (the observation level) are specified in the usual manner, for example, [pw=pwtvar1]. exp can be any valid Stata variable, and you can specify pweight() at levels two and higher of a multilevel model. For example, in the two-level model

. mixed fixed_portion [pw = wt1] || school: ... , pweight(wt2) ...

the variable wt1 would hold the first-level (the observation-level) sampling weights, and wt2 would hold the second-level (the school-level) sampling weights.

See Remarks on using sampling weights below for more information regarding the use of sampling weights in multilevel models.

mle and reml specify the statistical method for fitting the model.

mle, the default, specifies that the model be fit using ML. Options dfmethod(satterthwaite) and dfmethod(kroger) are not supported under ML estimation.

reml specifies that the model be fit using REML, also known as residual maximum likelihood.

dfmethod(df_method) requests that reported hypothesis tests for the fixed effects (coefficients) use a small-sample adjustment. By default, inference is based on a large-sample approximation of the sampling distributions of the test statistics by normal and chi-squared distributions. Caution should be exercised when choosing a DF method; see Small-sample inference for fixed effects in [ME] mixed for details.

When dfmethod(df_method) is specified, the sampling distributions of the test statistics are approximated by a t distribution, according to the requested method for computing the DF. df_method is one of the following: residual, repeated, anova, satterthwaite, or kroger.

residual uses the residual degrees of freedom, n - rank(X), as the DF for all tests of fixed effects. For a linear model without random effects with independent and identically distributed errors, the distributions of the test statistics for fixed effects are t distributions with the residual DF. For other mixed-effects models, this method typically leads to poor approximations of the actual sampling distributions of the test statistics.

repeated uses the repeated-measures ANOVA method for computing the DF. It is used with balanced repeated-measures designs with spherical correlation error structures. It partitions the residual degrees of freedom into the between-subject degrees of freedom and the within-subject degrees of freedom. repeated is supported only with two-level models. For more complex mixed-effects models or with unbalanced data, this method typically leads to poor approximations of the actual sampling distributions of the test statistics.

anova uses the traditional ANOVA method for computing the DF. According to this method, the DF for a test of a fixed effect of a given variable depends on whether that variable is also included in any of the random-effects equations. For traditional ANOVA models with balanced designs, this method provides exact sampling distributions of the test statistics. For more complex mixed-effects models or with unbalanced data, this method typically leads to poor approximations of the actual sampling distributions of the test statistics.

satterthwaite[, dfopts] implements a generalization of the Satterthwaite (1946) approximation of the unknown sampling distributions of test statistics for complex linear mixed-effects models. This method is supported only with REML estimation.

kroger[, dfopts] implements the Kenward and Roger (1997) method, which is designed to approximate unknown sampling distributions of test statistics for complex linear mixed-effects models. This method is supported only with REML estimation.

dfopts is either eim or oim.

eim specifies that the expected information matrix be used to compute Satterthwaite or Kenward-Roger degrees of freedom. This is the default.

oim specifies that the observed information matrix be used to compute Satterthwaite or Kenward-Roger degrees of freedom.

Residual, repeated, and ANOVA methods are suitable only when the sampling distributions of test statistics are known to be t or F. This is usually only known for certain classes of linear mixed-effects models with simple covariance structures and when data are balanced. These methods are available with both ML and REML estimation.

For unbalanced data or balanced data with complicated covariance structures, the sampling distributions of the test statistics are unknown and can only be approximated. The Satterthwaite and Kenward-Roger methods provide approximations to the distributions in these cases. According to Schaalje, McBride, and Fellingham (2002), the Kenward-Roger method should, in general, be preferred to the Satterthwaite method. However, there are situations in which the two methods are expected to perform similarly, such as with compound symmetry covariance structures. The Kenward-Roger method is more computationally demanding than the Satterthwaite method. Both methods are available only with REML estimation. See Small-sample inference for fixed effects under Remarks and examples in [ME] mixed for examples and more detailed descriptions of the DF methods.

dfmethod() may not be combined with weighted estimation, the mi estimate prefix, or vce() unless it is the default vce(oim).

pwscale(scale_method) controls how sampling weights (if specified) are scaled in two-level models. scale_method is one of the following: size, effective, or gk.

size specifies that first-level (observation-level) weights be scaled so that they sum to the sample size of their corresponding second-level cluster. Second-level sampling weights are left unchanged.

effective specifies that first-level weights be scaled so that they sum to the effective sample size of their corresponding second-level cluster. Second-level sampling weights are left unchanged.

gk specifies the Graubard and Korn (1996) method. Under this method, second-level weights are set to the cluster averages of the products of the weights at both levels, and first-level weights are then set equal to 1.

pwscale() is supported only with two-level models. See Survey data under Remarks and examples in [ME] mixed for more details on using pwscale(). pwscale() may not be combined with the dfmethod() option.
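The three scaling methods can be sketched for a single second-level cluster as follows. This Python is an illustration of the descriptions above, not mixed's code; the function name pwscale_cluster is hypothetical, and the effective sample size is assumed to be (sum of weights)^2 / (sum of squared weights), a definition the text above does not spell out.

```python
def pwscale_cluster(scale_method, w1, w2):
    """Rescale the sampling weights of one second-level cluster.

    w1 : list of first-level (observation-level) weights in the cluster
    w2 : the cluster's second-level weight
    Returns (rescaled first-level weights, rescaled second-level weight).
    """
    n = len(w1)
    total = sum(w1)
    if scale_method == "size":
        # First-level weights sum to the cluster sample size n.
        factor = n / total
        return [w * factor for w in w1], w2
    if scale_method == "effective":
        # Assumption: effective sample size = total**2 / sum of squares,
        # so each weight is multiplied by total / sum of squares.
        factor = total / sum(w * w for w in w1)
        return [w * factor for w in w1], w2
    if scale_method == "gk":
        # Second-level weight becomes the cluster average of the products
        # of the weights at both levels; first-level weights become 1.
        return [1.0] * n, sum(w * w2 for w in w1) / n
    raise ValueError("unknown scale_method: " + scale_method)


size_w1, size_w2 = pwscale_cluster("size", [1.0, 2.0, 3.0], 10.0)
eff_w1, eff_w2 = pwscale_cluster("effective", [1.0, 2.0, 3.0], 10.0)
gk_w1, gk_w2 = pwscale_cluster("gk", [1.0, 2.0, 3.0], 10.0)
```

With first-level weights 1, 2, and 3, the size method yields weights that sum to 3 (the cluster size), while the effective method yields weights that sum to 36/14, the assumed effective sample size.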

residuals(rspec) specifies the structure of the residual errors within the lowest-level groups (the second level of a multilevel model with the observations comprising the first level) of the linear mixed model. For example, if you are modeling random effects for classes nested within schools, then residuals() refers to the residual variance-covariance structure of the observations within classes, the lowest-level groups. rspec has the following syntax:

restype [, residual_options]

restype is one of the following: independent, exchangeable, ar #, ma #, unstructured, banded #, toeplitz #, or exponential.

independent, the default, specifies that all residuals be independent and identically distributed Gaussian with one common variance. When combined with by(varname), independence is still assumed, but you estimate a distinct variance for each level of varname. Unlike with the structures described below, varname does not need to be constant within groups.

exchangeable estimates two parameters, one common within-group variance and one common pairwise covariance. When combined with by(varname), these two parameters are distinctly estimated for each level of varname. Because you are modeling a within-group covariance, varname must be constant within lowest-level groups.

ar # assumes that within-group errors have an autoregressive (AR) structure of order #; ar 1 is the default. The t(varname) option is required, where varname is an integer-valued time variable used to order the observations within groups and to determine the lags between successive observations. Any nonconsecutive time values will be treated as gaps. For this structure, # + 1 parameters are estimated (# AR coefficients and one overall error variance). restype ar may be combined with by(varname), but varname must be constant within groups.

ma # assumes that within-group errors have a moving average (MA) structure of order #; ma 1 is the default. The t(varname) option is required, where varname is an integer-valued time variable used to order the observations within groups and to determine the lags between successive observations. Any nonconsecutive time values will be treated as gaps. For this structure, # + 1 parameters are estimated (# MA coefficients and one overall error variance). restype ma may be combined with by(varname), but varname must be constant within groups.

unstructured is the most general structure; it estimates distinct variances for each within-group error and distinct covariances for each within-group error pair. The t(varname) option is required, where varname is a nonnegative-integer-valued variable that identifies the observations within each group. The groups may be unbalanced in that not all levels of t() need to be observed within every group, but you may not have repeated t() values within any particular group. When you have p levels of t(), then p(p+1)/2 parameters are estimated. restype unstructured may be combined with by(varname), but varname must be constant within groups.

banded # is a special case of unstructured that restricts estimation to the covariances within the first # off-diagonals and sets the covariances outside this band to 0. The t(varname) option is required, where varname is a nonnegative-integer-valued variable that identifies the observations within each group. # is an integer between 0 and p-1, where p is the number of levels of t(). By default, # is p-1; that is, all elements of the covariance matrix are estimated. When # is 0, only the diagonal elements of the covariance matrix are estimated. restype banded may be combined with by(varname), but varname must be constant within groups.

toeplitz # assumes that within-group errors have Toeplitz structure of order #, for which correlations are constant with respect to time lags less than or equal to # and are 0 for lags greater than #. The t(varname) option is required, where varname is an integer-valued time variable used to order the observations within groups and to determine the lags between successive observations. # is an integer between 1 and the maximum observed lag (the default). Any nonconsecutive time values will be treated as gaps. For this structure, # + 1 parameters are estimated (# correlations and one overall error variance). restype toeplitz may be combined with by(varname), but varname must be constant within groups.

exponential is a generalization of the AR covariance model that allows for unequally spaced and noninteger time values. The t(varname) option is required, where varname is real-valued. For the exponential covariance model, the correlation between two errors is the parameter rho, raised to a power equal to the absolute value of the difference between the t() values for those errors. For this structure, two parameters are estimated (the correlation parameter rho and one overall error variance). restype exponential may be combined with by(varname), but varname must be constant within groups.
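As an illustration of the exponential structure, the sketch below builds the within-group correlation and covariance matrices implied by the description above. It is written in Python only to make the formula concrete; the function names and the example values of rho, the error variance, and the time points are hypothetical.

```python
def exponential_corr(times, rho):
    """Correlation matrix with entries rho ** |t_i - t_j|."""
    return [[rho ** abs(ti - tj) for tj in times] for ti in times]


def exponential_cov(times, rho, sigma2):
    """Covariance matrix: one overall error variance times the correlation."""
    return [[sigma2 * r for r in row] for row in exponential_corr(times, rho)]


# Unequally spaced time values 0, 1, and 3 with rho = 0.5.
corr = exponential_corr([0, 1, 3], 0.5)
cov = exponential_cov([0, 1, 3], 0.5, 2.0)
```

Here the errors at times 0 and 3 have correlation 0.5 ** 3 = 0.125, whereas an AR structure of order 1 on consecutive integer times would treat the missing time 2 as a gap.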

residual_options are by(varname) and t(varname).

by(varname) is for use within the residuals() option and specifies that a set of distinct residual-error parameters be estimated for each level of varname. In other words, you use by() to model heteroskedasticity.

t(varname) is for use within the residuals() option to specify a time variable for the ar, ma, toeplitz, and exponential structures, or to identify the observations when restype is unstructured or banded.
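The parameter counts stated above for each restype can be collected in one place. The Python sketch below is a bookkeeping aid under stated assumptions, not part of mixed: the function name residual_param_count is hypothetical, the count for banded # is derived here as the number of elements on the main diagonal and the first # off-diagonals of a p x p symmetric matrix, and multiplying by the number of by() levels reflects the statement that a distinct set of parameters is estimated for each level.

```python
def residual_param_count(restype, order=None, p=None, by_levels=1):
    """Residual-error parameters implied by residuals(restype ...).

    order     : the # in ar #, ma #, toeplitz #, or banded #
    p         : number of levels of t() (unstructured and banded only)
    by_levels : number of levels of the by() variable, 1 if by() is unused
    """
    if restype == "independent":
        count = 1                      # one common variance
    elif restype in ("exchangeable", "exponential"):
        count = 2                      # variance plus covariance or rho
    elif restype in ("ar", "ma"):
        count = (1 if order is None else order) + 1
    elif restype == "toeplitz":
        if order is None:
            # The default order is the maximum observed lag (data dependent).
            raise ValueError("toeplitz requires an explicit order here")
        count = order + 1
    elif restype in ("unstructured", "banded"):
        if p is None:
            raise ValueError(restype + " requires p, the number of t() levels")
        band = p - 1 if (restype == "unstructured" or order is None) else order
        # Derived: diagonal k of a p x p symmetric matrix has p - k elements.
        count = sum(p - k for k in range(band + 1))
    else:
        raise ValueError("unknown restype: " + restype)
    return count * by_levels


ma2 = residual_param_count("ma", order=2)                 # MA 2 errors
banded0 = residual_param_count("banded", order=0, p=4)    # diagonal only
het = residual_param_count("independent", by_levels=2)    # by() with 2 levels
```

Note that banded with the default # of p-1 gives the same count as unstructured, p(p+1)/2, which agrees with the description of banded as a special case of unstructured.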

    +-----------+
----+ SE/Robust +--------------------------------------------------------

vce(vcetype) specifies the type of standard error reported, which includes types that are derived from asymptotic theory (oim), that are robust to some kinds of misspecification (robust), and that allow for intragroup correlation (cluster clustvar); see [R] vce_option. If vce(robust) is specified, robust variances are clustered at the highest level in the multilevel model.

vce(robust) and vce(cluster clustvar) are not supported with REML estimation. Only vce(oim) is allowed in combination with dfmethod().

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

variance, the default, displays the random-effects and residual-error parameter estimates as variances and covariances.

stddeviations displays the random-effects and residual-error parameter estimates as standard deviations and correlations.

dftable(dftable) specifies the contents of the fixed-effects table for small-sample inference when dfmethod() is used during estimation. dftable is one of the following: default, ci, or pvalue.

default displays the default standard fixed-effects table that contains test statistics, p-values, and confidence intervals.

ci displays the fixed-effects table in which the columns containing statistics and p-values are replaced with a column containing coefficient-specific DFs. Confidence intervals are also displayed.

pvalue displays the fixed-effects table that includes a column containing DFs with the standard columns containing test statistics and p-values. Confidence intervals are not displayed.

noretable suppresses the random-effects table from the output.

nofetable suppresses the fixed-effects table from the output.

estmetric displays all parameter estimates in one table using the metric in which they are stored in e(b). The results are stored in the same metric regardless of the parameterization of the variance components, matsqrt or matlog, used at estimation time. Random-effects parameter estimates are stored as log-standard deviations and hyperbolic arctangents of correlations, with equation names that organize them by model level. Residual-variance parameter estimates are stored as log-standard deviations and, when applicable, as hyperbolic arctangents of correlations. Note that fixed-effects estimates are always stored and displayed in the same metric.
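Because the stored metric uses log-standard deviations and hyperbolic arctangents of correlations, values shown with estmetric can be converted back by hand. The Python sketch below shows the inverse transformations only; the function names are hypothetical, and how the individual entries of e(b) are named and ordered is not covered here.

```python
import math


def sd_from_lnsd(lnsd):
    """Standard deviation from a stored log-standard deviation."""
    return math.exp(lnsd)


def var_from_lnsd(lnsd):
    """Variance from a stored log-standard deviation."""
    return math.exp(2.0 * lnsd)


def corr_from_atanh(z):
    """Correlation from a stored hyperbolic arctangent of a correlation."""
    return math.tanh(z)


def cov_from_stored(lnsd1, lnsd2, z):
    """Covariance from two log-standard deviations and an atanh correlation."""
    return math.tanh(z) * math.exp(lnsd1) * math.exp(lnsd2)


# A stored value of ln(2) corresponds to a standard deviation of 2
# and a variance of 4.
example_sd = sd_from_lnsd(math.log(2.0))
example_var = var_from_lnsd(math.log(2.0))
```

Both transformations keep the displayed quantities in range: the exponential always returns a positive standard deviation, and the hyperbolic tangent always returns a correlation strictly between -1 and 1.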

noheader suppresses the output header, either at estimation or upon replay.

nogroup suppresses the display of group summary information (number of groups, average group size, minimum, and maximum) from the output header.

nostderr prevents mixed from calculating standard errors for the estimated random-effects parameters, although standard errors are still provided for the fixed-effects parameters. Specifying this option will speed up computation times. nostderr is available only when residuals are modeled as independent with constant variance.

display_options: noci, nopvalues, noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.

    +------------+
----+ EM options +-------------------------------------------------------

These options control the expectation-maximization (EM) iterations that take place before estimation switches to a gradient-based method. When residuals are modeled as independent with constant variance, EM will either converge to the solution or bring parameter estimates close to the solution. For other residual structures or for weighted estimation, EM is used to obtain starting values.

emiterate(#) specifies the number of EM iterations to perform. The default is emiterate(20).

emtolerance(#) specifies the convergence tolerance for the EM algorithm. The default is emtolerance(1e-10). EM iterations will be halted once the log (restricted) likelihood changes by a relative amount less than #. At that point, optimization switches to a gradient-based method, unless emonly is specified, in which case maximization stops.

emonly specifies that the likelihood be maximized exclusively using EM. The advantage of specifying emonly is that EM iterations are typically much faster than those for gradient-based methods. The disadvantages are that EM iterations can be slow to converge (if at all) and that EM provides no facility for estimating standard errors for the random-effects parameters. emonly is available only with unweighted estimation and when residuals are modeled as independent with constant variance.

emlog specifies that the EM iteration log be shown. The EM iteration log is, by default, not displayed unless the emonly option is specified.

emdots specifies that the EM iterations be shown as dots. This option can be convenient because the EM algorithm may require many iterations to converge.
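The interplay of emiterate() and emtolerance() described above can be sketched as a simple stopping rule. This Python is a hedged illustration, not mixed's algorithm: the function run_em and its step argument are hypothetical, and measuring the relative change as |new - old| / |old| is an assumption about what "changes by a relative amount" means.

```python
def run_em(step, loglik0, emiterate=20, emtolerance=1e-10):
    """Run EM updates until the tolerance is met or iterations run out.

    step    : callable performing one EM update and returning the new
              log (restricted) likelihood (hypothetical interface)
    loglik0 : log likelihood at the starting values
    Returns (iterations used, last log likelihood, converged flag).
    """
    ll_old = loglik0
    for iteration in range(1, emiterate + 1):
        ll_new = step()
        # Assumption: relative change is measured against the old value.
        denom = abs(ll_old) if ll_old != 0 else 1.0
        rel_change = abs(ll_new - ll_old) / denom
        ll_old = ll_new
        if rel_change < emtolerance:
            return iteration, ll_new, True
    return emiterate, ll_old, False


# Toy sequence of log likelihoods standing in for real EM updates.
values = iter([-100.0, -90.0, -90.0])
result = run_em(lambda: next(values), -200.0)
```

In the toy run, the third update leaves the log likelihood unchanged, so the loop stops after three iterations. When the loop instead ends without meeting the tolerance, the text above says that optimization continues with a gradient-based method unless emonly is specified.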

    +--------------+
----+ Maximization +-----------------------------------------------------

maximize_options: difficult, technique(algorithm_spec), iterate(#), [no]log, trace, gradient, showstep, hessian, showtolerance, tolerance(#), ltolerance(#), nrtolerance(#), and nonrtolerance; see [R] maximize. Those that require special mention for mixed are listed below.

For the technique() option, the default is technique(nr). The bhhh algorithm may not be specified.

matsqrt (the default), during optimization, parameterizes variance components by using the matrix square roots of the variance-covariance matrices formed by these components at each model level.

matlog, during optimization, parameterizes variance components by using the matrix logarithms of the variance-covariance matrices formed by these components at each model level.

The matsqrt parameterization ensures that variance-covariance matrices are positive semidefinite, while matlog ensures matrices that are positive definite. For most problems, the matrix square root is more stable near the boundary of the parameter space. However, if convergence is problematic, one option may be to try the alternate matlog parameterization. When convergence is not an issue, both parameterizations yield equivalent results.

The following options are available with mixed but are not shown in the dialog box:

small replays previously obtained small-sample results. This option is available only upon replay and requires that the dfmethod() option be used during estimation. small is equivalent to dftable(default) upon replay.

coeflegend; see [R] estimation options.

Remarks

Remarks are presented under the following headings:

    Remarks on specifying random-effects equations
    Remarks on using sampling weights
    Remarks on small-sample inference for fixed effects

Remarks on specifying random-effects equations

Mixed models consist of fixed effects and random effects. The fixed effects are specified as regression parameters in a manner similar to most other Stata estimation commands, that is, as a dependent variable followed by a set of regressors. The random-effects portion of the model is specified by first considering the grouping structure of the data. For example, if random effects are to vary according to variable school, then the call to mixed would be of the form

. mixed fixed_portion || school: ... , options

The variable lists that make up each equation describe how the random effects enter into the model, either as random intercepts (constant term) or as random coefficients on regressors in the data. One may also specify the variance-covariance structure of the within-equation random effects, according to the four available structures described above. For example,

. mixed f_p || school: z1, covariance(unstructured) options

will fit a model with a random intercept and random slope for variable z1 and treat the variance-covariance structure of these two random effects as unstructured.

If the data are organized by a series of nested groups, for example, classes within schools, then the random-effects structure is specified by a series of equations, each separated by ||. The order of nesting proceeds from left to right. For our example, this would mean that an equation for schools would be specified first, followed by an equation for classes. As an example, consider

. mixed f_p || school: z1, cov(un) || class: z1 z2 z3, nocons cov(ex) options

where variables school and class identify the schools and classes within schools, respectively. This model contains a random intercept and random coefficient on z1 at the school level and has random coefficients on variables z1, z2, and z3 at the class level. The covariance structure for the random effects at the class level is exchangeable, meaning that the random effects share a common variance and common pairwise covariance.

Group variables may be repeated, allowing for more general covariance structures to be constructed as block-diagonal matrices based on the four original structures. Consider

. mixed f_p || school: z1 z2, nocons cov(id) || school: z3 z4, nocons cov(un) options

which specifies four random coefficients at the school level. The variance-covariance matrix of the random effects is the 4 x 4 matrix where the upper 2 x 2 diagonal block is a multiple of the identity matrix and the lower 2 x 2 diagonal block is unstructured. In effect, the coefficients on z1 and z2 are constrained to be independent and share a common variance. The coefficients on z3 and z4 each have a distinct variance and a variance distinct from that of the coefficients on z1 and z2. They are also allowed to be correlated, yet they are independent from the coefficients on z1 and z2.
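The block-diagonal structure described above can be written out explicitly. The Python sketch below assembles the 4 x 4 matrix from its two 2 x 2 blocks; the helper block_diag and all numeric values are made up for illustration and are not estimates from any model.

```python
def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(block)
    return out


# cov(id) block for z1 and z2: one common variance, zero covariance.
identity_block = [[1.5, 0.0],
                  [0.0, 1.5]]
# cov(un) block for z3 and z4: distinct variances and a covariance.
unstructured_block = [[2.0, 0.3],
                      [0.3, 0.8]]

G = block_diag(identity_block, unstructured_block)
```

The zeros outside the two blocks express that the coefficients on z3 and z4 are independent of those on z1 and z2, and the whole matrix has 1 + 3 = 4 free parameters.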

For mixed models with no nested grouping structure, thinking of the entire estimation data as one group is convenient. Toward this end, mixed allows the special group designation _all. mixed also allows the R.varname notation, which is shorthand for describing the levels of varname as a series of indicator variables. See Crossed-effects models in [ME] mixed for more details.

Remarks on using sampling weights

Sampling weights are treated differently in multilevel models than they are in standard models such as OLS regression. In a multilevel model, observation-level weights are not indicative of overall inclusion. Instead, they indicate inclusion conditional on the corresponding cluster being included at the next-highest level of sampling.

For example, if you include only observation-level weights in a two-level model, mixed will assume sampling with equal probabilities at level two, and this may or may not be what you intended. If the sampling at level two is weighted, then including only level-one weights can lead to biased results even if weighting at level two has been incorporated into the level-one weight variable. For example, it is a common practice to multiply conditional weights from multiple levels into one overall weight. By contrast, weighted multilevel analysis requires the component weights from each level of sampling.

Even if you specify sampling weights at all model levels, the scale of sampling weights at lower levels can affect your estimated parameters in a multilevel model. That is, not only do the relative sizes of the weights at lower levels matter, the scale of these weights matters also. To deal with this, mixed has the pwscale() option for rescaling weights in two-level models; see above for more information on pwscale(). Three scaling methods are offered, with each method known to perform well under certain data situations and posited models.

In general, exercise caution when using sampling weights with mixed; see Survey data in [ME] mixed for more information.

Remarks on small-sample inference for fixed effects

By default, mixed performs large-sample inference for fixed effects using asymptotic normal and chi-squared distributions. These large-sample approximations may not be appropriate in small samples, and t and F distributions may provide better approximations. You can specify the dfmethod() option to request small-sample inference for fixed effects. mixed, dfmethod() uses a t distribution for one-hypothesis tests and an F distribution for multiple-hypotheses tests for inference about fixed effects. It also provides five different methods for calculating the DF: residual, repeated, anova, satterthwaite, and kroger. See Small-sample inference for fixed effects in [ME] mixed for more information.

Examples

    ---------------------------------------------------------------------------
    Setup
        . webuse nlswork

    Random-intercept model, analogous to xtreg
        . mixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure || id:

    Random-intercept and random-slope (coefficient) model
        . mixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure || id: tenure

    Random-intercept and random-slope (coefficient) model, correlated random effects
        . mixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure || id: tenure, cov(unstruct)

    ---------------------------------------------------------------------------
    Setup
        . webuse pig

    Two-level model
        . mixed weight week || id:

    Two-level model with robust standard errors
        . mixed weight week || id:, vce(robust)

    ---------------------------------------------------------------------------
    Setup
        . webuse productivity

    Three-level nested model, observations nested within state nested within region, fit by maximum likelihood
        . mixed gsp private emp hwy water other unemp || region: || state:, mle

    Three-level nested random interactions model with ANOVA DF
        . mixed gsp private emp hwy water other unemp || region:water || state:other, dfmethod(anova)

    ---------------------------------------------------------------------------
    Setup
        . webuse pig

    Two-way crossed random effects
        . mixed weight week || _all: R.id || _all: R.week

    ---------------------------------------------------------------------------
    Setup
        . webuse ovary

    Linear mixed model with MA 2 errors
        . mixed follicles sin1 cos1 || mare: sin1, residuals(ma 2, t(time))

    ---------------------------------------------------------------------------
    Setup
        . webuse childweight

    Linear mixed model with heteroskedastic error variances
        . mixed weight age || id:age, residuals(independent, by(girl))

    ---------------------------------------------------------------------------
    Setup
        . webuse pig

    Random-intercept and random-slope model with Kenward-Roger DF
        . mixed weight week || id:week, reml dfmethod(kroger)

    Display degrees-of-freedom table containing p-values
        . mixed, dftable(pvalue)

    Display degrees-of-freedom table containing confidence intervals
        . mixed, dftable(ci)

    ---------------------------------------------------------------------------
    Setup
        . webuse t43

    Repeated-measures model with the repeated DF
        . mixed score i.drug || person:, reml dfmethod(repeated)

    Replay large-sample results
        . mixed

    Replay small-sample results using the repeated DF
        . mixed, small
    ---------------------------------------------------------------------------

Stored results

mixed stores the following in e():

    Scalars
      e(N)                  number of observations
      e(k)                  number of parameters
      e(k_f)                number of fixed-effects parameters
      e(k_r)                number of random-effects parameters
      e(k_rs)               number of variances
      e(k_rc)               number of covariances
      e(k_res)              number of residual-error parameters
      e(N_clust)            number of clusters
      e(nrgroups)           number of residual-error by() groups
      e(ar_p)               AR order of residual errors, if specified
      e(ma_q)               MA order of residual errors, if specified
      e(res_order)          order of residual-error structure, if appropriate
      e(df_m)               model degrees of freedom
      e(small)              1 if dfmethod() option specified, 0 otherwise
      e(F)                  overall F test statistic when dfmethod() is specified
      e(ddf_m)              model DDF
      e(df_max)             maximum DF
      e(df_avg)             average DF
      e(df_min)             minimum DF
      e(ll)                 log (restricted) likelihood
      e(chi2)               chi-squared
      e(p)                  p-value for model test
      e(ll_c)               log likelihood, comparison model
      e(chi2_c)             chi-squared, comparison test
      e(df_c)               degrees of freedom, comparison test
      e(p_c)                p-value for comparison test
      e(rank)               rank of e(V)
      e(ic)                 number of iterations
      e(rc)                 return code
      e(converged)          1 if converged, 0 otherwise

    Macros
      e(cmd)                mixed
      e(cmdline)            command as typed
      e(depvar)             name of dependent variable
      e(wtype)              weight type (first-level weights)
      e(wexp)               weight expression (first-level weights)
      e(fweightk)           fweight variable for kth highest level, if specified
      e(pweightk)           pweight variable for kth highest level, if specified
      e(ivars)              grouping variables
      e(title)              title in estimation output
      e(redim)              random-effects dimensions
      e(vartypes)           variance-structure types
      e(revars)             random-effects covariates
      e(resopt)             residuals() specification, as typed
      e(rstructure)         residual-error structure
      e(rstructlab)         residual-error structure output label
      e(rbyvar)             residual-error by() variable, if specified
      e(rglabels)           residual-error by() group labels
      e(pwscale)            sampling-weight scaling method
      e(timevar)            residual-error t() variable, if specified
      e(dfmethod)           DF method specified in dfmethod()
      e(dftitle)            title for DF method
      e(chi2type)           Wald; type of model chi-squared test
      e(clustvar)           name of cluster variable
      e(vce)                vcetype specified in vce()
      e(vcetype)            title used to label Std. Err.
      e(method)             ML or REML
      e(opt)                type of optimization
      e(optmetric)          matsqrt or matlog; random-effects matrix parameterization
      e(emonly)             emonly, if specified
      e(ml_method)          type of ml method
      e(technique)          maximization technique
      e(datasignature)      the checksum
      e(datasignaturevars)  variables used in calculation of checksum
      e(properties)         b V
      e(estat_cmd)          program used to implement estat
      e(predict)            program used to implement predict
      e(marginswtype)       weight type for margins
      e(marginswexp)        weight expression for margins
      e(asbalanced)         factor variables fvset as asbalanced
      e(asobserved)         factor variables fvset as asobserved

    Matrices
      e(b)                  coefficient vector
      e(N_g)                group counts
      e(g_min)              group-size minimums
      e(g_avg)              group-size averages
      e(g_max)              group-size maximums
      e(tmap)               ID mapping for unstructured residual errors
      e(V)                  variance-covariance matrix of the estimators
      e(V_modelbased)       model-based variance
      e(df)                 parameter-specific DF for fixed effects
      e(V_df)               variance-covariance matrix of the estimators when dfmethod(kroger) is specified

    Functions
      e(sample)             marks estimation sample
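The stored results listed above can be inspected after estimation. The following is a brief sketch that refits the Kenward-Roger model from the Examples section and then displays a few scalars, macros, and matrices; which results to inspect is an arbitrary choice made here for illustration.

```stata
* Refit the random-slope model with Kenward-Roger DF
webuse pig, clear
mixed weight week || id:week, reml dfmethod(kroger)

* Scalars: number of observations, log restricted likelihood, small-sample flag
display e(N)
display e(ll)
display e(small)

* Macros: estimation method and DF method
display "`e(method)'"
display "`e(dfmethod)'"

* Matrices: parameter-specific DF for fixed effects and the
* variance-covariance matrix stored when dfmethod(kroger) is specified
matrix list e(df)
matrix list e(V_df)

* Function: count the observations in the estimation sample
count if e(sample)
```

Typing ereturn list after estimation shows everything currently stored in e().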

References

Graubard, B. I., and E. L. Korn. 1996. Modelling the sampling design in the analysis of health surveys. Statistical Methods in Medical Research 5: 263-281.

Kenward, M. G., and J. H. Roger. 1997. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53: 983-997.

Satterthwaite, F. E. 1946. An approximate distribution of estimates of variance components. Biometrics Bulletin 2: 110-114.

Schaalje, G. B., J. B. McBride, and G. W. Fellingham. 2002. Adequacy of approximations to distributions of test statistics in complex mixed linear models. Journal of Agricultural, Biological, and Environmental Statistics 7: 512-524.


© Copyright 1996–2018 StataCorp LLC