Stata 15 help for mi estimate

[MI] mi estimate -- Estimation using multiple imputations


Syntax

Compute MI estimates of coefficients by fitting estimation command to mi data

mi estimate [, options] : estimation_command ...

Compute MI estimates of transformed coefficients by fitting estimation command to mi data

mi estimate [spec] [, options] : estimation_command ...

where spec may be one or more terms of the form ([name:] exp). exp is any function of the parameter estimates allowed by nlcom.

options                        Description
-------------------------------------------------------------------------
Options
  nimputations(#)              specify number of imputations to use;
                                 default is to use all existing imputations
  imputations(numlist)         specify which imputations to use
  mcerror                      compute Monte Carlo error estimates
  ufmitest                     perform unrestricted FMI model test
  nosmall                      do not apply small-sample correction to
                                 degrees of freedom
  saving(miestfile[, replace]) save individual estimation results to
                                 miestfile.ster

Tables
  [no]citable                  suppress/display standard estimation table
                                 containing parameter-specific confidence
                                 intervals; default is citable
  dftable                      display degrees-of-freedom table; dftable
                                 implies nocitable
  vartable                     display variance information about
                                 estimates; vartable implies citable
  table_options                control table output
  display_options              control columns and column formats, row
                                 spacing, display of omitted variables and
                                 base and empty cells, and factor-variable
                                 labeling

Reporting
  level(#)                     set confidence level; default is level(95)
  dots                         display dots as estimations are performed
  noisily                      display any output from estimation_command
                                 (and from nlcom if transformations
                                 specified)
  trace                        trace estimation_command (and nlcom if
                                 transformations specified); implies
                                 noisily
  nogroup                      suppress summary about groups displayed for
                                 xt commands
  me_options                   control output from mixed-effects commands

Advanced
  esample(newvar)              store estimation sample in variable newvar;
                                 available only in the flong and flongsep
                                 styles
  errorok                      allow estimation even when
                                 estimation_command (or nlcom) errors out;
                                 such imputations are discarded from the
                                 analysis
  esampvaryok                  allow estimation when estimation sample
                                 varies across imputations
  cmdok                        allow estimation when estimation_command is
                                 not one of the supported estimation
                                 commands

  coeflegend                   display legend instead of statistics
  nowarning                    suppress the warning about varying
                                 estimation sample
  eform_option                 display coefficients table in exponentiated
                                 form
  post                         post estimated coefficients and VCE to e(b)
                                 and e(V)
  noupdate                     do not perform mi update; see [MI] noupdate
                                 option
-------------------------------------------------------------------------
You must mi set your data before using mi estimate; see [MI] mi set.
coeflegend, nowarning, eform_option, post, and noupdate do not appear in
  the dialog box.

table_options                  Description
-------------------------------------------------------------------------
  noheader                     suppress table header(s)
  notable                      suppress table(s)
  nocoef                       suppress table output related to
                                 coefficients
  nocmdlegend                  suppress command legend that appears in the
                                 presence of transformed coefficients when
                                 nocoef is used
  notrcoef                     suppress table output related to
                                 transformed coefficients
  nolegend                     suppress table legend(s)
  nocnsreport                  do not display constraints
-------------------------------------------------------------------------

See [MI] mi estimate postestimation for features available after estimation. mi estimate is its own estimation command. The postestimation features for mi estimate do not include by default the postestimation features for estimation_command. To replay results, type mi estimate without arguments.


Menu

Statistics > Multiple imputation


Description

mi estimate: estimation_command runs estimation_command on the imputed mi data, and adjusts coefficients and standard errors for the variability between imputations according to the combination rules by Rubin (1987).
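As a rough illustration of the combination rules, the total variance can be rebuilt from the within-imputation (W) and between-imputation (B) variance matrices that mi estimate stores in e(), using T = W + (1 + 1/M)B. This is only a sketch: it assumes the mheart1s20 example dataset used in the examples of this entry, and the names M, T, and V are chosen here for illustration.

```stata
webuse mheart1s20, clear
mi estimate: logit attack smokes age bmi hsgrad female

* Rubin's rules: total variance = within + (1 + 1/M) * between
local M = e(M_mi)
matrix T = e(W_mi) + (1 + 1/`M')*e(B_mi)
matrix V = e(V_mi)

* relative difference between the rebuilt and the stored total variance;
* a value near zero indicates the two agree
display mreldif(T, V)
```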


        +---------+
    ----+ Options +----------------------------------------------------------

nimputations(#) specifies that the first # imputations be used; # must satisfy M_min <= # <= M, where M_min = 3 if mcerror is specified and M_min = 2 otherwise. The default is to use all imputations, M. Only one of nimputations() or imputations() may be specified.

imputations(numlist) specifies which imputations to use. The default is to use all of them. numlist must contain at least two numbers. If mcerror is specified, numlist must contain at least three numbers. Only one of nimputations() or imputations() may be specified.
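The two ways of restricting the set of imputations might be used as follows. This is a sketch that assumes the mheart1s20 example dataset, which contains 20 imputations; the particular choices of 5 and 1(2)19 are arbitrary.

```stata
webuse mheart1s20, clear

* use only the first 5 of the 20 imputations
mi estimate, nimputations(5): logit attack smokes age bmi hsgrad female

* use the odd-numbered imputations 1, 3, ..., 19
mi estimate, imputations(1(2)19): logit attack smokes age bmi hsgrad female
```

Because only one of the two options may be specified, they appear here in separate calls.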

mcerror specifies to compute Monte Carlo error (MCE) estimates for the results displayed in the estimation, degrees-of-freedom, and variance-information tables. MCE estimates reflect variability of MI results across repeated uses of the same imputation procedure and are useful for determining an adequate number of imputations to obtain stable MI results; see White, Royston, and Wood (2011) for details and guidelines.

MCE estimates are obtained by applying the jackknife procedure to multiple-imputation results. That is, the jackknife pseudovalues of MI results are obtained by omitting one imputation at a time; see [R] jackknife for details about the jackknife procedure. As such, the MCE computation requires at least three imputations.

If level() is specified during estimation, MCE estimates are obtained for confidence intervals using the specified confidence level instead of using the default 95% confidence level. If any of the options described in [R] eform_option is specified during estimation, MCE estimates for the coefficients, standard errors, and confidence intervals in the exponentiated form are also computed. mcerror can also be used upon replay to display MCE estimates. Otherwise, MCE estimates are not reported upon replay even if they were previously computed.

ufmitest specifies that the unrestricted fraction missing information (FMI) model test be used. The default test performed assumes equal fractions of information missing due to nonresponse for all coefficients. This is equivalent to the assumption that the between-imputation and within-imputation variances are proportional. The unrestricted test may be preferable when this assumption is suspect provided the number of imputations is large relative to the number of estimated coefficients.

nosmall specifies that no small-sample correction be made to the degrees of freedom. The small-sample correction is made by default to estimation commands that account for small samples. If the command stores residual degrees of freedom in e(df_r), individual tests of coefficients (and transformed coefficients) use the small-sample correction of Barnard and Rubin (1999) and the overall model test uses the small-sample correction of Reiter (2007). If the command does not store residual degrees of freedom, the large-sample test is used and the nosmall option has no effect.

saving(miestfile [, replace]) saves estimation results from each model fit in miestfile.ster. The replace suboption specifies to overwrite miestfile.ster if it exists. miestfile.ster can later be used by mi estimate using (see [MI] mi estimate using) to obtain MI estimates of coefficients or of transformed coefficients without refitting the completed-data models. This file is written in the format used by estimates use; see [R] estimates save.
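A possible workflow with saving() is sketched below: fit the completed-data models once, then recombine the saved results later with mi estimate using. The filename miest is hypothetical, and the dataset and transformation are borrowed from Example 5 of this entry.

```stata
webuse mhouses1993s30, clear

* fit once and save the individual estimation results to miest.ster
mi estimate, saving(miest, replace): regress price tax sqft age nfeatures ne custom corner

* later: recombine without refitting, adding a transformed coefficient
mi estimate (ratio: _b[age]/_b[sqft]) using miest
```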

        +--------+
    ----+ Tables +-----------------------------------------------------------

All table options below may be specified at estimation time or when redisplaying previously estimated results. Table options must be specified as options to mi estimate, not to estimation_command.

citable and nocitable specify whether the standard estimation table containing parameter-specific confidence intervals is displayed. The default is citable. nocitable can be used with vartable to suppress the confidence interval table.

dftable displays a table containing parameter-specific degrees of freedom and percentages of increase in standard errors due to nonresponse. dftable implies nocitable.

vartable displays a table reporting variance information about MI estimates. The table contains estimates of within-imputation variances, between-imputation variances, total variances, relative increases in variance due to nonresponse, fractions of information about parameter estimates missing due to nonresponse, and relative efficiencies for using finite M rather than a hypothetically infinite number of imputations. vartable implies citable.

table_options control the appearance of all displayed table output:

noheader suppresses all header information from the output. The table output is still displayed.

notable suppresses all tables from the output. The header information is still displayed.

nocoef suppresses the display of tables containing coefficient estimates. This option affects the table output produced by citable, dftable, and vartable.

nocmdlegend suppresses the table legend showing the specified command line, estimation_command, from the output. This legend appears above the tables containing transformed coefficients (or above the variance-information table if vartable is used) when nocoef is specified.

notrcoef suppresses the display of tables containing estimates of transformed coefficients (if specified). This option affects the table output produced by citable, dftable, and vartable.

nolegend suppresses all table legends from the output.

nocnsreport; see [R] estimation options.

display_options: noci, nopvalues, noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), and sformat(%fmt); see [R] estimation options.

        +-----------+
    ----+ Reporting +--------------------------------------------------------

Reporting options must be specified as options to mi estimate and not as options to estimation_command.

level(#); see [R] estimation options.

dots specifies that dots be displayed as estimations are successfully completed. An x is displayed if the estimation_command returns an error, if the model fails to converge, or if nlcom fails to estimate one of the transformed coefficients specified in spec.

noisily specifies that any output from estimation_command and nlcom, used to obtain the estimates of transformed coefficients if transformations are specified, be displayed.

trace traces the execution of estimation_command and traces nlcom if transformations are specified. trace implies noisily.

nogroup suppresses the display of group summary information (number of groups, average group size, minimum, and maximum) as well as other command-specific information displayed for xt commands; see the list of commands under Panel-data models in [MI] estimation.

me_options: stddeviations, variance, noretable, nofetable, and estmetric. These options are relevant only with the mixed-effects commands meqrlogit (see [ME] meqrlogit), meqrpoisson (see [ME] meqrpoisson), and mixed (see [ME] mixed). See the corresponding mixed-effects commands for more information. The stddeviations option is the default with mi estimate. The estmetric option is implied when vartable or dftable is used.

        +----------+
    ----+ Advanced +---------------------------------------------------------

esample(newvar) creates newvar containing e(sample). This option is useful to identify which observations were used in the estimation, especially when the estimation sample varies across imputations (see Potential problems that can arise when using mi estimate for details). newvar is zero in the original data (m=0) and in any imputations (m>0) in which the estimation failed or that were not used in the computation. esample() may be specified only if the data are flong or flongsep; see [MI] mi convert to convert to one of those styles. The variable created will be super varying and therefore must not be registered; see [MI] mi varying for more explanation. The saved estimation sample newvar may be used later with mi extract (see [MI] mi extract) to set the estimation sample.
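The following sketch shows one way esample() might be used. It assumes the mheart1s20 example dataset, converts it to the flong style first because esample() requires flong or flongsep, and uses the hypothetical variable name touse.

```stata
webuse mheart1s20, clear
mi convert flong, clear

* store the estimation sample of each imputation in touse
mi estimate, esample(touse): logit attack smokes age bmi hsgrad female

* touse is 0 in the original data (m=0); the cross-tabulation shows
* how many observations were used in each imputation
tabulate _mi_m touse
```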

errorok specifies that estimations that fail be skipped and the combined results be based on the successful individual estimation results. The default is that mi estimate stops if an individual estimation fails. If errorok is specified with saving(), all estimation results, including failed, are saved to a file.

esampvaryok allows estimation to continue even if the estimation sample varies across imputations. By default, mi estimate stops if the estimation sample varies. If esampvaryok is specified, results from all imputations are used to compute MI estimates and a warning message is displayed at the bottom of the table. Also see the esample() option. See Potential problems that can arise when using mi estimate for more information.

cmdok allows unsupported estimation commands to be used with mi estimate; see [MI] estimation for a list of supported estimation commands. Alternatively, if you want mi estimate to work with your estimation command, add the property mi to the program properties; see [P] program properties.
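As a minimal sketch of the second approach, a user-written estimation command can declare the mi property in its program definition. The wrapper myreg below is hypothetical and simply passes its arguments to regress; the dataset is the one used in Example 5. Whether a given user-written command produces valid MI results remains the user's responsibility.

```stata
* hypothetical wrapper command; properties(mi) marks it as usable with mi estimate
capture program drop myreg
program define myreg, eclass properties(mi)
    version 15
    regress `0'
end

webuse mhouses1993s30, clear
mi estimate: myreg price tax sqft age

* for a command defined without the mi property, cmdok would be needed instead:
* mi estimate, cmdok: myreg price tax sqft age
```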

The following options are available with mi estimate but are not shown in the dialog box:

coeflegend; see [R] estimation options. coeflegend implies nocitable and cannot be combined with citable or dftable.

nowarning suppresses the warning message at the bottom of table output that occurs if the estimation sample varies and esampvaryok is specified. See Potential problems that can arise when using mi estimate for details.

eform_option; see [R] eform_option. Regardless of the estimation_command specified, mi estimate reports results in the coefficient metric under which the combination rules are applied. You may use the appropriate eform_option to redisplay results in exponentiated form, if desired. If dftable is also specified, the reported degrees of freedom and percentage increases in standard errors are not adjusted and correspond to the original coefficient metric.

post requests that MI estimates of coefficients and their respective VCEs be posted in the usual way. This allows the use of estimation_command-specific postestimation tools with MI estimates. There are issues; see Using the command-specific postestimation tools in [MI] mi estimate postestimation. post may be specified at estimation time or when redisplaying previously estimated results.
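A short sketch of post, assuming the mhouses1993s30 dataset from Example 5: after posting, the MI estimates are available in e(b) and e(V), and e(cmd) reports the name of the estimation command rather than mi estimate.

```stata
webuse mhouses1993s30, clear
mi estimate, post: regress price tax sqft age nfeatures ne custom corner

* e(b) and e(V) now hold the MI estimates and their total variance
matrix list e(b)
matrix list e(V)

* with post, e(cmd) equals e(cmd_mi)
display "`e(cmd)'"
```

The caveats described in [MI] mi estimate postestimation still apply to any command-specific postestimation tool used afterward.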

noupdate in some cases suppresses the automatic mi update this command might perform; see [MI] noupdate option. This option is seldom used.

Example 1

Estimate on completed data using logit
    . webuse mheart1s20
    . mi describe
    . mi estimate, dots: logit attack smokes age bmi hsgrad female

Replay estimation results
    . mi estimate

Display coefficient-specific degrees of freedom
    . mi estimate, dftable

Show coefficient-specific variance information
    . mi estimate, vartable nocitable

Display odds ratios
    . mi estimate, or

Compute Monte Carlo error estimates
    . mi estimate, dots mcerror: logit attack smokes age bmi hsgrad female

Compute Monte Carlo error estimates for odds ratios
    . mi estimate, dots mcerror or: logit attack smokes age bmi hsgrad female

Example 2

Estimate on completed data using stcox
    . webuse mdrugtrs25
    . mi describe
    . mi stset studytime, failure(died)
    . mi estimate, dots: stcox drug age

Redisplay results as hazard ratios
    . mi estimate, hr

Example 3

Estimate on completed data using xtreg
    . webuse mjsps5, clear
    . mi xtset school
    . mi estimate: xtreg math5 math3

Example 4

Estimate on completed data using mixed
    . webuse mjsps5
    . mi estimate, dots: mixed math5 math3 || school:, reml

Redisplay results as variance components
    . mi estimate, variance

Example 5

Estimate on completed data of specified linear regression and additionally specified transformation of those coefficients
    . webuse mhouses1993s30
    . mi estimate (ratio: _b[age]/_b[sqft]): regress price tax sqft age nfeatures ne custom corner

Stored results

mi estimate stores the following in e():

Scalars
  e(df_avg[_Q]_mi)     average degrees of freedom
  e(df_c_mi)           complete degrees of freedom (if originally stored by
                         estimation_command in e(df_r))
  e(df_max[_Q]_mi)     maximum degrees of freedom
  e(df_min[_Q]_mi)     minimum degrees of freedom
  e(df_m_mi)           MI model test numerator (model) degrees of freedom
  e(df_r_mi)           MI model test denominator (residual) degrees of
                         freedom
  e(esampvary_mi)      varying-estimation sample flag (0 or 1)
  e(F_mi)              model test F statistic
  e(k_exp_mi)          number of expressions (transformed coefficients)
  e(M_mi)              number of imputations
  e(N_mi)              number of observations (minimum, if varies)
  e(N_min_mi)          minimum number of observations
  e(N_max_mi)          maximum number of observations
  e(N_g_mi)            number of groups
  e(g_min_mi)          smallest group size
  e(g_avg_mi)          average group size
  e(g_max_mi)          largest group size
  e(p_mi)              MI model test p-value
  e(cilevel_mi)        confidence level used to compute Monte Carlo error
                         estimates of confidence intervals
  e(fmi_max[_Q]_mi)    largest FMI
  e(rvi_avg[_Q]_mi)    average RVI
  e(rvi_avg_F_mi)      average RVI associated with the residual degrees of
                         freedom for model test
  e(ufmi_mi)           1 if unrestricted FMI model test is performed, 0 if
                         equal FMI model test is performed

Macros
  e(mi)                mi
  e(cmdline_mi)        command as typed
  e(prefix_mi)         mi estimate
  e(cmd_mi)            name of estimation_command
  e(cmd)               mi estimate (equals e(cmd_mi) when post is used)
  e(title_mi)          "Multiple-imputation estimates"
  e(wvce_mi)           title used to label within-imputation variance in
                         the table header
  e(modeltest_mi)      title used to label the model test in the table
                         header
  e(dfadjust_mi)       title used to label the degrees-of-freedom
                         adjustment in the table header
  e(expnames_mi)       names of expressions specified in spec
  e(exp#_mi)           expressions of the transformed coefficients
                         specified in spec
  e(rc_mi)             return codes for each imputation
  e(m_mi)              specified imputation numbers
  e(m_est_mi)          imputation numbers used in the computation
  e(names_vvl_mi)      command-specific e() macro names that contents
                         varied across imputations
  e(names_vvm_mi)      command-specific e() matrix names that values
                         varied across imputations (excluding b, V, and
                         Cns)
  e(names_vvs_mi)      command-specific e() scalar names that values
                         varied across imputations

Matrices
  e(b)                 MI estimates of coefficients (equals e(b_mi);
                         stored only if post is used)
  e(V)                 variance-covariance matrix (equals e(V_mi); stored
                         only if post is used)
  e(Cns)               constraint matrix, for constrained estimation only
                         (equals e(Cns_mi); stored only if post is used)
  e(N_g_mi)            group counts
  e(g_min_mi)          group-size minimums
  e(g_avg_mi)          group-size averages
  e(g_max_mi)          group-size maximums
  e(b[_Q]_mi)          MI estimates of coefficients (or transformed
                         coefficients)
  e(V[_Q]_mi)          variance-covariance matrix (total variance)
  e(Cns_mi)            constraint matrix (for constrained estimation only)
  e(W[_Q]_mi)          within-imputation variance matrix
  e(B[_Q]_mi)          between-imputation variance matrix
  e(re[_Q]_mi)         parameter-specific relative efficiencies
  e(rvi[_Q]_mi)        parameter-specific RVIs
  e(fmi[_Q]_mi)        parameter-specific FMIs
  e(df[_Q]_mi)         parameter-specific degrees of freedom
  e(pise[_Q]_mi)       parameter-specific percentage increases in
                         standard errors
  e(vs_names_vs_mi)    values of command-specific e() scalar vs_names that
                         varied across imputations

vs_names include (but are not restricted to) df_r, N, N_strata, N_psu, N_pop, N_sub, N_poststrata, N_stdize, N_subpop, N_over, and converged.

Results N_g_mi, g_min_mi, g_avg_mi, and g_max_mi are stored for panel-data models only. The results are stored as matrices for mixed-effects models and as scalars for other panel-data models.

If transformations are specified, the corresponding estimation results are stored with the _Q_mi suffix, as described above.

Command-specific e() results that remain constant across imputations are also stored. Command-specific results that vary from imputation to imputation are posted as missing, and their names are stored in the corresponding macros e(names_vvl_mi), e(names_vvm_mi), and e(names_vvs_mi). For some command-specific e() scalars (see vs_names above), their values from each imputation are stored in a corresponding matrix with the _vs_mi suffix.
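The stored results can be inspected after estimation in the usual way. The sketch below assumes the mheart1s20 dataset from Example 1 and picks a handful of the results listed above.

```stata
webuse mheart1s20, clear
mi estimate: logit attack smokes age bmi hsgrad female

display e(M_mi)            // number of imputations
display "`e(cmd_mi)'"      // name of estimation_command
display e(df_min_mi)       // minimum degrees of freedom
matrix list e(b_mi)        // MI estimates of coefficients
matrix list e(fmi_mi)      // parameter-specific FMIs
```

Typing ereturn list after mi estimate shows everything that was stored, including the command-specific results that stayed constant across imputations.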


References

Barnard, J., and D. B. Rubin. 1999. Small-sample degrees of freedom with multiple imputation. Biometrika 86: 948-955.

Reiter, J. P. 2007. Small-sample degrees of freedom for multi-component significance tests with multiple imputation for missing data. Biometrika 94: 502-508.

Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

White, I. R., P. Royston, and A. M. Wood. 2011. Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine 30: 377-399.

© Copyright 1996–2018 StataCorp LLC