Stata 15 help for mi impute chained

[MI] mi impute chained -- Impute missing values using chained equations

Syntax

Default specification of prediction equations, basic syntax

mi impute chained (uvmethod) ivars [= indepvars] [if] [weight] [, impute_options options]

Default specification of prediction equations, full syntax

mi impute chained lhs [= indepvars] [if] [weight] [, impute_options options]

Custom specification of prediction equations

mi impute chained lhsc [= indepvars] [if] [weight] [, impute_options options]

where lhs is lhs_spec [lhs_spec [...]] and lhs_spec is

(uvmethod [if] [, uvspec_options]) ivars

lhsc is lhsc_spec [lhsc_spec [...]] and lhsc_spec is

(uvmethod [if] [, include(xspec) omit(varlist) noimputed uvspec_options]) ivars

ivars (or newivar if uvmethod is intreg) are the names of the imputation variables.

uvspec_options are ascontinuous, noisily, and the method-specific options as described in the manual entry for each univariate imputation method.

The include(), omit(), and noimputed options allow you to customize the default prediction equations.

    uvmethod     Description
    -------------------------------------------------------------------------
    regress      linear regression for a continuous variable;
                   [MI] mi impute regress
    pmm          predictive mean matching for a continuous variable;
                   [MI] mi impute pmm
    truncreg     truncated regression for a continuous variable with a
                   restricted range; [MI] mi impute truncreg
    intreg       interval regression for a continuous partially observed
                   (censored) variable; [MI] mi impute intreg
    logit        logistic regression for a binary variable;
                   [MI] mi impute logit
    ologit       ordered logistic regression for an ordinal variable;
                   [MI] mi impute ologit
    mlogit       multinomial logistic regression for a nominal variable;
                   [MI] mi impute mlogit
    poisson      Poisson regression for a count variable;
                   [MI] mi impute poisson
    nbreg        negative binomial regression for an overdispersed count
                   variable; [MI] mi impute nbreg
    -------------------------------------------------------------------------

    options             Description
    -------------------------------------------------------------------------
    MICE options
      burnin(#)         specify number of iterations for the burn-in period;
                          default is burnin(10)
      chainonly         perform chained iterations for the length of the
                          burn-in period without creating imputations in
                          the data
      augment           perform augmented regression in the presence of
                          perfect prediction for all categorical imputation
                          variables
      noimputed         do not include imputation variables in any
                          prediction equation
      bootstrap         estimate model parameters using sampling with
                          replacement
      savetrace(...)    save summaries of imputed values from each iteration
                          in filename.dta

    Reporting
      dryrun            show conditional specifications without imputing data
      report            show report about each conditional specification
      chaindots         display dots as chained iterations are performed
      showevery(#)      display intermediate results from every #th iteration
      showiter(numlist) display intermediate results from every iteration in
                          numlist

    Advanced
      orderasis         impute variables in the specified order
      nomonotone        impute using chained equations even when variables
                          follow a monotone-missing pattern; default is to
                          use monotone method
      nomonotonechk     do not check whether variables follow a
                          monotone-missing pattern
    -------------------------------------------------------------------------
    You must mi set your data before using mi impute chained; see [MI] mi set.
    You must mi register ivars as imputed before using mi impute chained; see [MI] mi set.
    indepvars may contain factor variables; see fvvarlist.
    fweights, aweights (regress, pmm, truncreg, and intreg only), iweights, and pweights are allowed; see weight.

Menu

Statistics > Multiple imputation

Description

mi impute chained fills in missing values in multiple variables iteratively by using chained equations, a sequence of univariate imputation methods with fully conditional specification (FCS) of prediction equations. It accommodates arbitrary missing-value patterns. You can perform separate imputations on different subsets of the data by specifying the by() option. You can also account for frequency, analytic (with continuous variables only), importance, and sampling weights.

Options

        +------+
    ----+ Main +-------------------------------------------------------------

add(), replace, rseed(), double, by(); see [MI] mi impute.

The following options appear on a Specification dialog that appears when you click on the Create ... button on the Main tab. The include(), omit(), and noimputed options allow you to customize the default prediction equations.

include(xspec) specifies that xspec be included in prediction equations of all imputation variables corresponding to the current left-hand-side specification lhsc_spec. xspec includes complete variables and expressions of imputation variables bound in parentheses. If the noimputed option is specified within lhsc_spec or with mi impute chained, then xspec may also include imputation variables. xspec may contain factor variables; see fvvarlist.

omit(varlist) specifies that varlist be omitted from the prediction equations of all imputation variables corresponding to the current left-hand-side specification lhsc_spec. varlist may include complete variables or imputation variables. varlist may contain factor variables; see fvvarlist. In omit(), you should list variables to be omitted exactly as they appear in the prediction equation (abbreviations are allowed). For example, if variable x1 is listed as a factor variable, use omit(i.x1) to omit it from the prediction equation.

noimputed specifies that no imputation variables automatically be included in prediction equations of imputation variables corresponding to the current uvmethod.

uvspec_options are options specified within each univariate imputation method, uvmethod. uvspec_options include ascontinuous, noisily, and the method-specific options as described in the manual entry for each univariate imputation method.

ascontinuous specifies that categorical imputation variables corresponding to the current uvmethod be included as continuous in all prediction equations. This option is only allowed when uvmethod is logit, ologit, or mlogit.

noisily specifies that the output from the current univariate model fit to the observed data be displayed. This option is useful in combination with the showevery(#) or showiter(numlist) option to display results from a particular univariate imputation model for specific iterations.

        +--------------+
    ----+ MICE options +-----------------------------------------------------

burnin(#) specifies the number of iterations for the burn-in period for each chain (one chain per imputation). The default is burnin(10). This option specifies the number of iterations necessary for a chain to reach approximate stationarity or, equivalently, to converge to a stationary distribution. The required length of the burn-in period will depend on the starting values used and the missing-data patterns observed in the data. It is important to examine the chain for convergence to determine an adequate length of the burn-in period prior to obtaining imputations; see Convergence of MICE under Remarks and examples in [MI] mi impute chained. The provided default may be sufficient in many cases, but you are responsible for determining that sufficient iterations are performed.

chainonly specifies that mi impute chained perform chained iterations for the length of the burn-in period and then stop. This option is useful in combination with savetrace() to examine the convergence of the method prior to imputation. No imputations are created when chainonly is specified, so add() or replace is not required with mi impute chained, chainonly and they are ignored if specified.

augment specifies that augmented regression be performed if perfect prediction is detected. By default, an error is issued when perfect prediction is detected. The idea behind the augmented-regression approach is to add a few observations with small weights to the data during estimation to avoid perfect prediction. See The issue of perfect prediction during imputation of categorical data under Remarks and examples in [MI] mi impute for more information. augment is not allowed with importance weights. This option is equivalent to specifying augment within univariate specifications of all categorical imputation methods: logit, ologit, and mlogit.

noimputed specifies that no imputation variables automatically be included in any of the prediction equations. This option is seldom used. This option is convenient if you wish to use different sets of imputation variables in all prediction equations. It is equivalent to specifying noimputed within all univariate specifications.

bootstrap specifies that posterior estimates of model parameters be obtained using sampling with replacement; that is, posterior estimates are estimated from a bootstrap sample. The default is to sample the estimates from the posterior distribution of model parameters or from the large-sample normal approximation of the posterior distribution. This option is useful when asymptotic normality of parameter estimates is suspect. This option is equivalent to specifying bootstrap within all univariate specifications.

savetrace(filename[, traceopts]) specifies to save the means and standard deviations of imputed values from each iteration to a Stata dataset called filename.dta. If the file already exists, the replace suboption specifies to overwrite the existing file. savetrace() is useful for monitoring convergence of the chained algorithm. This option cannot be combined with by().

traceopts are replace, double, and detail.

replace indicates that filename.dta be overwritten if it exists.

double specifies that the variables be stored as doubles, meaning 8-byte reals. By default, they are stored as floats, meaning 4-byte reals. See [D] data types.

detail specifies that additional summaries of imputed values including the smallest and the largest values and the 25th, 50th, and 75th percentiles are saved in filename.dta.
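The chainonly, burnin(), and savetrace() options described above can be combined to inspect convergence before any imputations are created. The following is a minimal sketch, not a definitive recipe: the file name impstats, the seed, and the burn-in length of 100 are arbitrary choices, and the trace variable names iter, bmi_mean, and age_mean are assumptions about how the saved dataset is laid out (run describe on the file to confirm them).

```stata
* Run the chain only, saving per-iteration summaries; no imputations are added
webuse mheart8s0, clear
mi impute chained (regress) bmi age = attack smokes hsgrad female, ///
    chainonly burnin(100) savetrace(impstats, replace) rseed(1234)

* Load the trace file (this replaces the data in memory) and check its contents
use impstats, clear
describe

* Plot the summaries against the iteration number;
* iter, bmi_mean, and age_mean are assumed variable names
tsset iter
tsline bmi_mean, name(bmi_trace, replace)
tsline age_mean, name(age_trace, replace)
```

A trace that wanders without a visible trend suggests the chosen burn-in length is adequate; a persistent trend suggests a longer burn-in period is needed.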

        +-----------+
    ----+ Reporting +--------------------------------------------------------

dots, noisily, nolegend; see [MI] mi impute. noisily specifies that the output from all univariate conditional models fit to the observed data be displayed. nolegend suppresses all imputation table legends that include a legend with the titles of the univariate imputation methods used, a legend about conditional imputation when conditional() is used within univariate specifications, and group legends when by() is specified.

dryrun specifies to show the conditional specifications that would be used to impute each variable without actually imputing data. This option is recommended for checking specifications of conditional models prior to imputation.

report specifies to show a report about each univariate conditional specification. This option, in combination with dryrun, is recommended for checking specifications of conditional models prior to imputation.
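As an illustration of checking specifications before imputing, the sketch below combines dryrun and report on the mheart8s0 data used in the examples later in this entry. It assumes that add() may be omitted because no imputations are created under dryrun.

```stata
* Display the conditional specifications without imputing any data
webuse mheart8s0, clear
mi impute chained (pmm, knn(5) omit(hsgrad)) bmi (regress) age ///
    = attack smokes hsgrad female, dryrun report
```

The displayed prediction equation for bmi should not contain hsgrad, which confirms that omit() was applied as intended.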

chaindots specifies that all chained iterations be displayed as dots. An x is displayed for every failed iteration.

showevery(#) specifies that intermediate regression output be displayed from every #th iteration. This option requires noisily. If noisily is specified with mi impute chained, then the output from the specified iterations is displayed for all univariate conditional models. If noisily is used within a univariate specification, then the output from the corresponding univariate model from the specified iterations is displayed.

showiter(numlist) specifies that intermediate regression output be displayed for each iteration in numlist. This option requires noisily. If noisily is specified with mi impute chained, then the output from the specified iterations is displayed for all univariate conditional models. If noisily is used within a univariate specification, then the output from the corresponding univariate model from the specified iterations is displayed.
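A short sketch of how noisily and showiter() interact: here noisily is placed inside the specification for bmi only, so regression output is shown for that model alone, and only at iterations 1 and 10. The seed and the choice of iterations are arbitrary.

```stata
* Show the bmi model's output at the first and last burn-in iterations
webuse mheart8s0, clear
mi impute chained (regress, noisily) bmi (regress) age ///
    = attack smokes hsgrad female, add(1) rseed(1234) showiter(1 10)
```

Moving noisily after the comma of mi impute chained itself would instead display the output from every univariate model at those iterations.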

        +----------+
    ----+ Advanced +---------------------------------------------------------

force; see [MI] mi impute.

orderasis requests that the variables be imputed in the specified order. By default, variables are imputed in order from the most observed to the least observed.

nomonotone, a rarely used option, specifies not to use monotone imputation and to proceed with chained iterations even when imputation variables follow a monotone-missing pattern. mi impute chained checks whether imputation variables have a monotone missing-data pattern and, if they do, imputes them using the monotone method (without iteration). If nomonotone is used, mi impute chained imputes variables iteratively even if variables are monotone-missing.

nomonotonechk specifies not to check whether imputation variables follow a monotone-missing pattern. By default, mi impute chained checks whether imputation variables have a monotone missing-data pattern and, if they do, imputes them using the monotone method (without iteration). If nomonotonechk is used, mi impute chained does not check the missing-data pattern and imputes variables iteratively even if variables are monotone-missing. Once imputation variables are established to have an arbitrary missing-data pattern, this option may be used to avoid potentially time-consuming checks; the monotonicity check may be time consuming when a large number of variables is being imputed.
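One possible workflow, sketched under the assumption that the imputation variables in mheart8s0 do not follow a monotone-missing pattern: inspect the nesting of missing values once with mi misstable nested, and then skip the built-in check on subsequent runs.

```stata
* Inspect whether missing values are nested (monotone) across variables
webuse mheart8s0, clear
mi misstable nested

* Once the pattern is known to be arbitrary, skip the monotonicity check
mi impute chained (regress) bmi age = attack smokes hsgrad female, ///
    add(10) rseed(1234) nomonotonechk
```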

The following option is available with mi impute but is not shown in the dialog box:

noupdate; see [MI] noupdate option.

Examples: Default prediction equations

    Setup
        . webuse mheart8s0

    Describe mi data
        . mi describe

    Examine missing-data patterns
        . mi misstable pattern

    Impute bmi and age using linear regression
        . mi impute chained (regress) bmi age = attack smokes hsgrad female, add(10)

    Impute bmi using predictive mean matching and age using linear regression
        . mi impute chained (pmm, knn(5)) bmi (regress) age = attack smokes hsgrad female, replace

Examples: Custom prediction equations

    Setup
        . webuse mheart8s0, clear

    Impute bmi using predictive mean matching and age using linear regression; omit hsgrad from the prediction equation for bmi
        . mi impute chained                                   ///
              (pmm, knn(5) omit(hsgrad)) bmi                  ///
              (regress) age = attack smokes hsgrad female, add(10)

    In the above, impute age using predictive mean matching and include age squared in the prediction equation for bmi
        . mi impute chained                                   ///
              (pmm, knn(5) omit(hsgrad) include((age^2))) bmi ///
              (pmm, knn(5)) age = attack smokes hsgrad female, replace

Examples: Imputing on subsamples

    In the previous example, impute bmi and age separately for males and females; display dots as imputations are performed
        . mi impute chained                                   ///
              (pmm, knn(5) omit(hsgrad) include((age^2))) bmi ///
              (pmm, knn(5)) age = attack smokes hsgrad, replace by(female) dots

Examples: Conditional imputation

    Setup
        . webuse mheart10s0, clear

    Describe mi data
        . mi describe

    Impute bmi and age using predictive mean matching, and smokes and hightar using logistic regression; impute hightar using only observations for which smokes==1
        . mi impute chained                                    ///
              (pmm, knn(5)) bmi                                ///
              (pmm, knn(5)) age                                ///
              (logit, cond(if smokes==1) omit(i.smokes)) hightar ///
              (logit) smokes = attack hsgrad female, add(10)

Stored results

mi impute chained stores the following in r():

    Scalars
      r(M)              total number of imputations
      r(M_add)          number of added imputations
      r(M_update)       number of updated imputations
      r(k_ivars)        number of imputed variables
      r(burnin)         number of burn-in iterations
      r(N_g)            number of imputed groups (1 if by() is not specified)

    Macros
      r(method)         name of imputation method (chained)
      r(ivars)          names of imputation variables
      r(uvmethods)      names of univariate imputation methods
      r(init)           type of initialization
      r(rngstate)       random-number state used
      r(by)             names of variables specified within by()

    Matrices
      r(N)              number of observations in imputation sample in each
                          group (per variable)
      r(N_complete)     number of complete observations in imputation sample
                          in each group (per variable)
      r(N_incomplete)   number of incomplete observations in imputation
                          sample in each group (per variable)
      r(N_imputed)      number of imputed observations in imputation sample
                          in each group (per variable)
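
The stored results listed above can be inspected immediately after imputation, before another r-class command overwrites them. A minimal sketch (the seed and the number of imputations are arbitrary):

```stata
webuse mheart8s0, clear
mi impute chained (regress) bmi age = attack smokes hsgrad female, ///
    add(10) rseed(1234)

* List everything left behind in r()
return list

* Pull out individual scalars, macros, and matrices
display "imputations added:    " r(M_add)
display "burn-in iterations:   " r(burnin)
display "imputation variables: `r(ivars)'"
matrix list r(N_incomplete)
```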

