Stata 15 help for teffects nnmatch

[TE] teffects nnmatch -- Nearest-neighbor matching


teffects nnmatch (ovar omvarlist) (tvar) [if] [in] [weight] [, stat options]

ovar is a binary, count, continuous, fractional, or nonnegative outcome of interest.

omvarlist specifies the covariates in the outcome model.

tvar must contain integer values representing the treatment levels. Only two treatment levels are allowed.

stat                  Description
-------------------------------------------------------------------------
Stat
  ate                 estimate average treatment effect in population; the default
  atet                estimate average treatment effect on the treated
-------------------------------------------------------------------------

options               Description
-------------------------------------------------------------------------
Model
  nneighbor(#)        specify number of matches per observation; default is nneighbor(1)
  biasadj(varlist)    correct for large-sample bias using specified variables
  ematch(varlist)     match exactly on specified variables

SE/Robust
  vce(vcetype)        vcetype may be
                        vce(robust [, nn(#)]); use robust Abadie-Imbens standard errors with # matches
                        vce(iid); use default Abadie-Imbens standard errors

Reporting
  level(#)            set confidence level; default is level(95)
  dmvariables         display names of matching variables
  display_options     control columns and column formats, row spacing, line width,
                        display of omitted variables and base and empty cells, and
                        factor-variable labeling

Advanced
  caliper(#)          specify the maximum distance for which two observations are potential neighbors
  dtolerance(#)       set maximum distance between individuals considered equal
  osample(newvar)     newvar identifies observations that violate the overlap assumption
  control(# | label)  specify the level of tvar that is the control
  tlevel(# | label)   specify the level of tvar that is the treatment
  generate(stub)      generate variables containing the observation numbers of the nearest neighbors
  metric(metric)      select distance metric for covariates

  coeflegend          display legend instead of statistics
-------------------------------------------------------------------------

metric                Description
-------------------------------------------------------------------------
  mahalanobis         inverse sample covariate covariance; the default
  ivariance           inverse diagonal sample covariate covariance
  euclidean           identity matrix
  matname             user-supplied scaling matrix
-------------------------------------------------------------------------

omvarlist may contain factor variables; see fvvarlists.
by and statsby are allowed; see prefix.
fweights are allowed; see weight.
coeflegend does not appear in the dialog box.
See [TE] teffects postestimation for features available after estimation.


Statistics > Treatment effects > Continuous outcomes > Nearest-neighbor matching

Statistics > Treatment effects > Binary outcomes > Nearest-neighbor matching

Statistics > Treatment effects > Count outcomes > Nearest-neighbor matching

Statistics > Treatment effects > Fractional outcomes > Nearest-neighbor matching

Statistics > Treatment effects > Nonnegative outcomes > Nearest-neighbor matching


teffects nnmatch estimates the average treatment effect and average treatment effect on the treated from observational data by nearest-neighbor matching. Nearest-neighbor matching estimators impute the missing potential outcome for each subject by using an average of the outcomes of similar subjects that receive the other treatment level. Similarity between subjects is based on a weighted function of the covariates for each observation. The treatment effect is computed by taking the average of the difference between the observed and imputed potential outcomes for each subject. teffects nnmatch accepts a continuous, binary, count, fractional, or nonnegative outcome.

See [TE] teffects intro or [TE] teffects intro advanced for more information about estimating treatment effects from observational data.


    +-------+
----+ Model +------------------------------------------------------------

nneighbor(#) specifies the number of matches per observation. The default is nneighbor(1). Each observation is matched with at least the specified number of observations from the other treatment level. nneighbor() must specify an integer greater than or equal to 1 but no larger than the number of observations in the smallest treatment group.

biasadj(varlist) specifies that a linear function of the specified covariates be used to correct for a large-sample bias that exists when matching on more than one continuous covariate. By default, no correction is performed.

Abadie and Imbens (2006, 2011) show that nearest-neighbor matching estimators are not consistent when matching on two or more continuous covariates and propose a bias-corrected estimator that is consistent. The correction term uses a linear function of variables specified in biasadj(); see example 3.

ematch(varlist) specifies that the variables in varlist match exactly. All variables in varlist must be numeric and may be specified as factors. teffects nnmatch exits with an error if any observations do not have the requested exact match.
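As a minimal sketch of the Model options, assuming the cattaneo2 dataset used in the examples of this entry, the following requests three matches per observation and then displays the stored match counts. The value 3 is chosen only for illustration.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* Request at least 3 matches per observation (3 is an arbitrary illustration)
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), nneighbor(3)

* Requested, minimum, and maximum number of matches, as stored in e()
display e(k_nneighbor)
display e(k_nnmin)
display e(k_nnmax)
```

Because of tied distances, e(k_nnmax) may exceed the requested number of matches.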

    +------+
----+ Stat +-------------------------------------------------------------

stat is one of two statistics: ate or atet. ate is the default.

ate specifies that the average treatment effect be estimated.

atet specifies that the average treatment effect on the treated be estimated.
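The two statistics can be compared on the same specification. This sketch assumes the cattaneo2 dataset from the examples in this entry; the covariate list is simply the one used there.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* Average treatment effect in the population (the default statistic)
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), ate

* Average treatment effect on the treated
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), atet

* The statistic estimated by the most recent fit is stored in e(stat)
display "`e(stat)'"
```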

    +-----------+
----+ SE/Robust +--------------------------------------------------------

vce(vcetype) specifies the standard errors that are reported. By default, teffects nnmatch uses two matches in estimating the robust standard errors.

vce(robust [, nn(#)]) specifies that robust standard errors be reported. The optional nn(#) sets the number of matches used to estimate them.

vce(iid) specifies that standard errors for independently and identically distributed data be reported.

The standard derivative-based standard-error estimators cannot be used by teffects nnmatch, because these matching estimators are not differentiable. The implemented methods were derived by Abadie and Imbens (2006, 2011, 2012); see Methods and formulas.

As discussed in Abadie and Imbens (2008), bootstrap estimators do not provide reliable standard errors for the estimator implemented by teffects nnmatch.
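The two vcetypes can be requested as follows. This is a sketch that assumes the cattaneo2 dataset from the examples in this entry; nn(3) is an arbitrary illustrative value.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* Robust Abadie-Imbens standard errors computed with 3 matches (illustrative value)
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), vce(robust, nn(3))

* Number of matches used for the robust VCE, as stored in e()
display e(k_robust)

* Standard errors for independently and identically distributed data
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), vce(iid)
```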

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

dmvariables specifies that the matching variables be displayed.

display_options: noci, nopvalues, noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.

    +----------+
----+ Advanced +---------------------------------------------------------

caliper(#) specifies the maximum distance at which two observations are a potential match. By default, all observations are potential matches regardless of how dissimilar they are.

The distance is based on omvarlist. If an observation does not have at least nneighbor(#) matches, teffects nnmatch exits with an error message. Use option osample(newvar) to identify all observations that are deficient in matches.

dtolerance(#) specifies the tolerance used to determine exact matches. The default value is dtolerance(sqrt(c(epsdouble))).

Integer-valued variables are usually used for exact matching. The dtolerance() option is useful when continuous variables are used for exact matching.

osample(newvar) specifies that indicator variable newvar be created to identify observations that violate the overlap assumption. This variable will identify all observations that do not have at least nneighbor(#) matches in the opposite treatment group within caliper(#) (for metric() distance matching) or dtolerance(#) (for ematch(varlist) exact matches).

The vce(robust, nn(#)) option also requires at least # matches in the same treatment group within the distance specified by caliper(#) or within the exact matches specified by dtolerance(#).

The average treatment effect on the treated, option atet, using vce(iid) requires only nneighbor(#) control group matches for the treated group.
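One possible workflow for caliper() and osample() is sketched below. It assumes the cattaneo2 dataset from the examples in this entry and assumes that osample() creates its variable even when teffects nnmatch exits with an error. The caliper value 0.1 and the variable name viol are hypothetical and chosen only for illustration.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* caliper(0.1) is a hypothetical value; capture noisily shows the error
* message, if any, and lets a do-file continue
capture noisily teffects nnmatch (bweight mage fage) (mbsmoke), caliper(0.1) osample(viol)

* Count the observations flagged as lacking enough matches
tabulate viol

* Refit on the unflagged observations; dropping observations can create
* new violations, so this fit may also need to be checked
capture noisily teffects nnmatch (bweight mage fage) (mbsmoke) if viol == 0, caliper(0.1)
```

Restricting the sample changes the population to which the estimate applies, so the result should be interpreted accordingly.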

control(# | label) specifies the level of tvar that is the control. The default is the first treatment level. You may specify the numeric level # (a nonnegative integer) or the label associated with the numeric level. control() and tlevel() may not specify the same treatment level.

tlevel(# | label) specifies the level of tvar that is the treatment for the statistic atet. The default is the second treatment level. You may specify the numeric level # (a nonnegative integer) or the label associated with the numeric level. tlevel() may only be specified with statistic atet. tlevel() and control() may not specify the same treatment level.
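The treatment levels can be named explicitly. This sketch assumes the cattaneo2 dataset from the examples in this entry and assumes that mbsmoke is coded 0 for the control level and 1 for the treatment level.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* Assumed coding: mbsmoke == 0 is the control, mbsmoke == 1 is the treatment
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), atet control(0) tlevel(1)

* Levels recorded as control and treated
display "`e(control)'"
display "`e(treated)'"
```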

generate(stub) specifies that the observation numbers of the nearest neighbors be stored in the new variables stub1, stub2, .... This option is required if you wish to perform postestimation based on the matching results. The number of variables generated may be more than nneighbor(#) because of tied distances. These variables may not already exist.
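The variables created by generate() can be inspected directly. This sketch assumes the cattaneo2 dataset from the examples in this entry; the stub nn and the variable bw_nn1 are hypothetical names.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* nn is a hypothetical stub; variables nn1, nn2, ... must not already exist
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke), generate(nn)

* See how many neighbor variables were created (ties can add more)
describe nn*

* Look up the outcome of the first listed neighbor by observation number
generate double bw_nn1 = bweight[nn1]
list mbsmoke bweight nn1 bw_nn1 in 1/5
```

The stored values are observation numbers, so they are valid only while the data remain in the sort order used at estimation.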

metric(metric) specifies the distance matrix used as the weight matrix in a quadratic form that transforms the multiple distances into a single distance measure; see Nearest-neighbor matching estimator in Methods and formulas of [TE] teffects nnmatch for details.
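The built-in metrics can be compared by refitting the same model. This sketch assumes the cattaneo2 dataset from the examples in this entry and shows only the named metrics, not a user-supplied matrix.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear

* Inverse diagonal sample covariate covariance
teffects nnmatch (bweight mage fage) (mbsmoke), metric(ivariance)
display "`e(metric)'"

* Identity matrix, so covariates are not rescaled before distances are computed
teffects nnmatch (bweight mage fage) (mbsmoke), metric(euclidean)
display "`e(metric)'"
```

With metric(euclidean), covariates measured on larger scales contribute more to the distance, which is worth keeping in mind when the matching variables have different units.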

The following option is available with teffects nnmatch but is not shown in the dialog box:

coeflegend; see [R] estimation options.


Setup
    . webuse cattaneo2

Estimate the average treatment effect of mbsmoke on bweight
    . teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke)

Refit the above model, but require exact matches on the binary variables
    . teffects nnmatch (bweight mage) (mbsmoke), ematch(prenatal1 mmarried fbaby) metric(euclidean)

Match on two continuous variables, mage and fage, and use the bias-adjusted estimator
    . teffects nnmatch (bweight mage fage) (mbsmoke), ematch(prenatal1 mmarried fbaby) biasadj(mage fage)

Video example

Treatment effects in Stata: Nearest-neighbor matching

Stored results

teffects nnmatch stores the following in e():

Scalars
  e(N)                   number of observations
  e(nj)                  number of observations for treatment level j
  e(k_levels)            number of levels in treatment variable
  e(treated)             level of treatment variable defined as treated
  e(control)             level of treatment variable defined as control
  e(k_nneighbor)         requested number of matches
  e(k_nnmin)             minimum number of matches
  e(k_nnmax)             maximum number of matches
  e(k_robust)            matches for robust VCE

Macros
  e(cmd)                 teffects
  e(cmdline)             command as typed
  e(depvar)              name of outcome variable
  e(tvar)                name of treatment variable
  e(emvarlist)           exact match variables
  e(bavarlist)           variables used in bias adjustment
  e(mvarlist)            match variables
  e(subcmd)              nnmatch
  e(metric)              mahalanobis, ivariance, euclidean, or matrix matname
  e(stat)                statistic estimated, ate or atet
  e(wtype)               weight type
  e(wexp)                weight expression
  e(title)               title in estimation output
  e(tlevels)             levels of treatment variable
  e(vce)                 vcetype specified in vce()
  e(vcetype)             title used to label Std. Err.
  e(datasignature)       the checksum
  e(datasignaturevars)   variables used in calculation of checksum
  e(properties)          b V
  e(estat_cmd)           program used to implement estat
  e(predict)             program used to implement predict
  e(marginsnotok)        predictions disallowed by margins

Matrices
  e(b)                   coefficient vector
  e(V)                   variance-covariance matrix of the estimators

Functions
  e(sample)              marks estimation sample
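The stored results can be inspected after estimation. This sketch assumes the cattaneo2 dataset from the examples in this entry.

```stata
* Assumes the cattaneo2 example dataset used elsewhere in this entry
webuse cattaneo2, clear
teffects nnmatch (bweight mage prenatal1 mmarried fbaby) (mbsmoke)

* List everything stored in e()
ereturn list

* Point estimate and its variance
matrix list e(b)
matrix list e(V)

* Number of observations, and a check against the estimation sample
display e(N)
count if e(sample)
```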


Abadie, A., and G. W. Imbens. 2006. Large sample properties of matching estimators for average treatment effects. Econometrica 74: 235-267.

------. 2008. On the failure of the bootstrap for matching estimators. Econometrica 76: 1537-1557.

------. 2011. Bias-corrected matching estimators for average treatment effects. Journal of Business and Economic Statistics 29: 1-11.

------. 2012. Matching on the estimated propensity score. Harvard University and National Bureau of Economic Research.
