
st: Matching, bootstrapping, sub-sampling

From   Joachim Wagner <>
Subject   st: Matching, bootstrapping, sub-sampling
Date   Wed, 22 Apr 2009 08:22:07 +0200

Dear List:

This is both a Stata related and a statistics question.

Short version:
If bootstrapping is invalid for estimating the standard errors for the ATT after nearest neighbor matching, does sub-sampling help, and if so, how?

Long version:
psmatch2 is rather popular among many of us (according to download statistics). Although the help file warns that it is "unclear whether the bootstrap is valid in this context", bootstrapping is also a popular way to estimate the standard errors of the Average Treatment Effect on the Treated (ATT). But the times they are (expected to be) a-changin': in the November 2008 issue of Econometrica, Alberto Abadie and Guido Imbens published a paper entitled "On the Failure of the Bootstrap for Matching Estimators", arguing that bootstrap standard errors are not valid as a basis for inference with simple nearest-neighbor matching estimators with replacement and a fixed number of neighbors. This result is popularized in a recent survey by Imbens and Jeffrey Wooldridge ("Recent Developments in the Econometrics of Program Evaluation", published in the Journal of Economic Literature in March 2009). (For those of you working in other fields, let me add that both journals are among the top journals in economics/econometrics.)

What is to be done? One suggestion found in both articles goes like this (Imbens and Wooldridge, p. 42): "In cases where bootstrapping is not valid, often subsampling (..) remains valid, but this has not been applied in practice." The authors refer to Dimitris N. Politis et al., Subsampling, New York: Springer 1999. Subsampling means basing each bootstrap-style draw on only a fraction, say 75 percent, of the sample (in Politis et al., the smaller sample is drawn without replacement).
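To make the mechanics concrete, here is a minimal sketch in Python (not Stata, and not taken from either article or from psmatch2). It assumes a single matching variable x, one nearest neighbor with replacement, and subsamples drawn without replacement separately among treated and controls. The function names, the default fraction of 0.75, and in particular the rescaling factor sqrt(m/(n-m)) are assumptions made for illustration: that factor reproduces the usual standard error for a simple sample mean and is close to sqrt(m/n) when the subsample is a small share of the sample, but the sketch does not establish that the result is valid for matching estimators.

```python
import numpy as np


def nn_match_att(y, d, x):
    """ATT from one-nearest-neighbor matching with replacement on a scalar x."""
    y, d, x = (np.asarray(a) for a in (y, d, x))
    treated = d == 1
    y_t, x_t = y[treated], x[treated]
    y_c, x_c = y[~treated], x[~treated]
    # For each treated unit, find the position of the closest control on x.
    nearest = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)
    return float(np.mean(y_t - y_c[nearest]))


def subsample_se(y, d, x, frac=0.75, reps=500, seed=0):
    """Standard error of the ATT from subsamples drawn without replacement.

    Hypothetical helper: frac, reps and the rescaling are illustrative choices.
    """
    if not 0.0 < frac < 1.0:
        raise ValueError("frac must lie strictly between 0 and 1")
    y, d, x = (np.asarray(a) for a in (y, d, x))
    rng = np.random.default_rng(seed)
    n = len(y)
    idx_t = np.flatnonzero(d == 1)
    idx_c = np.flatnonzero(d == 0)
    # Keep the same share of treated and controls in every subsample.
    m_t = max(1, int(round(frac * len(idx_t))))
    m_c = max(1, int(round(frac * len(idx_c))))
    m = m_t + m_c
    if m >= n:
        raise ValueError("subsample must be smaller than the full sample")
    estimates = np.empty(reps)
    for b in range(reps):
        keep = np.concatenate([
            rng.choice(idx_t, size=m_t, replace=False),
            rng.choice(idx_c, size=m_c, replace=False),
        ])
        estimates[b] = nn_match_att(y[keep], d[keep], x[keep])
    # Subsample estimates are based on m < n observations, so their spread
    # has to be rescaled to the full sample size n (assumed factor, see text).
    return float(np.std(estimates, ddof=1) * np.sqrt(m / (n - m)))
```

With outcome y, treatment dummy d and matching variable x held as arrays, `nn_match_att(y, d, x)` gives the point estimate and `subsample_se(y, d, x, frac=0.75)` the corresponding subsampling standard error under the assumptions above.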

Contrary to what Imbens and Wooldridge say, there are some (working) papers that use sub-sampling and bootstrapping to compute the standard errors of the ATT. They use about 75 percent of the sample in doing so. Nobody has (as yet) told me why: the authors argue that others do so as well, or they do not reveal the somewhat secret formula, or rule of thumb, they applied.

Two questions:

1. Can someone please explain in (more or less) plain English why subsampling is a solution?
2. How large should the subsamples be, and why?

Many thanks in advance for any comments etc.

