
# Re: st: deriving a bootstrap estimate of a difference between two weighted regressions

- From: Stas Kolenikov
- To: statalist@hsphsun2.harvard.edu
- Subject: Re: st: deriving a bootstrap estimate of a difference between two weighted regressions
- Date: Mon, 2 Aug 2010 09:11:50 -0500

```
In what you describe below, the weights are not part of your data, but
rather are derived variables used as means to get the estimates (see
Steve's comments: aweights is not the right Stata concept to use here;
I completely agree with him). Hence, if you insist on the bootstrap,
an appropriate procedure that would replicate the analysis process on
the original sample would be:

1. take the bootstrap sample
2. compute the weights
3. compute the treatment effect estimate(s) using these weights
4. repeat steps 1-3 a large number of times.

As always with the bootstrap, I won't buy this procedure until I see
the proof of consistency published in Biometrika or J of Econometrics.
If you are just manipulating the means and other moments of the data
in the re-weighting procedure, you are probably OK; if you are doing
matching, you are certainly not OK, as matching is not a smooth
operation. If you have a complex sampling procedure, you can probably
just forget about getting the standard errors right as even the first
step, getting a bootstrap sample that would resemble the complex
sample at hand, is far from trivial. (In sum: the bootstrap is a great
method when you are conducting inference for the mean; everything else
is complicated.)

I would say that using the difference in weights that Steve suggested
is certainly an easier thing to do, although who knows how each
particular command will interpret the negative weights. It might also
be possible to get a non-positive definite covariance matrix of the
coefficient estimates if weights are not all positive.

Also, the more sensitivity analyses you run, the further off your
overall type I error is going to be.

On Sun, Aug 1, 2010 at 12:39 PM, Ariel Linden, DrPH
<ariel.linden@gmail.com> wrote:
> There are at least two conceptual reasons why this process makes sense.
>
> First, assume a causal inference model which uses a weight (let's say an
> "average treatment on the treated" weight) to create balance on observed
> pre-intervention covariates (by setting the covariates to equal that of the
> treated group). Let's say the second model is identical but uses an "average
> treatment on controls" (ATC) weight. Assuming no unmeasured confounding, the
> treatment variable(s) from both models will provide the treatment effect
> estimate given the respective weighting purposes (holding covariates to
> represent treatment or control group characteristics). Thus, measuring the
> difference between the treatment effects in both models (which will need to
> have either bootstrapped or other adjustment to the SE) can serve as a
> sensitivity analysis (one of many approaches).
>
> Second, and in a similar manner, one can test the effect of a mediator using
> a weighting method for the original X-Y model, and a second weight for the
> X-M-Y model. In both cases, different weights must be applied to two
> different regression models, and in both cases, the SE's will need to be
> adjusted. Weights are used in these models in a similar context to those in
> the first example - to control for confounding.
>
> By the way, a user written program called sgmediation (search sgmediation)
> does something similar to this but without the weights, so it may be
> possible to replicate many of the steps (or add weights?).

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
```
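The resampling procedure in the reply (resample, recompute the weights inside each replicate, re-estimate, repeat) can be sketched as follows. This is only an illustration under strong assumptions that are not in the thread: a simple random sample, a single discrete covariate `x`, a propensity score taken as the share treated within each level of `x`, and the treatment effect taken as the weighted difference in mean outcomes (which is what a weighted regression of `y` on the treatment dummy alone would give). All function names are hypothetical, and dropping replicates where a stratum has no treated or no control units is a convenience choice, not a recommendation. It says nothing about whether the bootstrap is consistent in a given application.

```python
import random


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_effect(rows, weights):
    # Weighted mean of y among treated minus weighted mean among controls.
    treated = [(y, w) for (y, t, _), w in zip(rows, weights) if t == 1]
    control = [(y, w) for (y, t, _), w in zip(rows, weights) if t == 0]
    return weighted_mean(*zip(*treated)) - weighted_mean(*zip(*control))


def effect_difference(rows):
    """rows: list of (y, treat, x) with treat in {0, 1} and discrete x.

    Returns the ATT-weighted effect minus the ATC-weighted effect, or None
    if some level of x lacks either treated or control units.
    """
    # Compute the weights from THIS sample (step 2 of the procedure).
    ps = {}
    for level in {x for _, _, x in rows}:
        flags = [t for _, t, x in rows if x == level]
        share = sum(flags) / len(flags)
        if share == 0 or share == 1:
            return None  # weights undefined in this stratum
        ps[level] = share

    # ATT weights: treated 1, controls p/(1-p). ATC weights: treated (1-p)/p, controls 1.
    w_att = [1.0 if t == 1 else ps[x] / (1 - ps[x]) for _, t, x in rows]
    w_atc = [(1 - ps[x]) / ps[x] if t == 1 else 1.0 for _, t, x in rows]

    # Estimate both effects with their own weights (step 3).
    return weighted_effect(rows, w_att) - weighted_effect(rows, w_atc)


def bootstrap_se(rows, reps=500, seed=1):
    """Bootstrap standard error of effect_difference; returns (se, replicates kept)."""
    rng = random.Random(seed)
    n = len(rows)
    draws = []
    for _ in range(reps):  # step 4: repeat steps 1-3
        sample = [rows[rng.randrange(n)] for _ in range(n)]  # step 1: resample rows
        d = effect_difference(sample)  # steps 2-3 on the resample
        if d is not None:
            draws.append(d)
    if len(draws) < 2:
        raise ValueError("too few usable bootstrap replicates")
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
    return var ** 0.5, len(draws)
```

The point of the design is that the weights are recomputed inside `effect_difference` for every resample rather than carried along as a fixed column, which mirrors the remark that the weights are derived quantities and not part of the data. In Stata the same structure would presumably be a small program wrapping the weight construction and both regressions, called through the `bootstrap` prefix.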