
Re: st: Testing for program effectiveness with heckman

From   "Austin Nichols" <>
Subject   Re: st: Testing for program effectiveness with heckman
Date   Fri, 30 May 2008 10:17:51 -0400


The position you espouse that "group comparisons are, if interpreted
correctly as the difference between groups, likely to be much less
biased than so called causal models" seems very strange to me.  If you
observe the treatment group having much better outcomes than the
control, but treatment is nonrandomly assigned, to what correct
interpretation of difference between groups would you assign the
difference between groups?  Suppose treatment is, instead of test
prep, the method used to treat kidney stones (PN or OS), and you
conclude that method PN produces better mean outcomes than OS in your
sample.  Would you now choose method PN if you discover you have a
kidney stone?  (I.e. imagine betting your own well-being on the result
of your hypothesis test.)  Now suppose I tell you that method PN is
inferior to OS in every subgroup, but mean outcomes for PN are better
because it is more often used in lower-risk cases?  If you can't
condition on the confounder, you will get the wrong answer from the
observational comparison.  For test prep and AFQT scores, something
very like this could be true: e.g. if higher-ability folks engage
in test prep and test prep has zero true impact on scores, the
measured effect of test prep will still be positive.  In this case, an
instrumental variables or regression discontinuity approach at least
has a hope of getting consistent estimates of the treatment effect for
a subset of the population (see  A
randomized trial is clearly superior in most cases, however.
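
The test prep scenario above (higher-ability people select into prep, and prep has zero true effect) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data-generating process, the variable names, and all numeric values (sample size, score scale, noise levels) are hypothetical and are not taken from the AFQT data. The adjustment step conditions on ability, which is possible here only because the simulation observes it; with real data ability is typically unobserved, which is why the naive comparison cannot be repaired this way.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope_intercept(x, y):
    """Simple OLS of y on x with an intercept; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, my - b * mx


def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    b, a = slope_intercept(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


rng = random.Random(12345)  # fixed seed so the run is reproducible
n = 20000                   # hypothetical sample size

# Hypothetical data-generating process:
# ability drives both the decision to prepare and the score.
ability = [rng.gauss(0, 1) for _ in range(n)]
prep = [1.0 if a + rng.gauss(0, 1) > 0 else 0.0 for a in ability]
# The true effect of prep on the score is exactly zero.
score = [50 + 10 * a + rng.gauss(0, 5) for a in ability]

# Naive group comparison: mean score of preparers minus non-preparers.
naive_diff = (mean([s for s, p in zip(score, prep) if p == 1.0])
              - mean([s for s, p in zip(score, prep) if p == 0.0]))

# Coefficient on prep after conditioning on ability, computed by
# partialling ability out of both prep and score (Frisch-Waugh).
prep_resid = residuals(ability, prep)
score_resid = residuals(ability, score)
adjusted_effect, _ = slope_intercept(prep_resid, score_resid)

print(f"naive difference in means:   {naive_diff:.2f}")
print(f"effect conditional on ability: {adjusted_effect:.2f}")
```

The naive difference comes out clearly positive even though prep does nothing in this setup, while the coefficient conditional on ability is close to zero.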

The kidney stone example is from:
Julious and Mullee. 1994. "Confounding and Simpson's paradox." BMJ 309
(6967): 1480–1481.
(and ignores the third treatment option which is cheaper and has
higher success rates, another ball of wax altogether).
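
The reversal in the kidney stone example can be checked with a few lines of arithmetic. The counts below are the ones usually quoted for this example (successes out of cases, split by stone size, with small stones as the lower-risk cases); they are not reproduced in the message itself, so treat them as an assumption and check them against the BMJ note before relying on them.

```python
# (successes, cases) by treatment method and stone size.
# Assumed counts, as commonly quoted for this example.
counts = {
    ("OS", "small"): (81, 87),
    ("OS", "large"): (192, 263),
    ("PN", "small"): (234, 270),
    ("PN", "large"): (55, 80),
}


def rate(method, size=None):
    """Success rate for a method, overall or within one stone-size subgroup."""
    cells = [v for (m, s), v in counts.items()
             if m == method and (size is None or s == size)]
    successes = sum(c[0] for c in cells)
    cases = sum(c[1] for c in cells)
    return successes / cases


for size in (None, "small", "large"):
    label = size or "all"
    print(f"{label:>5}: OS {rate('OS', size):.3f}   PN {rate('PN', size):.3f}")

# Share of each method's cases that are small (lower-risk) stones.
for method in ("OS", "PN"):
    small = counts[(method, "small")][1]
    total = small + counts[(method, "large")][1]
    print(f"{method}: {small / total:.2f} of cases are small stones")
```

With these counts PN has the higher pooled success rate but the lower rate within each stone-size subgroup, because most PN cases are small stones and most OS cases are large ones.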

On Fri, May 30, 2008 at 4:57 AM, Maarten buis <> wrote:
> --- "Riemer, Richard A CIV DMDC" <>
> wrote:
>> Maarten, Thank you for your reply.  I can see the distinction you are
>> making.  However, I wanted to use heckman because I thought it would
>> do a better job at explaining self-selection of test-preparation
>> rather than simple moderated regression where there could be
>> correlated errors between the two equations.  Following the example
>> of wage of women, we could say that 'afqt after test prep' is missing
>> on sample members who do not engage in test prep and that those
>> sample members would have scored lower than average if they would
>> have engaged in test prep.
> What you are running into is the fundamental problem with causal
> analysis. A causal effect can be thought of as a counterfactual
> experiment: You want to compare the test score of someone that prepared
> for the test with the test score of that same person when (s)he did not
> prepare for the test. The problem is that you cannot have a person that
> is both prepared and unprepared at the same time. An alternative way of
> thinking about this is that you are looking for another person that is
> the same in every respect except that that person did not prepare. Such
> a person obviously does not exist.
> The information we do have is a comparison of groups. You can use
> regression / ANOVA to control for other observed variables. An
> alternative method of controlling for observed characteristics is
> propensity score matching. Some would call these estimates biased
> because they expect that students who know that they won't do well are
> less likely to prepare, thus leading to an overestimation of the effect
> of preparation; the students who did not prepare are expected to gain
> less from preparation than the students that did prepare. I would not
> call the group comparisons biased, but I would call them the empirical
> information that you use in your model. Once you have presented the
> empirical information, you can start adding assumptions, for instance
> by using -treatreg-, thus sacrificing empirical content to get closer
> to the theoretical concept you are interested in. If anything group
> comparisons are, if interpreted correctly as the difference between
> groups, likely to be much less biased than so called causal models.
> This is not because one type of model is inherently better than the
> other, but because the group comparison models try to solve a much
> easier problem.
> -- Maarten
