Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: Local Linear Regression for Regression Discontinuity Designs

| From | andreas nordset |
| --- | --- |
| To | statalist@hsphsun2.harvard.edu |
| Subject | st: Local Linear Regression for Regression Discontinuity Designs |
| Date | Sun, 22 May 2011 15:20:34 +0200 |

```
Dear Statalist members,

In a context in which individuals are eligible for a treatment if and
only if they are aged above 50, I would like to implement a Regression
Discontinuity Design to estimate the effect of the treatment on
several outcomes, i.e. the difference between the average outcome just
above the threshold and the average outcome just below the threshold,
where these averages must be estimated.

My impression is that the standard way of doing this is to use "Local
Linear Regression".

My understanding is that I can hence obtain the Reduced-Form effect by
simply estimating:  -reg outcome D50 age D50_age if
inrange(age,50-h,50+h)-
where D50 is a dummy for being aged above 50, D50_age is the
interaction of that dummy with age, and h is the bandwidth.
Equivalently, I would obtain the Wald estimates with:  -ivreg2 outcome
age D50_age (treatment=D50) if inrange(age,50-h,50+h)-.
Put differently, my understanding of "Local Linear Regression" is to
estimate simple linear OLS regressions, but a separate line on each
side and only "locally", i.e. using only observations from the
interval (50-h,50+h).
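
* A minimal sketch of the reduced-form regression just described.
* Assumptions: variables outcome and age are in memory, and the
* bandwidth h = 5 is a hypothetical value used only for illustration.
local h = 5
gen byte D50 = age > 50 if !missing(age)    // dummy: aged above 50
gen double D50_age = D50 * age              // interaction with age
reg outcome D50 age D50_age if inrange(age, 50 - `h', 50 + `h')
* With age left uncentered, the coefficient on D50 is the gap between
* the two fitted lines at age 0. The gap at the threshold (age 50) is
* _b[D50] + 50*_b[D50_age], which -lincom- reports with a std. error:
lincom D50 + 50*D50_age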

Yet when I do so, I obtain estimates that differ from those obtained
using Austin Nichols's -rd- command that apparently uses the -lpoly-
command for local linear regression. Does that mean that my
understanding of LLR is incorrect, maybe because some more
sophisticated weighting of observations is needed? In your view, is
such a more sophisticated procedure needed, and if so what would be
the problems with my very simple procedure?
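
* A sketch of what "weighting of observations" could look like here,
* using a triangular kernel as one common choice. Assumptions: D50 and
* D50_age exist as defined above, h = 5 is a hypothetical bandwidth,
* and whether this reproduces -rd- exactly is not confirmed here.
local h = 5
* weight falls linearly from 1 at age 50 to 0 at distance h
gen double w = max(0, 1 - abs(age - 50)/`h') if !missing(age)
reg outcome D50 age D50_age [aweight = w] if w > 0 & !missing(w)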