Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: Local Linear Regression for Regression Discontinuity Designs
Date: Mon, 23 May 2011 11:00:25 -0400

Alex Olssen <[email protected]>:
You are right that I removed the kernel option from the new -rd- dated
20 March (the update renamed the previous version to rd_obs and defined
a new rd command) and introduced a bug. The older version rd_obs still
works as expected, and allows a rectangular kernel. The bug introduced
in the last update is that treatment is defined to be I(Z>0) instead of
I(Z>=0). A new ado file dated today has been submitted to SSC; compare:

sysuse auto, clear
ren price y
gen x = length - 193
gen z = (x >= 0)
gen zs = (x > 0)
gen z_x = z*x
gen xlow=x*(1-z)
gen xhigh=x*z
lpoly y x if x<0, deg(1) ker(tri) bw(10) gen(L2) at(x) nogr
lpoly y x if x>=0, deg(1) ker(tri) bw(10) gen(R2) at(x) nogr
gen diff2 = R2 - L2
su diff2 if x == 0
g kwt=max(0,10-abs(x))
reg y z x z_x [pw=kwt]
reg y z xlow xhigh [pw=kwt]
reg y zs xlow xhigh [pw=kwt]
rd y x, bwidth(10)

On Mon, May 23, 2011 at 7:40 AM, Alex Olssen <[email protected]> wrote:
> Dear Andreas,
>
> Estimation of the local linear regression model can be implemented by
> OLS (restricting the subset of observations appropriately) IF you are
> using the rectangular kernel. However, Austin Nichols's latest version
> of -rd- only allows estimation based on the triangular kernel - which
> is optimal for boundary estimation - see the references in Imbens and
> Lemieux 2009.
>
> As an aside, it would have been tremendously helpful if you had posted
> some example code with your question.
>
> I compared OLS with dummies, lpoly, and an older version of Austin
> Nichols's -rd- and got the same result in each case (all used the
> rectangular kernel).
> I tried again tonight but even after using the triangular kernel I
> couldn't quite get the results from manual -lpoly- to match those of
> Austin Nichols's -rd-.
>
> I present an example using the auto dataset - just to show the code.
>
> * test rd
> sysuse auto, clear
> ren price y
> gen x = length - 193
> gen z = (x >= 0)
> gen z_x = z*x
>
> reg y x if x > -10 & x < 0
> reg y x if x >= 0 & x < 10
> reg y x z z_x if x > -10 & x < 10
> * OLS with dummies produces the same result as
> * OLS on either side when the same bandwidths are used
>
> lpoly y x if x < 0, deg(1) ker(rec) bwidth(10) gen(L) at(x) nogr
> lpoly y x if x >= 0, deg(1) ker(rec) bwidth(10) gen(R) at(x) nogr
> gen diff = R - L
> su diff if x == 0
> * OLS with dummies produces the same result as
> * local linear regression when the rectangular kernel is used
>
> * note Austin Nichols's rd only allows use of the triangle kernel
> * which is boundary optimal - see references in Imbens and Lemieux 2009
> lpoly y x if x < 0, deg(1) ker(tri) bwidth(10) gen(L2) at(x) nogr
> lpoly y x if x >= 0, deg(1) ker(tri) bwidth(10) gen(R2) at(x) nogr
> gen diff2 = R2 - L2
> su diff2 if x == 0
> rd y x, deg(1) bwidth(10)
>
> Perhaps Austin could comment on the difference? I expect I have made
> an oversight somewhere.
>
> Kind regards,
>
> Alex
>
> On 23 May 2011 01:20, andreas nordset <[email protected]> wrote:
>> Dear Statalist members,
>>
>> in a context in which individuals are eligible for a treatment if and
>> only if they are aged above 50, I would like to implement a Regression
>> Discontinuity Design to estimate the effect of the treatment on
>> several outcomes, i.e. the difference between the average outcome just
>> above the threshold and the average outcome just below the threshold,
>> where these averages must be estimated.
>>
>> My impression is that the standard way of doing this is to use "Local
>> Linear Regression".
>>
>> My understanding is that I can hence obtain the Reduced-Form effect by
>> simply estimating: -reg outcome D50 age D50_age if
>> inrange(age,50-h,50+h)-
>> where D50 is a dummy for being aged above 50, D50_age is the
>> interaction of that dummy with age, and h is the bandwidth.
>> Equivalently, I would obtain the Wald estimates with: -ivreg2 outcome
>> age D50_age (treatment=D50) if inrange(age,50-h,50+h)-.
>> Put differently, my understanding of "Local Linear Regression" is to
>> estimate simple linear OLS regressions, but a separate line on each
>> side and only "locally", i.e. using only observations from the
>> interval (50-h,50+h).
>>
>> Yet when I do so, I obtain estimates that differ from those obtained
>> using Austin Nichols's -rd- command that apparently uses the -lpoly-
>> command for local linear regression. Does that mean that my
>> understanding of LLR is incorrect, maybe because some more
>> sophisticated weighting of observations is needed? In your view, is
>> such a more sophisticated procedure needed, and if so what would be
>> the problems with my very simple procedure?
>>
>> Thank you so much for your advice and best regards!
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
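
The comparison in the reply at the top of this message sets -lpoly- with ker(tri) evaluated at x = 0 against weighted regressions using kwt. The reason these should give the same point estimate can be written out as below. This is only a sketch and is not taken from the -rd- source: it assumes the triangle kernel is K(u) = max(0, 1 - |u|) with u = x/h, writes h for the bandwidth (10 in the example), and the symbols \alpha, \beta, \tau, a, b, c, w_i are notation introduced here.

```latex
% Notation is illustrative, not from the original posts.
% Assumption: triangle kernel K(u) = max(0, 1 - |u|), bandwidth h, cutoff at x = 0.
\begin{align*}
(\hat\alpha_R, \hat\beta_R) &= \arg\min_{\alpha,\beta}
    \sum_{i:\,x_i \ge 0} K\!\left(\tfrac{x_i}{h}\right) (y_i - \alpha - \beta x_i)^2, \\
(\hat\alpha_L, \hat\beta_L) &= \arg\min_{\alpha,\beta}
    \sum_{i:\,x_i < 0} K\!\left(\tfrac{x_i}{h}\right) (y_i - \alpha - \beta x_i)^2, \\
\hat\tau &= \hat\alpha_R - \hat\alpha_L .
\end{align*}
% Pooled, fully interacted form with z_i = 1[x_i >= 0]:
\begin{equation*}
\min_{a,\tau,b,c} \sum_i w_i \,(y_i - a - \tau z_i - b x_i - c\, z_i x_i)^2,
\qquad w_i = h\,K\!\left(\tfrac{x_i}{h}\right) = \max(0,\, h - |x_i|).
\end{equation*}
```

Because every term is interacted with z, the pooled problem separates into the two one-sided problems, so the coefficient on z equals \hat\alpha_R - \hat\alpha_L. Multiplying all weights by the constant h leaves weighted least squares coefficients unchanged, and with h = 10 the weight w_i is exactly kwt. With a rectangular kernel the weights are constant inside the window, which is why unweighted OLS restricted to |x| < h reproduces the point estimate from -lpoly- with ker(rec). Defining treatment as I(x > 0) instead moves any observations at exactly x = 0 into the left-hand fit, where they carry the largest triangular weight, so the two definitions can give different estimates.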

**Follow-Ups**:
- **Re: st: Local Linear Regression for Regression Discontinuity Designs** *From:* Austin Nichols <[email protected]>

**References**:
- **st: Local Linear Regression for Regression Discontinuity Designs** *From:* andreas nordset <[email protected]>
- **Re: st: Local Linear Regression for Regression Discontinuity Designs** *From:* Alex Olssen <[email protected]>
