

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Polynomial Fitting and RD Design
Date: Wed, 31 Aug 2011 22:59:59 -0400

Patrick Button <pbutton@uci.edu>:
Try redefining your x so that the discontinuity is at zero.

On Wed, Aug 31, 2011 at 9:54 PM, Patrick Button <pbutton@uci.edu> wrote:
> Hello Stata users,
>
> I've been getting some unexpected Stata output when fitting polynomials
> using a pretty simple OLS regression.
>
> I am replicating a regression discontinuity design paper (Lee, Moretti and
> Butler 2004). The paper is here:
> http://emlab.berkeley.edu/~moretti/final.pdf Code and data are here:
> http://emlab.berkeley.edu/~moretti/data3.html (I am using enricoall2.dta).
>
> I need to run a regression that fits a 4th degree polynomial separately
> for points of the running variable, x, below 0.5 and above 0.5. The
> regression includes a dummy variable for if x >= 0.5 or not as well. If
> there is a discontinuity at 0.5, then this is picked up in the coefficient
> on that dummy variable.
>
> In this case the running variable is the vote share that the Democratic
> candidate got in U.S. House of Representatives elections, including just
> the Democratic and Republican votes. So x < 0.5 means a Republican won,
> and >= 0.5 means a Democrat won.
>
> I would like to pool the data instead of running a separate regression for
> each side. This is one of the recommended methods in the RD literature.
> For some reason this method does not appear in the authors' code so I need
> to do it myself.
>
> I'm running and setting up the regression as follows:
>
> ***
> gen x = demvoteshare
>
> gen D = 1 if x >=0.5
> replace D = 0 if x < 0.5
>
> *Left Side Polynomial
> gen xa = (1-D)*x
> gen x2a = (1-D)*x^2
> gen x3a = (1-D)*x^3
> gen x4a = (1-D)*x^4
>
> *Right Side Polynomial
> gen xb = D*x
> gen x2b = D*x^2
> gen x3b = D*x^3
> gen x4b = D*x^4
>
> regress realincome D xa x2a x3a x4a xb x2b x3b x4b
>
> ***
>
> Based on what the authors of the paper got, graphical analysis, and logic,
> there should be no jump in realincome at 0.5. There is no reason why
> income should be suddenly much different for districts that democrats just
> barely won or just barely lost. If it is, this invalidates the regression
> discontinuity design. So the coefficient on D should be statistically
> insignificant. However, I get the following results:
>
> ------------------------------------------------------------------------------
>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            D |   497414.5   94802.12     5.25   0.000       311589    683240.1
>           xa |   34396.25   27783.67     1.24   0.216    -20063.66    88856.17
>          x2a |  -22571.61   234577.9    -0.10   0.923    -482377.5    437234.3
>          x3a |  -429659.3   655505.3    -0.66   0.512     -1714542    855223.6
>          x4a |   667813.9   598416.4     1.12   0.264    -505166.7     1840795
>           xb |   -2805647   534665.3    -5.25   0.000     -3853667    -1757628
>          x2b |    5828381    1112850     5.24   0.000      3647038     8009724
>          x3b |   -5281210    1012800    -5.21   0.000     -7266441    -3295979
>          x4b |    1754682   339914.5     5.16   0.000      1088402     2420963
>        _cons |   31536.64   501.1422    62.93   0.000     30554.33    32518.95
> ------------------------------------------------------------------------------
>
> I have no idea why D is statistically significant, and why only the
> polynomial on the right side is statistically significant. This is not
> just a problem with this regression. I get messed up results for every
> regression I run that has a 4th degree polynomial on each side of 0.5.
>
> However, I do not get weird results like this when I use just one 4th
> degree polynomial (one for the entire thing) with the D dummy.
>
> Does anyone know what I am doing wrong? I have no idea but I have a
> feeling that i'm missing something obvious.
>
> Thank you very much for your time and consideration.
>
> --
> Patrick Button
> Ph.D. Student
> Department of Economics
> University of California, Irvine
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
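
The suggestion to redefine x so that the discontinuity is at zero could look roughly like the sketch below. It is an illustration only, not code from the paper or from either poster: it assumes the data from the quoted post (with demvoteshare and realincome) are already in memory, and the names xc, D, xc1a ... xc4b are hypothetical. With separate polynomials in the uncentered x, the coefficient on D is the gap between the two fitted curves at x = 0, far outside the data on each side, rather than the gap at 0.5. Once x is centered at the cutoff, all polynomial terms vanish there and the coefficient on D is the estimated jump at the cutoff itself.

```stata
* Sketch only: assumes the dataset from the quoted post is already loaded
* and contains demvoteshare and realincome. New variable names are made up.

* Center the running variable so the cutoff (0.5) sits at zero.
gen double xc = demvoteshare - 0.5

* Indicator for a Democratic win; left missing where the vote share is missing.
gen byte D = (xc >= 0) if !missing(xc)

* Separate 4th-degree polynomial terms on each side of the cutoff.
forvalues k = 1/4 {
    gen double xc`k'a = (1 - D) * xc^`k'    // left-side terms  (xc < 0)
    gen double xc`k'b = D * xc^`k'          // right-side terms (xc >= 0)
}

* At xc = 0 every polynomial term is zero, so the coefficient on D
* is the estimated jump in realincome at the cutoff.
regress realincome D xc1a xc2a xc3a xc4a xc1b xc2b xc3b xc4b
```

The same quantity can be recovered from the uncentered regression in the quoted post by evaluating both polynomials at 0.5 right after that regress command, for example with `lincom D + 0.5*xb - 0.5*xa + 0.25*x2b - 0.25*x2a + 0.125*x3b - 0.125*x3a + 0.0625*x4b - 0.0625*x4a`. Centering simply makes that the coefficient reported for D.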
