Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Alex Olssen <alex.olssen@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Polynomial Fitting and RD Design
Date: Sat, 10 Sep 2011 23:06:53 +1200

Works a charm. Thanks Nick!

On 10 September 2011 22:57, Nick Cox <njcoxstata@gmail.com> wrote:
> This is Stata 7 syntax. It will work in
> later versions with the prefix
>
> version 7:
>
> Nick
>
> On 10 Sep 2011, at 11:46, Alex Olssen <alex.olssen@gmail.com> wrote:
>
>> Hi,
>>
>> I have a question. It's not quite on this topic, but it is related to
>> the replication of Lee, Moretti, and Butler.
>> In the do-files found in the links in the original post are the
>> following lines of code
>>
>> graph meanY100 fit1 fit2 int1U int1L int2U int2L dembin ,
>> l1(" ") l2("ADA Score, time t") b1(" ") t1(" ") t2(" ")
>> b2("Democrat Vote Share, time t-1") xlabel(0,.5,1) ylabel (0,.5,1)
>> title(" ") xline(.5)
>> c(.lll[-]l[-]l[-]l[-]) s(oiiii) sort saving(`1'_reduced.gph, replace);
>> translate `1'_reduced.gph `1'_reduced.eps, replace;
>>
>> I can't get this to work. I have never seen the graph command used
>> like this before - I always used graph twoway, etc.
>>
>> In any case, the error I get is:
>> "meanY100graph_g.new fit1 fit2 int1U int1L int2U int2L dembin ,: class member function not found"
>>
>> Can anyone help me with this?
>>
>> Cheers,
>> Alex
>>
>> On 6 September 2011 07:40, Patrick Button <pbutton@uci.edu> wrote:
>>>
>>> Thank you for the feedback everyone. It has been extremely useful and now
>>> I am not freaking out as much.
>>>
>>> First, I've changed x to x - 0.5 as per Austin Nichols' suggestion. This
>>> makes interpretation easier. I should have done this earlier.
>>>
>>> I was thinking that my replication was going to involve a critique, Nick
>>> Cox, and I agree with you and others that the 4th order polynomials are
>>> somewhat fishy.
>>>
>>> The weird thing about the paper is that the authors say that they are
>>> using 4th degree polynomials on either side of the discontinuity, but
>>> their graphs and/or code indicate that they are just using one polynomial
>>> to fit the entire thing. Not sure why that is...
>>> So in trying to do the 4th degree polynomial for each side on my own, I've
>>> run into this issue of results being weird. Now that I understand why, it
>>> makes perfect sense.
>>>
>>> As for whether the 4th degree polynomial is ideal, I would agree with all
>>> of you that it probably is not. If one is going to go with polynomials, the
>>> ideal degree depends on the bandwidth you use. Ariel Linden described this
>>> really well earlier.
>>>
>>> Larger bandwidths mean more precision, but more bias. Smaller bandwidths
>>> (say only using data within +/- 2 percentage points of 50%) lead to the
>>> opposite. Lee and Lemieux (2010)
>>> (http://faculty.arts.ubc.ca/tlemieux/papers/RD_JEL.pdf) discuss that the
>>> optimal polynomial degree is a function of the bandwidth.
>>>
>>> The ideal degree can be chosen with the Akaike Information Criterion (AIC).
>>> I'm going to stick with the 4th degree polynomial (and the entire
>>> dataset), then I'll try other polynomials and bandwidths, and then kernel
>>> after that. I need to do the replication first, THEN I will critique that
>>> by going with something more realistic. The -rd- package should be really
>>> useful for that. Thanks so much for all the discussion about a more
>>> realistic model. The key thing is that results should be robust to several
>>> different types of fitting and bandwidths, so long as they are realistic
>>> in the first place.
>>>
>>> As for using orthog/orthpoly to generate orthogonal polynomials, I gave
>>> that a shot. Thank you very much for the suggestion Martin Buis.
>>>
>>> I've done the orthogonalization two different ways. Both give different
>>> results, neither of which mirrors the results where I create the
>>> polynomials in the regular fashion. I'm not sure which method is
>>> "correct". I'm also unsure why the results are significantly different.
>>> Any suggestions would be very helpful.
>>>
>>> Orthpoly # 1 uses orthpoly separately on each side of the discontinuity.
>>> # 2 does it for all the data.
>>>
>>> The code and output are below:
>>>
>>> *****
>>>
>>> drop if demvoteshare==.
>>> keep if realincome~=.
>>> drop demvs2 demvs3 demvs4
>>>
>>> gen double x = demvoteshare - 0.5
>>>
>>> gen D = 1 if x >= 0
>>> replace D = 0 if x < 0
>>>
>>> *Orthpoly #1
>>>
>>> *Creating orthogonal polynomials separately for each side.
>>>
>>> orthpoly x if x < 0, deg(4) generate(demvsa demvs2a demvs3a demvs4a)
>>> orthpoly x if x >= 0, deg(4) generate(demvsb demvs2b demvs3b demvs4b)
>>> replace demvsa = 0 if demvsa==.
>>> replace demvsb = 0 if demvsb==.
>>> replace demvs2a = 0 if demvs2a==.
>>> replace demvs2b = 0 if demvs2b==.
>>> replace demvs3a = 0 if demvs3a==.
>>> replace demvs3b = 0 if demvs3b==.
>>> replace demvs4a = 0 if demvs4a==.
>>> replace demvs4b = 0 if demvs4b==.
>>>
>>> replace demvsa = (1-D)*demvsa
>>> replace demvs2a = (1-D)*demvs2a
>>> replace demvs3a = (1-D)*demvs3a
>>> replace demvs4a = (1-D)*demvs4a
>>>
>>> replace demvsb = D*demvsb
>>> replace demvs2b = D*demvs2b
>>> replace demvs3b = D*demvs3b
>>> replace demvs4b = D*demvs4b
>>>
>>> regress realincome D demvsa demvs2a demvs3a demvs4a demvsb demvs2b demvs3b demvs4b
>>>
>>> *Orthpoly #2
>>>
>>> orthpoly x, deg(4) generate (demvs demvs2 demvs3 demvs4)
>>>
>>> replace demvsa = (1-D)*demvs
>>> replace demvs2a = (1-D)*demvs2
>>> replace demvs3a = (1-D)*demvs3
>>> replace demvs4a = (1-D)*demvs4
>>>
>>> replace demvsb = D*demvs
>>> replace demvs2b = D*demvs2
>>> replace demvs3b = D*demvs3
>>> replace demvs4b = D*demvs4
>>>
>>> regress realincome D demvsa demvs2a demvs3a demvs4a demvsb demvs2b demvs3b demvs4b
>>>
>>> *****
>>>
>>> And the results are:
>>>
>>> Orthpoly # 1
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>            D |  -2597.064   140.5829   -18.47   0.000    -2872.626   -2321.502
>>>       demvsa |  -853.4396   109.0927    -7.82   0.000    -1067.277   -639.6025
>>>      demvs2a |  -941.1276   109.0927    -8.63   0.000    -1154.965   -727.2905
>>>      demvs3a |   593.9881   109.0927     5.44   0.000      380.151    807.8252
>>>      demvs4a |   121.7433   109.0927     1.12   0.264    -92.09384    335.5804
>>>       demvsb |  -2006.552   88.66978   -22.63   0.000    -2180.357   -1832.747
>>>      demvs2b |  -620.1632   88.66978    -6.99   0.000    -793.9685   -446.3579
>>>      demvs3b |  -134.2237   88.66978    -1.51   0.130     -308.029    39.58156
>>>      demvs4b |   457.7355   88.66978     5.16   0.000     283.9302    631.5407
>>>        _cons |    32210.1   109.0927   295.25   0.000     31996.26    32423.93
>>> ------------------------------------------------------------------------------
>>>
>>> Orthpoly # 2
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>            D |  -15904.18   22026.78    -0.72   0.470    -59079.79    27271.42
>>>       demvsa |   56141.35   33816.59     1.66   0.097    -10143.95    122426.6
>>>      demvs2a |   42328.68   25413.63     1.67   0.096    -7485.616    92142.98
>>>      demvs3a |   19367.81   11950.96     1.62   0.105    -4057.754    42793.37
>>>      demvs4a |   3038.492   2722.757     1.12   0.264    -2298.496    8375.481
>>>       demvsb |  -40636.36   7469.378    -5.44   0.000     -55277.4   -25995.32
>>>      demvs2b |   47190.86   9181.907     5.14   0.000     29193.03     65188.7
>>>      demvs3b |  -33596.74   6331.021    -5.31   0.000    -46006.43   -21187.04
>>>      demvs4b |   7983.823   1546.578     5.16   0.000      4952.31    11015.33
>>>        _cons |   68128.44   21623.63     3.15   0.002     25743.08    110513.8
>>> ------------------------------------------------------------------------------
>>>
>>> The results using the earlier method (generating polynomials normally)
>>> gives the following after I change x to x - 0.5:
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+--------------------------------------------------
>>>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
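
On the graph question in the quoted thread: Nick's fix is to run the old command under version control. The sketch below shows that prefix applied to the command from the do-file, followed by a rough -twoway- translation for newer Stata. The translation is an assumption, not taken from the original do-files: it reads c(.) as unconnected points, l as a solid line, l[-] as a dashed line, s(o) as a small circle and s(i) as an invisible marker, and it turns the legend off on the guess that the blank t1()/t2() titles were there to suppress the key.

```stata
* Option 1: run the Stata 7 command unchanged under version control.
* The do-file uses ; as the command delimiter, so switch to it first.
#delimit ;
version 7: graph meanY100 fit1 fit2 int1U int1L int2U int2L dembin ,
    l1(" ") l2("ADA Score, time t") b1(" ") t1(" ") t2(" ")
    b2("Democrat Vote Share, time t-1") xlabel(0,.5,1) ylabel(0,.5,1)
    title(" ") xline(.5)
    c(.lll[-]l[-]l[-]l[-]) s(oiiii) sort saving(`1'_reduced.gph, replace);
#delimit cr

* Option 2 (assumed translation): a rough modern equivalent with -twoway-.
* Points for the binned means, solid lines for the fits, dashed lines for the bands.
twoway (scatter meanY100 dembin, msymbol(o))                                       ///
       (line fit1 fit2 dembin, sort)                                               ///
       (line int1U int1L int2U int2L dembin, sort lpattern(dash dash dash dash)),  ///
       xline(.5) xlabel(0 .5 1) ylabel(0 .5 1)                                     ///
       ytitle("ADA Score, time t") xtitle("Democrat Vote Share, time t-1")         ///
       legend(off) saving(`1'_reduced.gph, replace)
graph export `1'_reduced.eps, replace
```

Option 1 is the one confirmed in this thread. Option 2 will not reproduce the old graph pixel for pixel, but it avoids depending on the old graphics engine.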

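On Patrick's question of why the orthpoly versions give different coefficients: a tentative reading is that each specification is just a different basis for the same quartic on each side of the cutoff, so the coefficients need not agree even when the fit does. One way to check this is to compare fitted values rather than coefficients. The sketch below assumes the variables x, D and realincome from the quoted code are in memory; the names xl1-xl4, xr1-xr4, yhat_raw, yhat_orth and yhat_diff are hypothetical and introduced only for illustration.

```stata
* Raw powers of centered x (x = demvoteshare - 0.5), separately on each side.
forvalues k = 1/4 {
    gen double xl`k' = (1 - D) * x^`k'
    gen double xr`k' = D * x^`k'
}
* With raw powers centered at the cutoff, the coefficient on D is the jump at x = 0.
regress realincome D xl1 xl2 xl3 xl4 xr1 xr2 xr3 xr4
predict double yhat_raw, xb

* Re-run one of the orthpoly regressions from the quoted code, then keep its fit.
regress realincome D demvsa demvs2a demvs3a demvs4a demvsb demvs2b demvs3b demvs4b
predict double yhat_orth, xb

* If both bases span the same space, these differences should be near zero.
gen double yhat_diff = abs(yhat_raw - yhat_orth)
summarize yhat_diff
```

If the differences are negligible, the models agree and only the parameterization differs. Note that orthogonal polynomials are generally nonzero at x = 0, so in the orthpoly versions the coefficient on D by itself is not the discontinuity at the cutoff.
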
**References**:

- **Re: Re: st: Polynomial Fitting and RD Design**, *From:* "Patrick Button" <pbutton@uci.edu>
- **Re: Re: st: Polynomial Fitting and RD Design**, *From:* Alex Olssen <alex.olssen@gmail.com>
- **Re: st: Polynomial Fitting and RD Design**, *From:* Nick Cox <njcoxstata@gmail.com>
