Re: Re: st: Polynomial Fitting and RD Design

From	Alex Olssen <[email protected]>
To	[email protected]
Subject	Re: Re: st: Polynomial Fitting and RD Design
Date	Sat, 10 Sep 2011 22:46:26 +1200
Hi,

I have a question.  It's not quite on this topic, but it is related to
the replication of Lee, Moretti, and Butler.
The do-files linked in the original post contain the following lines
of code:

graph meanY100 fit1 fit2 int1U int1L int2U int2L dembin ,
l1(" ") l2("ADA Score, time t") b1(" ") t1(" ") t2(" ")
b2("Democrat Vote Share, time t-1")  xlabel(0,.5,1) ylabel (0,.5,1)
title(" ") xline(.5)
c(.lll[-]l[-]l[-]l[-]) s(oiiii) sort saving(`1'_reduced.gph, replace);
translate `1'_reduced.gph `1'_reduced.eps, replace;

I can't get this to work.  I have never seen the graph command used
like this before - I always used graph twoway, etc.

In any case, the error I get is:
"meanY100graph_g.new fit1 fit2 int1U int1L int2U int2L dembin ,: class
member function not found"

Can anyone help me with this?

Cheers,
Alex
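
A possible workaround, offered as a sketch rather than a tested fix for the replication files: the syntax above looks like the old (Stata 7 and earlier) graph command, which lists the y-variables first and the x-variable last. From Stata 8 on, graph expects a subcommand such as twoway, which would explain the "class member function not found" message, and the old syntax is still available as graph7. The semicolons in the original suggest the do-file runs under #delimit ;. The example below uses made-up data and only two of the y-variables; whether the later translate step accepts a graph saved this way is not checked here.

```stata
* Toy data standing in for the replication variables (names reused for
* illustration only; the values are made up).
clear
set obs 50
gen double dembin   = (_n - 0.5) / 50
gen double fit1     = 0.3 + 0.4 * (dembin >= 0.5)
gen double meanY100 = fit1 + 0.05 * sin(_n)

* Old-style syntax: y-variables first, x-variable last.
* c() sets the connect styles and s() the marker symbols, as in the
* original do-file. The /// continuation works in do-files only.
graph7 meanY100 fit1 dembin, xlabel(0,.5,1) ylabel(0,.5,1) xline(.5) ///
    c(.l) s(oi) sort
```

In the original command, replacing graph with graph7 and leaving the options untouched would be the first thing to try.
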
On 6 September 2011 07:40, Patrick Button <[email protected]> wrote:
>
> Thank you for the feedback everyone. It has been extremely useful and now
> I am not freaking out as much.
>
> First, I've changed x to x - 0.5 as per Austin Nichols' suggestion. This
> makes interpretation easier. I should have done this earlier.
>
> I was thinking that my replication was going to involve a critique, Nick
> Cox, and I agree with you and others that the 4th order polynomials are
> somewhat fishy.
>
> The weird thing about the paper is that the authors say that they are
> using 4th degree polynomials on either side of the discontinuity, but
> their graphs and/or code indicate that they are just using one polynomial
> to fit the entire thing. Not sure why that is... So in trying to do the
> 4th degree polynomial for each side on my own, I've run into this issue of
> results being weird. Now that I understand why, it makes perfect sense.
>
> As for whether the 4th degree polynomial is ideal, I would agree with all of
> you that it probably is not. If one is going to go with polynomials, the
> ideal degree depends on the bandwidth you use. Ariel Linden described this
> really well earlier.
>
> Larger bandwidths mean more precision, but more bias. Smaller bandwidths
> (say only using data within +/- 2 percentage points of 50%) lead to the
> opposite. Lee and Lemieux (2010)
> (http://faculty.arts.ubc.ca/tlemieux/papers/RD_JEL.pdf) discuss that the
> optimal polynomial degree is a function of the bandwidth.
>
> One way to choose the degree is the Akaike Information Criterion (AIC).
> I'm going to stick with the 4th degree polynomial (and the entire
> dataset), then I'll try other polynomials and bandwidths, and then kernel
> after that. I need to do the replication first, THEN I will critique that
> by going with something more realistic. The -rd- package should be really
> useful for that. Thanks so much for all the discussion about a more
> realistic model. The key thing is that results should be robust to several
> different types of fitting and bandwidths, so long as they are realistic
> in the first place.
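
The degree-by-bandwidth comparison described above could be sketched roughly as follows. This uses simulated data, not the replication dataset; the variable names y, x, D, the xa*/xb* terms, and the bandwidth h = 0.25 are all assumptions made for illustration. AIC is computed by hand as -2*lnL + 2*k from the results regress stores, and the comparison only makes sense within a fixed bandwidth, since changing the bandwidth changes the estimation sample.

```stata
* Sketch on simulated data; y, x, D and the bandwidth h are illustrative.
clear
set seed 12345
set obs 2000
gen double x = runiform() - 0.5          // running variable centred at the cutoff
gen byte   D = x >= 0                    // 1 at or above the cutoff
gen double y = 1 + 0.5*D + 2*x - 3*x^2 + rnormal()

* Powers of x, kept separate on each side of the cutoff
forvalues k = 1/4 {
    gen double xa`k' = (1 - D) * x^`k'
    gen double xb`k' = D * x^`k'
}

local h 0.25                             // hypothetical bandwidth
local rhs "D"
forvalues p = 1/4 {
    * add the degree-p terms for both sides to the specification
    local rhs "`rhs' xa`p' xb`p'"
    quietly regress y `rhs' if abs(x) <= `h'
    * AIC = -2*lnL + 2*k, with k the number of estimated parameters
    scalar aic`p' = -2*e(ll) + 2*e(rank)
    display "degree `p': AIC = " %10.2f aic`p' "   jump at cutoff = " %7.3f _b[D]
}
```

Repeating the loop for a few values of h shows how sensitive the estimated jump is to the two choices together.
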
>
> As for using orthog/orthpoly to generate orthogonal polynomials, I gave
> that a shot. Thank you very much for the suggestion, Martin Buis.
>
> I've done the orthogonalization two different ways. The two give different
> results, neither of which mirrors the results where I create the
> polynomials in the regular fashion. I'm not sure which method is
> "correct". I'm also unsure why the results are significantly different.
> Any suggestions would be very helpful.
>
> Orthpoly #1 uses orthpoly separately on each side of the discontinuity.
> Orthpoly #2 does it for all the data.
>
> The code and output are below:
>
> *****
>
> drop if demvoteshare==.
> keep if realincome~=.
> drop demvs2 demvs3 demvs4
>
> gen double x = demvoteshare - 0.5
>
> gen D = 1 if x >= 0
> replace D = 0 if x < 0
>
> *Orthpoly #1
>
> *Creating orthogonal polynomials separately for each side.
>
> orthpoly x if x < 0, deg(4) generate(demvsa demvs2a demvs3a demvs4a)
> orthpoly x if x >= 0, deg(4) generate(demvsb demvs2b demvs3b demvs4b)
> replace demvsa = 0 if demvsa==.
> replace demvsb = 0 if demvsb==.
> replace demvs2a = 0 if demvs2a==.
> replace demvs2b = 0 if demvs2b==.
> replace demvs3a = 0 if demvs3a==.
> replace demvs3b = 0 if demvs3b==.
> replace demvs4a = 0 if demvs4a==.
> replace demvs4b = 0 if demvs4b==.
>
> replace demvsa = (1-D)*demvsa
> replace demvs2a = (1-D)*demvs2a
> replace demvs3a = (1-D)*demvs3a
> replace demvs4a = (1-D)*demvs4a
>
> replace demvsb = D*demvsb
> replace demvs2b = D*demvs2b
> replace demvs3b = D*demvs3b
> replace demvs4b = D*demvs4b
>
> regress realincome D demvsa demvs2a demvs3a demvs4a ///
>     demvsb demvs2b demvs3b demvs4b
>
> *Orthpoly #2
>
> orthpoly x, deg(4) generate (demvs demvs2 demvs3 demvs4)
>
> replace demvsa = (1-D)*demvs
> replace demvs2a = (1-D)*demvs2
> replace demvs3a = (1-D)*demvs3
> replace demvs4a = (1-D)*demvs4
>
> replace demvsb = D*demvs
> replace demvs2b = D*demvs2
> replace demvs3b = D*demvs3
> replace demvs4b = D*demvs4
>
> regress realincome D demvsa demvs2a demvs3a demvs4a ///
>     demvsb demvs2b demvs3b demvs4b
>
> *****
>
> And the results are:
>
>
> Orthpoly # 1
>
> ------------------------------------------------------------------------------
>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            D |  -2597.064   140.5829   -18.47   0.000    -2872.626   -2321.502
>       demvsa |  -853.4396   109.0927    -7.82   0.000    -1067.277   -639.6025
>      demvs2a |  -941.1276   109.0927    -8.63   0.000    -1154.965   -727.2905
>      demvs3a |   593.9881   109.0927     5.44   0.000      380.151    807.8252
>      demvs4a |   121.7433   109.0927     1.12   0.264    -92.09384    335.5804
>       demvsb |  -2006.552   88.66978   -22.63   0.000    -2180.357   -1832.747
>      demvs2b |  -620.1632   88.66978    -6.99   0.000    -793.9685   -446.3579
>      demvs3b |  -134.2237   88.66978    -1.51   0.130     -308.029    39.58156
>      demvs4b |   457.7355   88.66978     5.16   0.000     283.9302    631.5407
>        _cons |    32210.1   109.0927   295.25   0.000     31996.26    32423.93
> ------------------------------------------------------------------------------
>
>
> Orthpoly # 2
>
> ------------------------------------------------------------------------------
>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            D |  -15904.18   22026.78    -0.72   0.470    -59079.79    27271.42
>       demvsa |   56141.35   33816.59     1.66   0.097    -10143.95    122426.6
>      demvs2a |   42328.68   25413.63     1.67   0.096    -7485.616    92142.98
>      demvs3a |   19367.81   11950.96     1.62   0.105    -4057.754    42793.37
>      demvs4a |   3038.492   2722.757     1.12   0.264    -2298.496    8375.481
>       demvsb |  -40636.36   7469.378    -5.44   0.000     -55277.4   -25995.32
>      demvs2b |   47190.86   9181.907     5.14   0.000     29193.03     65188.7
>      demvs3b |  -33596.74   6331.021    -5.31   0.000    -46006.43   -21187.04
>      demvs4b |   7983.823   1546.578     5.16   0.000      4952.31    11015.33
>        _cons |   68128.44   21623.63     3.15   0.002     25743.08    110513.8
> ------------------------------------------------------------------------------
>
> The results using the earlier method (generating polynomials normally)
> give the following after I change x to x - 0.5:
>
> ------------------------------------------------------------------------------
>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>            D |   1616.347   605.9781     2.67   0.008     428.5441    2804.149
>           xa |   23487.01   15519.78     1.51   0.130    -6933.964    53907.98
>          x2a |   334659.2   153845.2     2.18   0.030     33100.93    636217.5
>          x3a |   905964.7     546408     1.66   0.097    -165072.1     1977001
>          x4a |   667809.6   598416.3     1.12   0.264    -505170.9     1840790
>           xb |  -60833.88   12050.57    -5.05   0.000    -84454.71   -37213.06
>          x2b |   538597.3   105340.2     5.11   0.000     332115.5      745079
>          x3b |   -1771874   334373.4    -5.30   0.000     -2427293    -1116455
>          x4b |    1754710     339912     5.16   0.000      1088435     2420986
>        _cons |   31122.81   454.4263    68.49   0.000     30232.07    32013.55
> ------------------------------------------------------------------------------
>
> Any ideas would be great and I greatly appreciate everyone's assistance.
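
One observation that may help here: in all three tables the t statistics on the 4th-degree terms are identical (1.12 and 5.16). That is what one would expect if the three specifications span the same space and differ only in how the fit is parameterized. In that case the coefficient on D is the gap between the two fitted curves at the point where the other regressors are all zero, which is x = 0 only for the raw polynomials. The sketch below checks the equal-fit claim on simulated data (not the replication dataset); the names y, pa*, pb*, xa*, xb* and the simulated model are assumptions for illustration.

```stata
* Sketch on simulated data: raw vs. per-side orthogonal polynomials.
clear
set seed 2011
set obs 2000
gen double x = runiform() - 0.5
gen byte   D = x >= 0
gen double y = 1 + 0.3*D + 2*x - 3*x^2 + rnormal(0, 0.5)

* Orthogonal polynomials built separately on each side, zero elsewhere
orthpoly x if x < 0,  degree(4) generate(pa1 pa2 pa3 pa4)
orthpoly x if x >= 0, degree(4) generate(pb1 pb2 pb3 pb4)
foreach v of varlist pa? pb? {
    replace `v' = 0 if missing(`v')
}
regress y D pa? pb?
predict double yhat_orth
display "coefficient on D, orthogonal terms: " _b[D]

* Raw powers of x, separate on each side
forvalues k = 1/4 {
    gen double xa`k' = (1 - D) * x^`k'
    gen double xb`k' = D * x^`k'
}
regress y D xa? xb?
predict double yhat_raw
display "coefficient on D, raw terms (jump at x = 0): " _b[D]

* Same fitted values, up to rounding, despite different coefficients
assert abs(yhat_raw - yhat_orth) < 1e-6
```

If the assert passes on the real data as well, the discontinuity estimate to report is the one from the raw-polynomial regression, or equivalently the difference between the two fitted curves evaluated at x = 0.
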
>
> --
> Patrick Button
> Ph.D. Student
> Department of Economics
> University of California, Irvine
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- Re: st: Polynomial Fitting and RD Design
  - From: Nick Cox <[email protected]>
References:
- Re: Re: st: Polynomial Fitting and RD Design
  - From: "Patrick Button" <[email protected]>