Re: st: Polynomial Fitting and RD Design


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Polynomial Fitting and RD Design
Date   Thu, 1 Sep 2011 11:24:20 +0200

On Thu, Sep 1, 2011 at 10:31 AM, Nick Cox wrote:
> Sure, but that still leaves the non-numeric issues. I guess the main
> issue is just reproducing behaviour with smooth curves, but what
> arguments justify any kind of quartic here?

No disagreement with you on that point. Actually, I think that such a
high-degree polynomial is rather dangerous for this purpose, as these
curves tend to move wildly away from the data at the extreme ends of
the curve, and in these models the break is such an extreme end. As a
consequence, the break dummy may just capture this misfit to the data
rather than a real break. Patrick may want to consider a fractional
polynomial model instead. Below is an example of how to estimate both
models. The graph shows that the quartic curve does exhibit that wild
behavior at the break, and the fractional polynomial model shows that
this is due to overfitting, as in this case two straight lines will do
just fine.

*--------------- begin example -----------------
sysuse uslifeexp, clear
drop if year == 1918 // Spanish flu pandemic
gen cyear = year - 1950 // center at break

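// 4th degree polynomial
orthpoly cyear , gen(oyear*) degree(4)
gen D = cyear > 0 if year < . // D = 1 after the break (1951 onwards)
// separate polynomial terms left (l) and right (r) of the break
forvalues i = 1/4 {
	gen oyear`i'l = (1-D)*oyear`i'
}
forvalues i = 1/4 {
	gen oyear`i'r = D*oyear`i'
}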

// fit model
reg le oyear?? D

// predict outcome
predict pol

// fractional polynomial
gen cyearl = (1-D)*cyear
gen cyearr = D*cyear

// fit model
mfp, df(8) : reg le cyearl cyearr D

// predict outcome
predict mfp

// Graph the models
twoway line le pol mfp year,            ///
       xline(1950)                      ///
       lpattern(solid solid solid)      ///
       lcolor(black red blue)           ///
       legend(order( 1 "data"           ///
                     2 "quartic"        ///
                     3 "fractional"     ///
                       "polynomial"  ))
*---------------- end example ------------------
 (For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
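
As a rough check of the claim that two straight lines suffice here, the
sketch below fits a linear trend on each side of the break and displays
the estimated jump at 1950. It is not part of the original example: it
assumes Stata 11 or later for the factor-variable notation, and the
variable name lin is only an illustrative choice.

*--------------- begin example -----------------
sysuse uslifeexp, clear
drop if year == 1918 // Spanish flu pandemic
gen cyear = year - 1950 // center at break
gen D = cyear > 0 if year < .

// separate intercept and slope on each side of the break
reg le c.cyear##i.D

// estimated jump in life expectancy at the break (cyear == 0)
display _b[1.D]

// fitted values, for comparison with the quartic and mfp curves
predict lin
*---------------- end example ------------------

Because cyear is centered at the break, the coefficient on 1.D is the
difference between the two fitted lines at 1950, which can be compared
with the coefficient on D from the quartic model.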

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------