
# RE: st: SPLINE commands

 From Maarten buis <[email protected]> To [email protected] Subject RE: st: SPLINE commands Date Fri, 4 Feb 2011 16:42:12 +0000 (GMT)

```--- On Fri, 4/2/11, Ronald McDowell wrote:
> I'm not familiar with the concept of splines, and am
> looking for a gentle introduction to the area, in
> order to move beyond using quadratic and cubic etc
> terms in my models.

You could look at Marsh, Lawrence C. and David R. Cormier
(2002) "Spline Regression Models". Quantitative Applications
in the Social Sciences, nr. 137. Thousand Oaks: Sage.

I am actually moving back towards linear splines (from
more smooth restricted cubic, B-splines, etc.), as I
find linear splines to have a nicer balance between
interpretability of the parameters and flexibility of the
curve. Anyone who can interpret regular regression
parameters can also interpret the parameters of linear
spline terms.

Consider the example below:
*--------------- begin example --------------------
sysuse auto, clear
mkspline mpg1 20 mpg2 = mpg
reg price mpg1 mpg2 foreign

// use adjust to predict price while keeping foreign at 1
adjust foreign = 1, by(mpg) generate(yhat)

// graph the predicted price against mpg
twoway line yhat mpg, sort
*---------------- end example ----------------------

The graph illustrates what happened: we basically have
two linear regressions, one for cars with mpg < 20
and one for cars with mpg > 20, and the regression
lines meet at mpg == 20. Moreover, the standard
parameterization, as implemented by -mkspline-, lets
you interpret the coefficients of these splines as
regular regression coefficients. So, for cars with
mpg < 20, a one mile per gallon increase is associated
with a drop in price of 845 dollars, while for cars with
mpg > 20 the drop in price is an insignificant 70
dollars per mile per gallon.

As always there is a price that needs to be paid for
such convenient interpretability, and for linear
splines it is that sudden change in direction at the
knot and the linearity between the knots. Some people
find this not smooth enough or not realistic enough.
However, I am willing to sacrifice a lot of "realism"
of my model if that helps me to get across what I
have done to my data in order to arrive at my
conclusions. With linear splines one often must view
models as a useful summary/simplification of reality,
but isn't that what a model is supposed to be anyhow?

Having said all that, work has been done on making
the coefficients of other types of splines more
interpretable, but linear splines seem to me a
logical place to start before entering into more
complicated variations of it (and don't be afraid
to move back to linear splines once you have looked
at those variations).

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

```
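
To make the "standard parameterization" in the post concrete, the sketch below rebuilds the two spline terms by hand and then contrasts them with -mkspline-'s `marginal` option. It is an illustration added here, not part of the original post: it assumes the auto dataset that ships with Stata and the knot at mpg = 20 used in the post, and the variable names `chk1`, `chk2`, `m1` and `m2` are made up for the example.

```stata
* Sketch only: assumes the auto dataset shipped with Stata and a knot at mpg = 20.
* chk1, chk2, m1 and m2 are hypothetical names chosen for this illustration.
sysuse auto, clear

* Default parameterization: each coefficient is the slope within its own segment
mkspline mpg1 20 mpg2 = mpg
generate chk1 = min(mpg, 20)        // follows mpg up to the knot, then stays at 20
generate chk2 = max(mpg - 20, 0)    // 0 below the knot, mpg - 20 above it
assert abs(mpg1 - chk1) < 1e-6      // hand-built terms match what mkspline created
assert abs(mpg2 - chk2) < 1e-6
regress price mpg1 mpg2 foreign

* Marginal parameterization: the second coefficient is the change in slope at the knot
mkspline m1 20 m2 = mpg, marginal
regress price m1 m2 foreign
lincom m1 + m2                      // slope above the knot, comparable to mpg2 above
```

Both regressions fit the same piecewise-linear curve; they differ only in what the second coefficient reports. The default form gives the slope in each segment directly, which is the reading used in the post, while the `marginal` form is convenient when the question is whether the slope changes at the knot at all.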