RE: st: SPLINE commands

From	Maarten buis <[email protected]>
To	[email protected]
Subject	RE: st: SPLINE commands
Date	Fri, 4 Feb 2011 16:42:12 +0000 (GMT)

--- On Fri, 4/2/11, Ronald McDowell wrote:
> I'm not familiar with the concept of splines, and am
> looking for a gentle introduction to the area, in
> order to move beyond using quadratic and cubic etc
> terms in my models.

You could look at Marsh, Lawrence C. and David R. Cormier
(2002) "Spline Regression Models". Quantitative Applications
in the Social Sciences, nr. 137. Thousand Oaks: Sage.

I am actually moving back towards linear splines (from 
more smooth restricted cubic, B-splines, etc.), as I 
find linear splines to have a nicer balance between 
interpretability of the parameters and flexibility of the 
curve. Anyone who can interpret regular regression 
parameters can also interpret the parameters of linear 
spline terms. 

Consider the example below:
*--------------- begin example --------------------
sysuse auto, clear
mkspline mpg1 20 mpg2 = mpg
reg price mpg1 mpg2 foreign 

// use adjust to predict price while keeping foreign at 1
adjust foreign = 1, by(mpg) generate(yhat)

// graph the predicted price against mpg
twoway line yhat mpg, sort
*---------------- end example ----------------------

The graph illustrates what happens: we basically have 
two linear regressions, one for cars with mpg < 20
and one for cars with mpg > 20, and the regression
lines meet at mpg == 20. Moreover, the standard 
parameterization, as implemented by -mkspline-, lets
you interpret the coefficients of these splines as
regular regression coefficients. So, for cars with 
mpg < 20 an additional mile per gallon leads to a 
drop in price of 845 dollars, while for cars with
mpg > 20 the drop in price is an insignificant 70
dollars per mile per gallon.
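
To make the "standard parameterization" concrete, here is a 
minimal sketch of what I understand -mkspline- to create by 
default for a single knot at 20. The names m1 and m2 are 
just illustrative, and the -assert- lines check the 
hand-built terms against the -mkspline- output rather than 
taking my word for it.

```stata
sysuse auto, clear
mkspline mpg1 20 mpg2 = mpg

// hand-built versions of the default linear spline terms
// (m1 and m2 are hypothetical names used only for this check)
generate m1 = min(mpg, 20)      // follows mpg up to the knot, then stays at 20
generate m2 = max(mpg - 20, 0)  // 0 below the knot, then mpg - 20

// stop with an error if the hand-built terms differ from -mkspline-
assert m1 == mpg1
assert m2 == mpg2

// each coefficient is the slope within its own segment,
// so equality of the two slopes can be tested directly
regress price mpg1 mpg2 foreign
test mpg1 = mpg2
```

Because each term only changes within its own segment, the 
coefficient on mpg1 is the slope below the knot and the 
coefficient on mpg2 is the slope above it. -mkspline- also 
has a -marginal- option, in which case the second 
coefficient is the change in slope at the knot rather than 
the slope itself.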

As always there is a price that needs to be paid for 
such convenient interpretability, and for linear 
splines it is the sudden change in direction at the
knot and the linearity between the knots. Some people 
find this not smooth enough or not realistic enough. 
However, I am willing to sacrifice a lot of "realism" 
in my model if that helps me to get across what I 
have done to my data in order to arrive at my 
conclusions. With linear splines one often must view 
models as a useful summary/simplification of reality, 
but isn't that what a model is supposed to be anyhow?

Having said all that, work has been done on making 
the coefficients of other types of splines more 
interpretable, but linear splines seem to me a 
logical place to start before entering into more 
complicated variations (and don't be afraid
to move back to linear splines once you have looked
at those variations).  

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
