Re: st: Non-linear regression

From   Maarten Buis
Subject   Re: st: Non-linear regression
Date   Tue, 8 Feb 2011 09:54:22 +0000 (GMT)

--- On Tue, 8/2/11, Hamizah Hassan wrote:
> I would like to run non-linear regression by including the
> linear and quadratic functions of the variable. 

Typically this is still referred to as a linear model, as the
model is still linear in the parameters.

> I just realize that if the variable is in percentage, the
> quadratic figure is higher than the linear figure. However,
> if it is in decimal, it would be the other way around and
> definitely it will effect on the meaning of the results. 

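The models are mathematically equivalent: rescaling the variable
only rescales the coefficients, it does not change the fit. You
can see that by looking at the predictions, which are identical.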
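
The equivalence can also be checked on the coefficients. What
follows is a minimal sketch, not part of the original example:
the scalar names b1_prop and b2_prop are made up for
illustration, and the variables are stored as doubles to keep
rounding error small. Since perc = 100*prop, the linear
coefficient should shrink by a factor 100 and the quadratic
coefficient by a factor 100^2 = 10000.

```stata
*----------------- begin example -----------------
sysuse auto, clear
gen double fuel_cons = 1/mpg * 3.78541178 / 1.609344 * 100

// proportion-like and percentage versions of the same variable
sum fuel_cons, meanonly
gen double prop = ( fuel_cons - r(min) ) / ( r(max) - r(min) )
gen double perc = prop*100

// quadratic model on the proportion scale
reg price c.prop##c.prop
scalar b1_prop = _b[prop]
scalar b2_prop = _b[c.prop#c.prop]

// quadratic model on the percentage scale
reg price c.perc##c.perc

// each pair should agree up to rounding error
di "linear:    " b1_prop/100   "  " _b[perc]
di "quadratic: " b2_prop/10000 "  " _b[c.perc#c.perc]
*------------------ end example ------------------
```

So the relative size of the linear and quadratic coefficients
says something about the unit of the variable, not about the
shape of the curve.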

Generally, it is hard to give a substantive interpretation to
a quadratic term, regardless of how you scaled the original
variable. If you care about interpreting the coefficients but
still want to allow for non-linear effects, then your best
bet is probably to use linear splines (which, confusingly, are
actually a non-linear function...)

Consider the example below. The first part shows that the
two quadratic models result in the same predicted values. The
final part displays linear splines as an alternative. The final
graph shows that the quadratic and spline models result in
fairly similar predictions, but the spline terms can actually
be interpreted: the parameter for fuel_cons1 tells you that for
cars with a fuel consumption of less than 12 liters/100km an
additional liter/100km leads to a non-significant price
increase of 62$ (=.062*1000$). The parameter for fuel_cons2
tells you that for cars with a fuel consumption of more than
12 liters/100km an additional liter/100km leads to a
significant price increase of 1011$ (=1.011*1000$).

*----------------- begin example -----------------
//================================== first part
sysuse auto, clear

// since I am European and the question is about
// interpretation I first convert mpg from miles
// per gallon to liter / 100 km and price in 
// 1000 $

gen fuel_cons = 1/mpg * 3.78541178 / 1.609344 *100
label var fuel_cons "fuel consumption (l/100km)"

replace price = price / 1000
label var price "price (1000$)"

// create a "proportion-like" variable
sum fuel_cons , meanonly
gen prop = ( fuel_cons - r(min) ) / ( r(max) - r(min) )

// take a look at that new variable
spikeplot prop, ylab(0 1 2)

// turn it into percentages
gen perc = prop*100
spikeplot perc, ylab(0 1 2)

// add square terms using the new
// factor variable notation
reg price c.prop##c.prop
predict yhat_prop

reg price c.perc##c.perc
predict yhat_perc

// compare predicted values; the reference line
// spans the observed range of the predictions
twoway function identity = x,            ///
       range( yhat_perc ) lcolor(gs8) || ///
       scatter yhat_prop yhat_perc,      ///
       aspect(1) msymbol(Oh)

//================================== final part	   
// alternative with interpretable parameters

// create splines
mkspline fuel_cons1 12 fuel_cons2 = fuel_cons

reg price fuel_cons1 fuel_cons2
predict yhat_spline

twoway scatter price fuel_cons  ||           ///
       line yhat_prop yhat_spline fuel_cons, ///
       sort ytitle("price (1000 {c S|})")    ///
       legend(order( 1 "observations"        ///
                     2 "prediction,"         ///
                       "quadratic"           ///
                     3 "prediction,"         ///
                       "spline" ))       
*---------------- end example --------------
(For more on examples I sent to the Statalist see: )
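
Whether the two spline slopes actually differ from one another
can be checked directly. What follows is a minimal sketch that
rebuilds the variables so it runs on its own; the names m1, m2,
s1 and s2 are made up for illustration. It uses -test- after the
spline regression, and then the marginal option of -mkspline-,
which makes the second coefficient the change in slope at the
knot rather than the slope itself.

```stata
*----------------- begin example -----------------
sysuse auto, clear
gen fuel_cons = 1/mpg * 3.78541178 / 1.609344 * 100
replace price = price / 1000

// default splines: each coefficient is the slope
// within its own segment
mkspline fuel_cons1 12 fuel_cons2 = fuel_cons
reg price fuel_cons1 fuel_cons2
scalar s1 = _b[fuel_cons1]
scalar s2 = _b[fuel_cons2]

// test whether the slopes below and above 12 l/100km differ
test fuel_cons1 = fuel_cons2

// marginal splines: the coefficient of m2 is the
// change in slope at the knot
mkspline m1 12 m2 = fuel_cons, marginal
reg price m1 m2

// these two numbers should agree up to rounding error
di s2 - s1 "  " _b[m2]
*------------------ end example ------------------
```

Both parameterizations give the same predictions; they only
differ in which quantity shows up in the regression table with
its own standard error.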

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen

