`Hi Daniel and Dave`

`I suggest looking at things in a more abstract way.`

`In a linear model (y = ax), we estimate a.`

`The coefficient a equals its marginal effect, that is dy/dx = a.`

`a then indicates how much y changes when we increase x by a small increment.`

`Now we estimate a model that is non-linear in x (with OLS): y = ax + bx^2.`

`x^2 is the squared term (x*x). We estimate coefficients a and b.`

`What is the (total/compound) marginal effect of x? dy/dx = a + 2bx.`

`As one easily sees, the marginal effect dy/dx changes with x; dy/dx is no longer a constant.`

`So what happens if we neglect the squared-term coefficient (b) when interpreting the results? This would be like assuming that the non-linear function was linear, which is wrong. It misstates the true (total) marginal effect by 2bx (a in place of a + 2bx); whether that is an under- or overstatement depends on the sign of bx.`

`On the other hand, interpreting only b neglects the 'linear' part of the function and its contribution to the derivative dy/dx. And b by itself is not even the contribution of x^2 to the (total) marginal effect, which is 2bx.`

`In a sense, coefficients a and b have to be interpreted jointly; separately they make little sense.`
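A minimal sketch of the point above, assuming Stata 11 or later (for factor-variable notation and -margins-). The auto dataset and the variables price and mpg are only stand-ins for y and x; they are not from the original question.

```stata
* Sketch only: price plays the role of y, mpg the role of x
sysuse auto, clear

* y = cons + a*x + b*x^2, with the square entered via factor-variable notation
regress price c.mpg##c.mpg

* dy/dx = a + 2*b*x, computed by hand at x = 20
display _b[mpg] + 2*_b[c.mpg#c.mpg]*20

* the same derivative from -margins-, at several values of x
margins, dydx(mpg) at(mpg = (15 20 25 30))
```

Because the squared term is declared as c.mpg#c.mpg rather than as a separately generated variable, -margins- knows that both terms move together and reports a + 2bx at each requested value, with standard errors.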

`Hope this helps`

`Justina`

`-----owner-statalist@hsphsun2.harvard.edu wrote: -----`

`To: statalist@hsphsun2.harvard.edu`

`From: Daniel Feenberg <feenberg@nber.org>`

`Sent by: owner-statalist@hsphsun2.harvard.edu`

`Date: 09.02.2011 12:56AM`

`Subject: Re: st: Non-linear regression: interpretation`

`On Tue, 8 Feb 2011, David Greenberg wrote:`

`> It is true that the quadratic term taken by itself can be hard to`

`> interpret. If the linear term is also in the equation, the coefficient`

`> for the quadratic term would seem to be an answer to a question that`

`> cannot have a meaningful answer, namely, how much the dependent variable`

`> changes in response to a marginal change in the quadratic term, while`

`> holding the linear term constant. But it is impossible to hold x`

`> constant and allow x-squared to vary. However, the estimated`

`> coefficients of linear and quadratic terms together can be used to`

`> compute the estimated point at which the quadratic equation has a`

`> minimum or maximum, and that is something many researchers might want to`

`> know. One can also compute the value of the dependent variable at the`

`> minimum or maximum. David Greenberg, Sociology Department, New York`

`> University`
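A minimal sketch of the turning-point calculation described in the quoted paragraph above, assuming Stata 11 or later. The auto dataset, price and mpg are illustrative stand-ins, and the labels turning_point and y_at_turn are made up for this example.

```stata
* Sketch only: price plays the role of y, mpg the role of x
sysuse auto, clear
regress price c.mpg##c.mpg

* x at the minimum or maximum: dy/dx = a + 2*b*x = 0  =>  x = -a/(2b)
* -nlcom- adds a delta-method standard error and confidence interval
nlcom (turning_point: -_b[mpg] / (2*_b[c.mpg#c.mpg]))

* fitted y at that point: cons + a*x + b*x^2 at x = -a/(2b) is cons - a^2/(4b)
nlcom (y_at_turn: _b[_cons] - _b[mpg]^2 / (4*_b[c.mpg#c.mpg]))
```

It is worth checking whether the estimated turning point lies inside the observed range of x; if it does not, the fitted curve is monotone over the data.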

`If one takes the squared term about the mean of the variable, it`

`contributes nothing to the slope at the mean, leaving the linear term alone to describe`

`the effect of changes in the variable about the mean. That can make quick`

`interpretations of the coefficients possible. For example, if the mean of x`

`is 7, then define the squared term as`

`(x - 7)**2`

`instead of using x**2. This won't change any predictions or the t-stat on the`

`squared term (the coefficient and t-stat on the linear term do change, since`

`it now measures the slope at the mean), but the slope dy/dx at x=7 will just`

`be the coefficient on the linear term for x - no need to fuss with calculating`

`the contribution of the squared term.`
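A minimal sketch of this centering trick, assuming Stata 11 or later. The auto dataset, price and mpg are illustrative stand-ins for y and x, and xbar and mpg_c2 are names made up for this example.

```stata
* Sketch only: price plays the role of y, mpg the role of x
sysuse auto, clear

* store the sample mean of x
summarize mpg, meanonly
scalar xbar = r(mean)

* squared term taken about the mean instead of about zero
generate double mpg_c2 = (mpg - scalar(xbar))^2

* the coefficient on mpg is now the slope dy/dx at the mean of mpg
regress price mpg mpg_c2
display "slope at the mean (centered model): " _b[mpg]

* uncentered quadratic for comparison: a + 2*b*xbar
regress price c.mpg##c.mpg
display "a + 2*b*xbar (uncentered model):    " _b[mpg] + 2*_b[c.mpg#c.mpg]*scalar(xbar)
```

The two displayed numbers should agree up to rounding, since (x - m)^2 = x^2 - 2mx + m^2 only moves part of the quadratic into the linear term and the constant.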

`Daniel Feenberg`

`>`

`> ----- Original Message -----`

`> From: Maarten buis <maartenbuis@yahoo.co.uk>`

`> Date: Tuesday, February 8, 2011 4:55 am`

`> Subject: Re: st: Non-linear regression`

`> To: statalist@hsphsun2.harvard.edu`

`>`

`>`

`>> --- On Tue, 8/2/11, Hamizah Hassan wrote:`

`>>> I would like to run non-linear regression by including the`

`>>> linear and quadratic functions of the variable.`

`>>`

`>> Typically this is still referred to as a linear model, as the`

`>> model is still linear in the parameters.`

`>>`

`>>> I just realized that if the variable is in percentage, the`

`>>> quadratic figure is higher than the linear figure. However,`

`>>> if it is in decimal, it would be the other way around and`

`>>> definitely it will affect the meaning of the results.`

`>>`

`>> The models are mathematically equivalent. You can see that`

`>> by looking at the predictions.`

`>>`

`>> Generally, it is hard to give a substantive interpretation to`

`>> a quadratic term, regardless of how you scaled the original`

`>> variable. If you care about interpreting the coefficients but`

`>> still want to allow for non-linear effects, then your best`

`>> bet is probably to use linear splines (which, confusingly, are`

`>> actually a non-linear function...)`

`>>`

`>> Consider the example below. The first part shows that the`

`>> two quadratic models result in the same predicted values. The`

`>> final part displays linear splines as an alternative. The final`

`>> graph shows that they result in fairly similar predictions, but`

`>> the spline terms can actually be interpreted: the parameter for`

`>> fuel_cons1 tells you that for cars with a fuel-consumption of`

`>> less than 12 liters/100km an additional liter/100km leads to a`

`>> non-significant price increase of 62$ (=.062*1000$). The`

`>> parameter for fuel_cons2 tells you that for cars with a fuel`

`>> consumption of more than 12 liters/100km an additional liter`

`>> per 100 kilometers will lead to a significant price increase of`

`>> 1011$ (=1.011*1000$).`

`>>`

`>> *----------------- begin example -----------------`

`>> //================================== first part`

`>> sysuse auto, clear`

`>>`

`>> // since I am European and the question is about`

`>> // interpretation I first convert mpg from miles`

`>> // per gallon to liter / 100 km and price in`

`>> // 1000 $`

`>>`

`>> gen fuel_cons = 1/mpg * 3.78541178 / 1.609344 *100`

`>> label var fuel_cons "fuel consumption (l/100km)"`

`>>`

`>> replace price = price / 1000`

`>> label var price "price (1000$)"`

`>>`

`>> // create a "proportion-like" variable`

`>> sum fuel_cons , meanonly`

`>> gen prop = ( fuel_cons - r(min) ) / ( r(max) - r(min) )`

`>>`

`>> // take a look at that new variable`

`>> spikeplot prop, ylab(0 1 2)`

`>>`

`>> // turn it into percentages`

`>> gen perc = prop*100`

`>> spikeplot perc, ylab(0 1 2)`

`>>`

`>> // add square terms using the new`

`>> // factor variable notation`

`>> reg price c.prop##c.prop`

`>> predict yhat_prop`

`>>`

`>> reg price c.perc##c.perc`

`>> predict yhat_perc`

`>>`

`>> // compare predicted values`

`>> twoway function identity = x, ///`

`>> range( 13 31 ) lcolor(gs8) || ///`

`>> scatter yhat_prop yhat_perc, ///`

`>> aspect(1) msymbol(Oh)`

`>>`

`>> //================================== final part`

`>> // alternative with interpretable parameters`

`>>`

`>> // create splines`

`>> mkspline fuel_cons1 12 fuel_cons2 = fuel_cons`

`>>`

`>> reg price fuel_cons1 fuel_cons2`

`>> predict yhat_spline`

`>>`

`>> twoway scatter price fuel_cons || ///`

`>> line yhat_prop yhat_spline fuel_cons, ///`

`>> sort ytitle("price (1000 {c S|})") ///`

`>> legend(order( 1 "observations" ///`

`>> 2 "prediction," ///`

`>> "quadratic" ///`

`>> 3 "prediction," ///`

`>> "spline" ))`

`>> *---------------- end example --------------`

`>> (For more on examples I sent to the Statalist see:`

`>> http://www.maartenbuis.nl/example_faq )`

`>>`

`>> Hope this helps,`

`>> Maarten`

`>>`

`>> --------------------------`

`>> Maarten L. Buis`

`>> Institut fuer Soziologie`

`>> Universitaet Tuebingen`

`>> Wilhelmstrasse 36`

`>> 72074 Tuebingen`

`>> Germany`

`>>`

`>> http://www.maartenbuis.nl`

`>> --------------------------`

`>>`


`>> *`

`>> * For searches and help try:`

`>> * http://www.stata.com/help.cgi?search`

`>> * http://www.stata.com/support/statalist/faq`

`>> * http://www.ats.ucla.edu/stat/stata/`
