Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Maarten buis <maartenbuis@yahoo.co.uk> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Non-linear regression |
Date | Tue, 8 Feb 2011 09:54:22 +0000 (GMT) |
--- On Tue, 8/2/11, Hamizah Hassan wrote:
> I would like to run non-linear regression by including the
> linear and quadratic functions of the variable.

Typically this is still referred to as a linear model, as the
model is still linear in the parameters.

> I just realize that if the variable is in percentage, the
> quadratic figure is higher than the linear figure. However,
> if it is in decimal, it would be the other way around and
> definitely it will effect on the meaning of the results.

The models are mathematically equivalent. You can see that by
looking at the predictions.

Generally, it is hard to give a substantive interpretation to a
quadratic term, regardless of how you scaled the original
variable. If you care about interpreting the coefficients but
still want to allow for non-linear effects, then your best bet
is probably to use linear splines (which confusingly is actually
a non-linear function...)

Consider the example below. The first part shows that the two
quadratic models result in the same predicted values. The final
part displays linear splines as an alternative. The final graph
shows that they result in fairly similar predictions, but the
spline terms can actually be interpreted: the parameter for
fuel_cons1 tells you that for cars with a fuel consumption of
less than 12 liters/100km an additional liter/100km leads to a
non-significant price increase of 62$ (=.062*1000$). The
parameter for fuel_cons2 tells you that for cars with a fuel
consumption of more than 12 liters/100km an additional
liter/100km leads to a significant price increase of 1011$
(=1.011*1000$).
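The claim that the proportion and percentage versions of the quadratic model are equivalent can be made explicit with a short derivation. This is only a sketch, assuming the two variables differ solely by the rescaling perc = 100*prop used in the example; the symbols b_0, b_1, b_2 (coefficients on the proportion scale) are notation introduced here and do not appear in the original post.

```latex
\begin{align*}
\widehat{\text{price}}
  &= b_0 + b_1\,\text{prop} + b_2\,\text{prop}^2 \\
  &= b_0 + b_1\,\frac{\text{perc}}{100}
         + b_2\left(\frac{\text{perc}}{100}\right)^{2} \\
  &= b_0 + \frac{b_1}{100}\,\text{perc}
         + \frac{b_2}{10000}\,\text{perc}^2
\end{align*}
% Assumption: perc = 100 * prop, as in the example.
% The percentage-scale coefficients are therefore
% b_0, b_1/100 and b_2/10000.
```

Moving from proportions to percentages thus divides the linear coefficient by 100 and the quadratic coefficient by 10,000. That is why their relative sizes can flip while the fitted curve, and hence every predicted value, stays the same.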

*----------------- begin example -----------------
//================================== first part
sysuse auto, clear

// since I am European and the question is about
// interpretation I first convert mpg from miles
// per gallon to liter / 100 km and price in
// 1000 $
gen fuel_cons = 1/mpg * 3.78541178 / 1.609344 *100
label var fuel_cons "fuel consumption (l/100km)"
replace price = price / 1000
label var price "price (1000$)"

// create a "proportion-like" variable
sum fuel_cons , meanonly
gen prop = ( fuel_cons - r(min) ) / ( r(max) - r(min) )

// take a look at that new variable
spikeplot prop, ylab(0 1 2)

// turn it into percentages
gen perc = prop*100
spikeplot perc, ylab(0 1 2)

// add square terms using the new
// factor variable notation
reg price c.prop##c.prop
predict yhat_prop
reg price c.perc##c.perc
predict yhat_perc

// compare predicted values
twoway function identity = x,           ///
       range( 13 31 ) lcolor(gs8) ||    ///
       scatter yhat_prop yhat_perc,     ///
       aspect(1) msymbol(Oh)

//================================== final part
// alternative with interpretable parameters
// create splines
mkspline fuel_cons1 12 fuel_cons2 = fuel_cons
reg price fuel_cons1 fuel_cons2
predict yhat_spline

twoway scatter price fuel_cons ||            ///
       line yhat_prop yhat_spline fuel_cons, ///
       sort ytitle("price (1000 {c S|})")    ///
       legend(order( 1 "observations"        ///
                     2 "prediction,"         ///
                       "quadratic"           ///
                     3 "prediction,"         ///
                       "spline" ))
*---------------- end example --------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/