Re: st: Interpreting coefficients of (logX)^2 variable in pooled OLS regression [SEC=UNOFFICIAL]


From   David Hoaglin <dchoaglin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Interpreting coefficients of (logX)^2 variable in pooled OLS regression [SEC=UNOFFICIAL]
Date   Fri, 24 May 2013 22:11:04 -0400

Hi, Lucy.

Apart from reporting that the coefficient of ldistsq is positive, I
wonder whether it is necessary to give a separate interpretation to
that coefficient.  The relation is between lfare and a function of
ldist (after adjusting for differences among the years), and that
function involves both a quadratic term and a linear term.  That is,
the quadratic term and the linear term should be taken as a unit.  It
may be helpful to plot the fitted curves relating fare and dist (i.e.,
transform back from the log scale to the data scale), a separate curve
for each year.
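
To illustrate treating the two terms as a unit, here is a minimal sketch in Python (rather than Stata) that uses only the coefficients reported in the output quoted below. The slope of lfare with respect to ldist is b_ldist + 2*b_ldistsq*ldist, so it changes with distance. The function names and the "base" label for the omitted year are assumptions made for this illustration, and exp() of the fitted lfare is a simple back-transformation with no retransformation adjustment.

```python
import math

# Coefficients copied from the quoted regress output.
B_CONS = 6.239633
B_LDIST = -0.783627
B_LDISTSQ = 0.0897726
# Year shifts relative to the omitted base year ("base" is a label assumed here).
YEAR_SHIFT = {"base": 0.0, "y98": 0.024341, "y99": 0.0350861, "y00": 0.0959191}


def elasticity(dist):
    """Slope of lfare with respect to ldist at a distance on the data scale."""
    return B_LDIST + 2.0 * B_LDISTSQ * math.log(dist)


def fitted_fare(dist, year="base"):
    """exp of the fitted lfare (simple back-transformation, no adjustment)."""
    ldist = math.log(dist)
    lfare = B_CONS + B_LDIST * ldist + B_LDISTSQ * ldist ** 2 + YEAR_SHIFT[year]
    return math.exp(lfare)


# Distance at which the fitted curve in ldist has zero slope.
turning_dist = math.exp(-B_LDIST / (2.0 * B_LDISTSQ))

if __name__ == "__main__":
    # Tabulate the slope and the back-transformed fit at a few distances.
    for d in (250, 500, 1000, 2000):
        print(d, round(elasticity(d), 3), round(fitted_fare(d, "y00"), 2))
```

With these coefficients the zero-slope point falls at a value of dist of roughly 79, so whether the fitted slope is positive over the observed data depends on the range of dist in the data set. Evaluating fitted_fare over a grid of distances for each key of YEAR_SHIFT gives the points for the per-year curves.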

I wonder whether the relation of lfare to ldist is truly quadratic.
You have enough data to try a piecewise-constant model: Split the
range of ldist into disjoint intervals, each containing, say, at least
50 observations, choose one of those intervals as the reference
category, and create an indicator variable for each of the other
intervals.  Then use that set of indicator variables as the
predictors, instead of ldist and ldistsq.  If you then plot the
coefficient of each indicator variable against the value of ldist at
the midpoint of its interval, you can get a good impression of the
shape of the nonlinearity.  It might, for example, resemble a linear
spline.
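
The piecewise-constant idea can be sketched as follows, in Python with NumPy rather than Stata. This is one possible reading of the steps above, not a definitive implementation: the function name, the use of quantile cut points to form the intervals, and the choice of the lowest interval as the reference category are all assumptions made for the illustration.

```python
import numpy as np


def piecewise_constant_fit(ldist, lfare, year_dummies, n_bins=10, min_count=50):
    """Regress lfare on interval indicators for ldist plus year dummies.

    Returns (midpoints, coefs): the midpoint of each ldist interval and the
    coefficient of its indicator. The lowest interval is the reference
    category, so its coefficient is 0 by construction.
    """
    ldist = np.asarray(ldist, dtype=float)
    lfare = np.asarray(lfare, dtype=float)
    years = np.asarray(year_dummies, dtype=float).reshape(len(ldist), -1)

    # Quantile cut points give disjoint intervals with roughly equal counts.
    edges = np.quantile(ldist, np.linspace(0.0, 1.0, n_bins + 1))
    # Interval index 0..n_bins-1 for each observation.
    bins = np.clip(np.searchsorted(edges, ldist, side="right") - 1, 0, n_bins - 1)

    counts = np.bincount(bins, minlength=n_bins)
    if counts.min() < min_count:
        raise ValueError("an interval has fewer than min_count observations")

    # Indicator columns for every interval except the reference (interval 0).
    indicators = (bins[:, None] == np.arange(1, n_bins)).astype(float)
    X = np.column_stack([np.ones(len(ldist)), indicators, years])
    beta, *_ = np.linalg.lstsq(X, lfare, rcond=None)

    midpoints = (edges[:-1] + edges[1:]) / 2.0
    coefs = np.concatenate([[0.0], beta[1:n_bins]])
    return midpoints, coefs
```

Calling the function with the ldist and lfare columns and an n-by-3 array holding y98, y99, and y00, then plotting coefs against midpoints, gives the picture described above.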

Also, what do you see when you look at the relation of lfare to ldist
for each year separately?  Would it be helpful to include interactions
with year in your model?
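
A minimal sketch of the year-by-year comparison, again in Python with NumPy rather than Stata: it fits the same quadratic in ldist separately within each year. The function name and the assumption that a single variable identifies the year of each observation are made for this illustration. If the ldist and ldistsq coefficients differ noticeably across years, that suggests interactions with year may be worth including.

```python
import numpy as np


def quadratic_fit_by_year(ldist, lfare, year):
    """Fit lfare = cons + b1*ldist + b2*ldist^2 separately for each year."""
    ldist = np.asarray(ldist, dtype=float)
    lfare = np.asarray(lfare, dtype=float)
    year = np.asarray(year)

    fits = {}
    for y in np.unique(year):
        mask = year == y
        # np.polyfit returns coefficients from the highest degree down.
        b2, b1, b0 = np.polyfit(ldist[mask], lfare[mask], deg=2)
        fits[y.item()] = {"cons": b0, "ldist": b1, "ldistsq": b2}
    return fits
```

Separate fits also allow the residual variance to differ by year, whereas a single pooled model with year interactions gives the same coefficients but assumes a common variance.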

David Hoaglin

On Fri, May 24, 2013 at 9:08 PM, DU,Lucy <Lucy.Du@deewr.gov.au> wrote:
> Unofficial
> Hi All
>
> I've been working on this research question using a panel data set and am having difficulties interpreting my Stata output.
>
> Dataset: airfare.dta available on http://www.stata.com/texts/eacsap/.
>
> Research question: How certain key variables affect airfares in the U.S. market. In the near future
>
> I ran a pooled OLS regression: regress lfare ldist ldistsq y98 y99 y00
>
> Where:
> lfare - log transformed airfare variable
> ldist - log transformed distance variable
> ldistsq - (ldist)^2
> y98, y99, y00 - year dummy variables
>
> I understand how to interpret coefficients under a log-log transformed model, and coefficients where it's a quadratic model, but when it's a quadratic log transformed variable I'm completely stuck!
>
> My output is as follows:
>
> . regress lfare ldist ldistsq y98 y99 y00
>
>       Source |       SS       df       MS              Number of obs =    4596
> -------------+------------------------------           F(  5,  4590) =  581.09
>        Model |  339.211826     5  67.8423653           Prob > F      =  0.0000
>     Residual |  535.882547  4590   .11675001           R-squared     =  0.3876
> -------------+------------------------------           Adj R-squared =  0.3870
>        Total |  875.094374  4595  .190444913           Root MSE      =  .34169
>
> ------------------------------------------------------------------------------
>        lfare |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        ldist |   -.783627   .1298635    -6.03   0.000    -1.038222   -.5290322
>      ldistsq |   .0897726   .0098112     9.15   0.000     .0705379    .1090072
>          y98 |    .024341   .0142555     1.71   0.088    -.0036067    .0522887
>          y99 |   .0350861   .0142555     2.46   0.014     .0071384    .0630338
>          y00 |   .0959191   .0142555     6.73   0.000     .0679714    .1238668
>        _cons |   6.239633   .4270934    14.61   0.000     5.402325    7.076942
> ------------------------------------------------------------------------------
>
> Can someone explain how I interpret the coefficients for ldist and ldistsq?