
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Relative Importance of predictors in regression


From   David Hoaglin <[email protected]>
To   [email protected]
Subject   Re: st: Relative Importance of predictors in regression
Date   Tue, 5 Nov 2013 11:58:05 -0500

Dear Jorge,

Thank you for the clarification.  Being able to get all the necessary
partial sums of squares from a single command is a big help.

If we look at the regression coefficients, the step
reg lhs rhs
nicely illustrates the point that I have been making about interpretation.

-reg lhs rhs- will give the partial SS for mpg, but the residual MS from
that command may not have the correct degrees of freedom, because -reg-
does not know about the degrees of freedom that have been partialled
out of price.
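
A minimal sketch of the degrees-of-freedom point, assuming the auto data and the model from the quoted example (the scalar names are only illustrative). The residual SS from the partialled-out regression should agree with that of the full regression, but its residual df is one larger, because -regress- does not count the coefficient on trunk that was estimated in the first step.

```stata
sysuse auto, clear

* Partial trunk out of both price and mpg
quietly regress price trunk
predict double lhs, residuals
quietly regress mpg trunk
predict double rhs, residuals

* Regression on the residuals: residual SS matches the full model
quietly regress lhs rhs
scalar rss_partial = e(rss)
scalar df_partial  = e(df_r)      // N - 2: trunk is not counted

* Full regression for comparison
quietly regress price mpg trunk
scalar rss_full = e(rss)
scalar df_full  = e(df_r)         // N - 3

display "residual SS: " rss_partial "  vs  " rss_full
display "residual df: " df_partial  "  vs  " df_full
display "residual MS: " rss_partial/df_partial "  vs  " rss_full/df_full
```

Dividing the residual SS by the full-model df, rather than the df that -reg lhs rhs- reports, gives the residual MS that matches the full regression.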

In deriving the part of R-squared attributed to a variable, you need
to use the same denominator as R-squared itself uses.
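
To make the denominator point concrete, here is a sketch under the same assumptions (auto data, the model from the quoted message; tss is a name introduced here). Dividing a partial SS by the total SS puts it on the same scale as R-squared. Dividing by (partial SS + residual SS) gives a different quantity, usually called the partial R-squared, which cannot be smaller.

```stata
sysuse auto, clear
quietly anova price c.mpg c.trunk

* Total SS: the denominator that R-squared itself uses
scalar tss = e(mss) + e(rss)

* Partial R-squared for mpg (denominator excludes what trunk explains)
display e(ss_1)/(e(ss_1) + e(rss))

* Partial SS for mpg as a share of the total SS
display e(ss_1)/tss

* R-squared, for reference: model SS over total SS
display e(r2)
display e(mss)/tss
```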

Setting aside the example and the calculations for a moment, what is
the definition of "the shared variance of the [in]dependent
variables"?

Regards,

David Hoaglin

On Tue, Nov 5, 2013 at 11:00 AM, Jorge Eduardo Pérez Pérez
<[email protected]> wrote:
> I should have been more specific, sorry. Ignore the SJ paper.
>
> Analysis of variance with continuous covariates and regression are
> general linear models. All these models are equivalent:
>
> sysuse auto, clear
> reg price mpg trunk
> anova price c.mpg c.trunk
> glm price mpg trunk, family(gaussian) link(identity)
>
>
> The anova view of the model will yield the partial sums of squares
> attributed to each regressor. In regression vocabulary, this would be
> the model sum of squares of regressing the dependent variable on each
> one of the independent variables, after partialling out the remaining
> variables. For example, the MS attributed to mpg in the previous
> regression could be obtained as follows, by first removing the
> influence of trunk from both variables:
>
> reg price trunk
> predict lhs, resid
> reg mpg trunk
> predict rhs, resid
> reg lhs rhs
>
> From the anova view, dividing the partial SS of each variable over the
> sum of the partial SS of the variable and the residual will give you
> the part of the R squared attributed to that variable. It will be the
> same as the R squared of the "partialled out" regression I showed
> before.
> The remainder will be the part attributed to the shared variance of
> the dependent variables.
>
> * Part attributed to mpg
> di e(ss_1)/(e(ss_1)+e(rss))
> * Part attributed to trunk
> di e(ss_2)/(e(ss_2)+e(rss))
> * Remainder (shared variance)
> di e(r2) -  e(ss_1)/(e(ss_1)+e(rss)) - e(ss_2)/(e(ss_2)+e(rss))
>
> My suggestion from the anova paper is that the anova view calculates
> all of the sums of squares at once, instead of having to calculate
> the partial regression sums of squares one by one.
>
> -----------------------------------------
> Jorge Eduardo Pérez Pérez
> Graduate Student
> Department of Economics
> Brown University
