Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Relative Importance of predictors in regression
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: Relative Importance of predictors in regression
Date: Tue, 5 Nov 2013 11:58:05 -0500
Dear Jorge,
Thank you for the clarification. Being able to get all the necessary
partial sums of squares from a single command is a big help.
If we look at the regression coefficients, the step
reg lhs rhs
nicely illustrates the point that I have been making about interpretation.
-reg lhs rhs- will give the partial SS for mpg, but the MS from that
command may not have the correct degrees of freedom, because -reg-
does not know about the degrees of freedom that have been partialled
out of price.
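A minimal sketch of the degrees-of-freedom point, assuming the auto data and the variables from the quoted example (the scalar names ss_fwl and df_fwl are made up for illustration). It stores the model SS and residual df from the residual-on-residual regression and sets them beside what -anova- reports for the full model:

```stata
sysuse auto, clear

* Partial out trunk from both price and mpg (double precision for the residuals)
quietly reg price trunk
predict double lhs, resid
quietly reg mpg trunk
predict double rhs, resid

* Residual-on-residual regression: keep its model SS and residual df
quietly reg lhs rhs
scalar ss_fwl = e(mss)
scalar df_fwl = e(df_r)

* Full model: partial SS for mpg (term 1) and the full-model residual df
quietly anova price c.mpg c.trunk
di "partial SS for mpg:  anova = " e(ss_1) "   reg lhs rhs = " ss_fwl
di "residual df:         anova = " e(df_r) "   reg lhs rhs = " df_fwl
```

The two sums of squares should agree, while the residual df from -reg lhs rhs- should be one larger than in the full model, because that regression estimates only a constant and one slope.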
In deriving the part of R-squared attributed to a variable, you need
to use the same denominator as R-squared itself uses.
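As a sketch of what using the same denominator could look like, assuming the same auto example (the scalar name tss is made up for illustration): R-squared divides the model SS by the total SS, so each partial SS can be divided by that same total rather than by partial SS plus residual SS.

```stata
sysuse auto, clear
quietly anova price c.mpg c.trunk

* Total SS: the denominator that R-squared itself uses
scalar tss = e(mss) + e(rss)

* Denominator from the quoted message: partial SS + residual SS
di "mpg, partial SS/(partial SS + RSS): " e(ss_1)/(e(ss_1) + e(rss))

* Same denominator as R-squared
di "mpg, partial SS/total SS:           " e(ss_1)/tss
di "trunk, partial SS/total SS:         " e(ss_2)/tss

* What is left of R-squared after both parts on the common denominator
di "remainder:                          " e(r2) - e(ss_1)/tss - e(ss_2)/tss
```

The first ratio is the squared partial correlation; it differs from the second whenever trunk alone accounts for some of the variation in price.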
Setting aside the example and the calculations for a moment, what is
the definition of "the shared variance of the [in]dependent
variables"?
Regards,
David Hoaglin
On Tue, Nov 5, 2013 at 11:00 AM, Jorge Eduardo Pérez Pérez
<[email protected]> wrote:
> I should have been more specific, sorry. Ignore the SJ paper.
>
> Analysis of variance with continuous covariates and regression are
> general linear models. All these models are equivalent:
>
> sysuse auto, clear
> reg price mpg trunk
> anova price c.mpg c.trunk
> glm price mpg trunk, family(gaussian) link(identity)
>
>
> The anova view of the model will yield the partial sums of squares
> attributed to each regressor. In regression vocabulary, this would be
> the model sum of squares of regressing the dependent variable on each
> one of the independent variables, after partialling out the remaining
> variables. For example, the MS attributed to mpg in the previous
> regression could be obtained as follows, by first removing the
> influence of trunk from both variables:
>
> reg price trunk
> predict lhs, resid
> reg mpg trunk
> predict rhs, resid
> reg lhs rhs
>
> From the anova view, dividing the partial SS of each variable over the
> sum of the partial SS of the variable and the residual will give you
> the part of the R squared attributed to that variable. It will be the
> same as the R squared of the "partialled out" regression I showed
> before.
> The remainder will be the part attributed to the shared variance of
> the dependent variables.
>
> * Part attributed to mpg
> di e(ss_1)/(e(ss_1)+e(rss))
> * Part attributed to trunk
> di e(ss_2)/(e(ss_2)+e(rss))
> * Remainder (shared variance)
> di e(r2) - e(ss_1)/(e(ss_1)+e(rss)) - e(ss_2)/(e(ss_2)+e(rss))
>
> My suggestion from the anova paper is that the anova view calculates
> all of the sums of squares at once, instead of having to calculate
> the partial regression sums of squares one by one.
>
> -----------------------------------------
> Jorge Eduardo Pérez Pérez
> Graduate Student
> Department of Economics
> Brown University