Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Relative Importance of predictors in regression
Date: Tue, 5 Nov 2013 11:58:05 -0500

Dear Jorge,

Thank you for the clarification. Being able to get all the necessary partial sums of squares from a single command is a big help.

If we look at the regression coefficients, the step -reg lhs rhs- nicely illustrates the point that I have been making about interpretation.

-reg lhs rhs- will give the partial SS for mpg, but the MS from that command may not have the correct degrees of freedom, because -reg- does not know about the degrees of freedom that have been partialled out of price.

In deriving the part of R-squared attributed to a variable, you need to use the same denominator as R-squared itself uses.

Setting aside the example and the calculations for a moment, what is the definition of "the shared variance of the [in]dependent variables"?

Regards,

David Hoaglin

On Tue, Nov 5, 2013 at 11:00 AM, Jorge Eduardo Pérez Pérez <jorge_perez@brown.edu> wrote:
> I should have been more specific, sorry. Ignore the SJ paper.
>
> Analysis of variance with continuous covariates and regression are
> general linear models. All these models are equivalent:
>
> sysuse auto, clear
> reg price mpg trunk
> anova price c.mpg c.trunk
> glm price mpg trunk, family(gaussian) link(identity)
>
> The anova view of the model will yield the partial sums of squares
> attributed to each regressor. In regression vocabulary, this would be
> the model sum of squares of regressing the dependent variable on each
> one of the independent variables, after partialling out the remaining
> variables. For example, the MS attributed to mpg in the previous
> regression could be obtained as follows, by first removing the
> influence of trunk from both variables:
>
> reg price trunk
> predict lhs, resid
> reg mpg trunk
> predict rhs, resid
> reg lhs rhs
>
> From the anova view, dividing the partial SS of each variable by the
> sum of the partial SS of the variable and the residual SS will give you
> the part of the R squared attributed to that variable.
> It will be the same as the R squared of the "partialled out"
> regression I showed before.
> The remainder will be the part attributed to the shared variance of
> the dependent variables.
>
> * Part attributed to mpg
> di e(ss_1)/(e(ss_1)+e(rss))
> * Part attributed to trunk
> di e(ss_2)/(e(ss_2)+e(rss))
> * Remainder (shared variance)
> di e(r2) - e(ss_1)/(e(ss_1)+e(rss)) - e(ss_2)/(e(ss_2)+e(rss))
>
> My suggestion from the anova paper is that the anova view calculates
> all of the sums of squares at once, instead of having to calculate
> the partial regression sums of squares one by one.
>
> -----------------------------------------
> Jorge Eduardo Pérez Pérez
> Graduate Student
> Department of Economics
> Brown University
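
The degrees-of-freedom point and the choice of denominator can both be checked on the auto data used in the quoted commands. What follows is a minimal sketch and not part of either original message. The scalar names ss_mpg, rss, df_r, and tss are made up for illustration, and reading "the same denominator as R-squared itself uses" as the total sum of squares is an assumption.

```stata
* Sketch only: scalar names below are illustrative, not from the thread
sysuse auto, clear

* Full model via -anova-: partial SS for both regressors in one command
anova price c.mpg c.trunk
scalar ss_mpg = e(ss_1)          // partial SS for mpg
scalar rss    = e(rss)           // residual SS of the full model
scalar df_r   = e(df_r)          // residual df of the full model: 74 - 3 = 71
scalar tss    = e(mss) + e(rss)  // total SS (assumed denominator of R-squared)

* Partialled-out regression, as in the quoted message
quietly regress price trunk
predict double lhs, resid
quietly regress mpg trunk
predict double rhs, resid
regress lhs rhs

* Model SS and residual SS agree with the full model ...
display e(mss) "  " ss_mpg
display e(rss) "  " rss
* ... but -regress- counts 74 - 2 = 72 residual df, not 71,
* so any MS, F, or standard error from this step uses the wrong df
display e(df_r) "  " df_r

* Two candidate "shares" for mpg
display ss_mpg/(ss_mpg + rss)    // partial R-squared: denominator differs by variable
display ss_mpg/tss               // share of total SS: same denominator as R-squared
```

The second share is smaller than the first whenever the other regressor accounts for part of the total SS on its own, which is one way to see why the two denominators answer different questions.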