


Re: st: adjusted r-squared, regress with pweight

From   Steve Samuels
Subject   Re: st: adjusted r-squared, regress with pweight
Date   Thu, 13 May 2010 10:14:23 -0400

Okay, I think that I've figured it out, and I apologize for the
confusion. The adjusted R-square computed by -reg [pw]- corrects
the weighted estimates of the MSE and population variance by the same
degrees-of-freedom corrections that would be appropriate for OLS
regression on a sample of the same size. For the auto example with two
covariates and one intercept, n = 69, and the corrections to MSE and
variance are (69/66) and (69/68), respectively. With these corrections,
adjusted R-square = 0.6188, the value given in e(r2_a).
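
The arithmetic can be written out as a rough check. The symbols R^2 and \bar{R}^2 (unadjusted and adjusted R-square) are notation added here, and the inputs are the rounded values quoted in this thread (R-square = 0.6300, n = 69), so the last digit is only approximate:

```latex
\[
\bar{R}^2
  = 1 - (1 - R^2)\,\frac{69/66}{69/68}
  = 1 - (1 - R^2)\,\frac{68}{66}
  = 1 - 0.3700 \times \frac{68}{66}
  \approx 0.6188
\]
```

This agrees with the e(r2_a) value of 0.6188 listed for -reg [pw]- in the quoted results further down the thread.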

These can be interpreted as follows: the unadjusted and adjusted
R-squares are estimates of those that would have been reported if one
had done OLS on a simple random sample (SRS) of n = 69. Adjusted
R-square is not, contrary to my original belief, a "population"
estimate of anything.
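
A minimal sketch of how one might check this from the stored results, assuming the same auto example and weights used in the quoted code below. It uses only the standard -regress- results e(r2), e(N), e(df_m) and e(r2_a). The hand formula is the usual OLS degrees-of-freedom adjustment, not something taken from Stata's documentation of the pweight case:

```stata
sysuse auto, clear
regress mpg length trunk [pw=rep78]

* OLS-style adjustment: 1 - (1 - R2)*(n - 1)/(n - k - 1), with k = e(df_m)
display 1 - (1 - e(r2))*(e(N) - 1)/(e(N) - e(df_m) - 1)

* value stored by Stata, for comparison
display e(r2_a)
```

If the explanation above is right, the two displayed numbers should agree.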


On Thu, May 13, 2010 at 9:33 AM, Steve Samuels wrote:
> I'm going to withdraw my conclusion that the adjusted R-square from
> -reg [pw]- is incorrect, until I can figure out how Stata calculates
> it. I think that my hand calculation may be incorrect because the
> population definition of "mean square error" is not as clear to me as
> it was some months ago when I did it. This just reinforces Stas's
> conclusion that these concepts are not too meaningful in a complex
> survey setting.
> Steve
> On Thu, May 13, 2010 at 8:59 AM, Steve Samuels wrote:
>> I think that the adjusted R-square reported after -reg- with [pweight]
>> is in error and that the displayed R-square is, in fact, adjusted
>> R-square. I ran three weighted regressions (code below).
>> I also directly calculated the adjusted R-square from -svy: reg- from
>> the weighted estimates of mean square error Ve and population variance
>> V: adjusted R-square = 1 - Ve/V. (I agree with Stas that this has
>> little practical value when data are heteroskedastic and clustered--it
>> refers to
>> The results were:
>>              Displayed R-square   Adjusted R-square
>> reg [pw]     0.6300               0.6188 (e(r2_a))
>> reg [fw]     0.6300               0.6268 (displayed)
>> svy: reg     0.6300               0.6300 (direct)
>> ************CODE*****************
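>> sysuse auto, clear
>> * pweights: adjusted R-square is not displayed, so show e(r2_a)
>> reg mpg length trunk [pw=rep78]
>> di e(r2_a)   // adjusted R-square
>> * fweights: adjusted R-square appears in the regression output
>> reg mpg length trunk [fw=rep78]
>> * survey version with the same weights
>> svyset _n [pweight=rep78]
>> svy: reg mpg length trunk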
>> **********************************
>> Steve
>> --Stas Kolenikov to statalist
>> Yes, David, it was asked before a number of times :)). Sum of squares
>> and all that ANOVA stuff assumes the normal regression model (i.e.,
>> the regression errors follow N(0,sigma^2) distribution). pweights
>> imply a probability sampling design, under which no distributional
>> assumptions are made, so the ANOVA table is inappropriate. You can
>> still compute all the sums of squares, of course, but they may not
>> have readily available population analogues; and the distributional
>> results for F-tests do not have the exact finite sample interpretation
>> anymore (although you'd still be able to get asymptotic Wald tests, I
>> imagine).
>> Likewise, you should not expect these things to show up when you
>> specify -robust- or -cluster- standard errors -- you know your data
>> are heteroskedastic, so why on earth would you ask for some sort of
>> averaged variance?
>> Steven Samuels
>> 18 Cantine's Island
>> Saugerties NY 12477
>> USA
>> Voice: 845-246-0774
>> Fax:    206-202-4783


