Re: st: adjusted r-squared, regress with pweight


From   Steve Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: adjusted r-squared, regress with pweight
Date   Thu, 13 May 2010 08:59:07 -0400

I think that the adjusted R-square reported after -reg- with [pweight]
is in error and that the displayed R-square is, in fact, the adjusted
R-square. I ran three weighted regressions (code below).

I also directly calculated the adjusted R-square from svy: reg, using
the weighted estimates of mean square error Ve and population variance
V: adjusted R-square = 1 - Ve/V. (I agree with Stas that this has
little practical value when data are heteroskedastic and clustered--it
refers to

The results were:

             Displayed R-square   Adjusted R-square
reg [pw]     0.6300               0.6188 (e(r2_a))
reg [fw]     0.6300               0.6268 (displayed)
svy: reg     0.6300               0.6300 (direct)
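
One possible reading of the gap between the two adjusted figures, offered as a sketch and not as a statement of what Stata does internally: the textbook adjustment 1 - (1 - R2)(n - 1)/(n - k - 1) gives different answers depending on whether n is the number of rows or the sum of the weights. The snippet below assumes that e(N) is the row count after [pw] and the weight total after [fw]; the formula is the standard one and does not come from the post itself.

```stata
sysuse auto, clear

* pweights: e(N) is assumed to be the number of rows used
quietly reg mpg length trunk [pw=rep78]
di "pw: N = " e(N) "  formula adj R2 = " ///
    %6.4f 1 - (1 - e(r2))*(e(N) - 1)/(e(N) - e(df_m) - 1)

* fweights: e(N) is assumed to be the sum of the weights
quietly reg mpg length trunk [fw=rep78]
di "fw: N = " e(N) "  formula adj R2 = " ///
    %6.4f 1 - (1 - e(r2))*(e(N) - 1)/(e(N) - e(df_m) - 1)
```

If the assumption about e(N) holds, the two lines should land close to the [pw] and [fw] adjusted values in the table, which would make the difference a matter of which n goes into the correction.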

************CODE*****************
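sysuse auto, clear
* pweights: R-square is displayed; adjusted R-square is only in e(r2_a)
reg mpg length trunk [pw=rep78]
di e(r2_a)   // adjusted r-square
* fweights: both R-square and adjusted R-square are displayed
reg mpg length trunk [fw=rep78]

* survey design with each observation as its own sampling unit
svyset _n [pweight=rep78]
svy: reg mpg length trunk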
**********************************
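
The direct calculation 1 - Ve/V is not shown in the code above. A minimal sketch of one way it might be done follows, assuming Ve is the weighted mean of the squared residuals and V is the weighted mean squared deviation of mpg from its weighted mean, with no degrees-of-freedom correction in either. The variable names xb, res2 and dev2 and the scalar names are made up for illustration, and this may not be the exact calculation used for the table.

```stata
sysuse auto, clear
svyset _n [pweight=rep78]
svy: reg mpg length trunk

* Ve: weighted mean of squared residuals (no df correction, by assumption)
predict double xb if e(sample), xb
generate double res2 = (mpg - xb)^2
summarize res2 [aweight=rep78] if e(sample), meanonly
scalar Ve = r(mean)

* V: weighted mean squared deviation of mpg from its weighted mean
summarize mpg [aweight=rep78] if e(sample), meanonly
generate double dev2 = (mpg - r(mean))^2 if e(sample)
summarize dev2 [aweight=rep78] if e(sample), meanonly
scalar V = r(mean)

di "1 - Ve/V = " %6.4f 1 - scalar(Ve)/scalar(V)
```

Under these assumptions the ratio of weighted sums is the same quantity as the weighted R-square, so the result can be compared directly with the displayed R-square from svy: reg.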

Steve

--Stas Kolenikov to statalist
Yes, David, it was asked before a number of times :)). Sum of squares
and all that ANOVA stuff assumes the normal regression model (i.e.,
the regression errors follow N(0,sigma^2) distribution). pweights
imply a probability sampling design, under which no distributional
assumptions are made, so the ANOVA table is inappropriate. You can
still compute all the sums of squares, of course, but they may not
have readily available population analogues; and the distributional
results for F-tests do not have the exact finite sample interpretation
anymore (although you'd still be able to get asymptotic Wald tests, I
imagine).

Likewise, you should not expect these things to show up when you
specify -robust- or -cluster- standard errors -- you know your data
are heteroskedastic, so why on earth would you ask for some sort of
averaged variance?

Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783