

From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: adjusted r-squared, regress with pweight
Date: Thu, 13 May 2010 08:59:07 -0400

I think that the adjusted R-square reported after -reg- with [pweight] is in error and that the displayed R-square is, in fact, the adjusted R-square. I ran three weighted regressions (code below). I also directly calculated the adjusted R-square from -svy: reg-, using the weighted estimates of mean square error Ve and population variance V: adjusted R-square = 1 - Ve/V. (I agree with Stas that this has little practical value when data are heteroskedastic and clustered--it refers to

The results were:

            Displayed R-square   Adjusted R-square
reg [pw]         0.6300              0.6188  (e(r2_a))
reg [fw]         0.6300              0.6268  (displayed)
svy: reg         0.6300              0.6300  (direct)

************CODE*****************
sysuse auto, clear
reg mpg length trunk [pw=rep78]
di e(r2_a)   // adjusted R-square
reg mpg length trunk [fw=rep78]
svyset _n [pweight=rep78]
svy: reg mpg length trunk
**********************************

Steve

--Stas Kolenikov to statalist

Yes, David, it was asked before a number of times :)). Sum of squares and all that ANOVA stuff assumes the normal regression model (i.e., the regression errors follow a N(0, sigma^2) distribution). pweights imply a probability sampling design, under which no distributional assumptions are made, so the ANOVA table is inappropriate. You can still compute all the sums of squares, of course, but they may not have readily available population analogues; and the distributional results for F-tests do not have the exact finite-sample interpretation anymore (although you'd still be able to get asymptotic Wald tests, I imagine).

Likewise, you should not expect these things to show up when you specify -robust- or -cluster- standard errors -- you know your data are heteroskedastic, so why on earth would you ask for some sort of averaged variance?
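A possible cross-check on the two -reg- rows of the results: the sketch below recomputes the adjusted R-square by hand with the usual degrees-of-freedom formula, 1 - (1 - R-square)*(N - 1)/(N - k - 1), and prints it next to e(r2_a). It assumes that e(N) is the number of observations under pweights and the sum of the weights under fweights, and that e(df_r) equals N - k - 1. If that holds, the difference between 0.6188 and 0.6268 would come only from the N that each weight type implies. The scalar names r2a_pw and r2a_fw are made up for this illustration.

```stata
sysuse auto, clear

* pweights: recompute adjusted R-square from the stored results
quietly regress mpg length trunk [pw=rep78]
scalar r2a_pw = 1 - (1 - e(r2))*(e(N) - 1)/e(df_r)
display "pw: N = " e(N) "  by hand = " %6.4f r2a_pw "  e(r2_a) = " %6.4f e(r2_a)

* fweights: same formula; here e(N) is assumed to be the sum of rep78
quietly regress mpg length trunk [fw=rep78]
scalar r2a_fw = 1 - (1 - e(r2))*(e(N) - 1)/e(df_r)
display "fw: N = " e(N) "  by hand = " %6.4f r2a_fw "  e(r2_a) = " %6.4f e(r2_a)
```

This only compares the stored quantities with the textbook formula; it does not settle which of them, if any, is a sensible population quantity under a sampling design.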
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477 USA
Voice: 845-246-0774
Fax: 206-202-4783
