The Stata listserver


From   Richard Williams <[email protected]>
To   [email protected]
Subject   Re: st: R-SQUARED AND XTGEE
Date   Tue, 28 Oct 2003 21:35:23 -0500

At 01:46 AM 10/29/2003 +0000, Clive Nicholas wrote:

> (a) Whatever is judged to be the 'best' measure of R^2, one *must* keep in
> mind that (i) high levels of intercorrelation between X-variables inflate
> R^2 to artificially high levels; and (ii) models deploying aggregate-level
> data with large spatial units of analysis inevitably have knock-on
> (upward) effects on R^2, regardless of its measurement;

I'm not sure I understand (a)(i) -- two Xs could be perfectly correlated with each other, and yet both could have zero correlation with Y. Can you elaborate or give an example?
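
A small simulation makes the point about (a)(i) concrete. The sketch below is illustrative only and is not from the original thread: it assumes two predictors built to be highly (not perfectly) correlated, so that the regression is still estimable, and a Y drawn independently of both. The variable names, sample size, noise scale, and seed are all arbitrary assumptions. R^2 is computed from the standard two-predictor formula in terms of pairwise correlations.

```python
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def r_squared_two_predictors(y, x1, x2):
    """R^2 from OLS of y on x1 and x2 (with intercept), via correlations."""
    r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

rng = random.Random(12345)   # arbitrary seed, for reproducibility only
n = 5000                     # arbitrary sample size

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + 0.3 * rng.gauss(0, 1) for a in x1]   # strongly correlated with x1
y = [rng.gauss(0, 1) for _ in range(n)]        # unrelated to x1 and x2

r12 = corr(x1, x2)
r2 = r_squared_two_predictors(y, x1, x2)

print(f"corr(x1, x2) = {r12:.3f}")
print(f"R^2 of y on x1, x2 = {r2:.4f}")
```

Under these assumptions the Xs are strongly intercorrelated while R^2 stays near zero, which suggests that correlation among the Xs does not, by itself, push R^2 up.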

> (b) Why should *anybody* attempt to build a regression model that hopes to
> produce an R^2 of 100%? Anybody with half a brain on these matters will
> tell you that if your model has yielded a 'perfect' R^2, something is
> wrong (probably multicollinearity among two or three X-variables). When
> will people learn to love *low* levels of R^2? Low levels mean there is
> more to explain, and thus stretch our academic imaginations by providing
> us with more challenges as to what the missing key factors might be.

I agree that I would certainly be suspicious of a perfect R^2. But there may not be anything else to explain -- it could just be that some percentage of what happens in the world is due to random, chance factors. Also, while you are correct in saying that in practice there will always be more to explain, an implication of that is that our models inevitably suffer from omitted variable bias -- which probably means that not only have we failed to consider the effects of variables not included in the model, we have likely mis-estimated the effects of the variables we do have. So I think an ideal goal is to make R^2 as high as it should be, but no higher, i.e. get a perfectly specified model, and if that produces an R^2 of .10 then so be it. If by some wild chance I ever did explain all the variability in a variable, I'm sure I could find some new variable to move on to, so I wouldn't be too worried about running out of challenges!
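
Both halves of that argument can be sketched with simulated data. The example below is a hypothetical illustration, not something from the original thread: it assumes a true model y = 1.0*x1 + 1.0*x2 + e, with x2 correlated with x1 and a large error variance. All coefficients, the noise scale, the sample size, and the seed are made-up assumptions. It compares the slope on x1 from the correctly specified regression with the slope from a regression that omits x2, and reports R^2 for the correct model.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))

rng = random.Random(2003)   # arbitrary seed, for reproducibility only
n = 50_000                  # arbitrary sample size

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]          # correlated with x1
# Assumed true model: both slopes are 1.0, error s.d. is 5 (mostly noise).
y = [1.0 * a + 1.0 * b + rng.gauss(0, 5) for a, b in zip(x1, x2)]

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)

# Misspecified model: y on x1 only (x2 omitted).
b1_short = s1y / s11

# Correctly specified model: y on x1 and x2 (closed-form OLS, with intercept).
det = s11 * s22 - s12 ** 2
b1_full = (s22 * s1y - s12 * s2y) / det
b2_full = (s11 * s2y - s12 * s1y) / det

# R^2 of the correct model = explained sum of squares / total sum of squares.
r2_full = (b1_full * s1y + b2_full * s2y) / syy

print(f"slope on x1, x2 omitted:  {b1_short:.3f}")
print(f"slope on x1, full model:  {b1_full:.3f}")
print(f"R^2 of the full model:    {r2_full:.3f}")
```

With these assumed values, the short regression's slope on x1 should drift toward 1.5 (the true 1.0 plus 1.0 times the 0.5 slope linking x2 to x1), while the correctly specified model should recover a slope near 1.0 even though its R^2 is only in the neighborhood of .10.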

> If only social scientists, psychologists and economists alike simply
> focused on the theoretical and empirical validity and reliability of their
> variables and modelled social reality as accurately as possible in order
> to test theories about human behaviour, then this would tell us more than
> what R^2 tells us about *anything!* :-)

I agree with that. The goal is correct model specification, and R^2 may tell you little about how well you have met that goal. But if you do all these things, you may find that a nice large R^2 comes along as an added bonus.

Richard Williams, Associate Professor
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: [email protected]
