


Re: st: comparing coefficients across 2 models


From   "JVerkuilen (Gmail)" <jvverkuilen@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: comparing coefficients across 2 models
Date   Mon, 26 Nov 2012 11:43:41 -0500

On Mon, Nov 26, 2012 at 5:34 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>Moreover, multicollinearity
>(high VIFs) is not a problem, it is just a description of an
>unfortunate state of the world, or at least, your data.

It is no longer a numerical problem, as it was in the days before
QR-based least-squares solvers became ubiquitous. However, it inflates
the standard errors of the regression coefficients and makes them
unstable, especially if the X variables are parameterized in a
non-optimal way.
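To make the standard-error point concrete, here is a small sketch (my illustration in Python, not from the thread): the variance inflation factor for predictor j is VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing x_j on the other predictors, and the standard error of b_j grows by a factor of sqrt(VIF_j). The variable names and simulated data are mine.

```python
import numpy as np

def vif(X, j):
    """VIF of column j of a design matrix X (no intercept column)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    # Least squares via a QR/SVD-backed solver, as modern software uses
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 / (1 - r2)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)              # independent of the others
X = np.column_stack([x1, x2, x3])

print(vif(X, 0))   # large: x1 is nearly redundant given x2
print(vif(X, 2))   # close to 1: x3 adds independent information
```

The point is that the coefficients on x1 and x2 individually become very imprecise, even though their joint contribution to fit is unaffected.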



>Having said that, centering is usually an excellent idea; it often
>helps interpretability of results and reduces numerical problems, so
>that is a win-win situation.

Agreed, it is almost always worth it to center all continuous
predictors. I'd even argue that centering categorical predictors is
often useful, though that is done less often. This is especially true
for variables that are not of primary substantive interest.
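A quick sketch of why centering helps (my illustration, not from the thread): for a predictor measured far from zero, x and x² are almost perfectly collinear, while the centered versions are nearly uncorrelated. The example variable is hypothetical.

```python
import numpy as np

x = np.linspace(50, 100, 101)          # e.g. an age-like variable far from 0
r_raw = np.corrcoef(x, x**2)[0, 1]     # nearly 1: x and x**2 almost collinear

xc = x - x.mean()                      # centered copy
r_cen = np.corrcoef(xc, xc**2)[0, 1]   # ~0 when x is symmetric about its mean

print(r_raw, r_cen)
```

The same logic applies to interaction terms: centering the components before multiplying keeps the main effects interpretable as effects at the mean rather than at zero.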



>Orthogonalising variables can reduce
>numerical problems, but often makes it harder to interpret the
>results, so that is a trade-off you need to make.

There are good reasons to orthogonalize and bad ones. A clear pro
example: you have a time variable and want to fit polynomial terms up
to the quartic. You are almost always better off turning these into a
set of orthogonal polynomials. (Better still, fit some other flexible
basis, such as regression splines, which are usually specified
orthogonally.)
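One way to see what orthogonalizing the polynomial buys you, sketched in Python rather than Stata (Stata's -orthpoly- does the analogous thing; here I use a QR decomposition of the raw power basis, and the setup is my own assumption):

```python
import numpy as np

t = np.arange(1.0, 21.0)                        # a time index, 1..20
V = np.column_stack([t**k for k in range(5)])   # raw basis: 1, t, t², t³, t⁴
Q, _ = np.linalg.qr(V)                          # orthonormal basis, same span

# Raw powers are nearly collinear...
print(np.corrcoef(t**3, t**4)[0, 1])            # very close to 1
# ...while the orthogonalized columns are orthonormal by construction
print(np.abs(Q.T @ Q - np.eye(5)).max())        # essentially 0
```

Fitted values are identical under either basis (the columns span the same space); the gain is that the coefficients and their standard errors are no longer entangled with each other.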

The clear con example is running principal components on an X matrix
of observed variables merely because they suffer from collinearity.
This just leads to uninterpretable models that probably don't
generalize out of sample.



>Estimating separate
>models as a "solution" to solve multicollinearity is a horrible idea;
>it does not make the multicollinearity go away, it just makes it harder
>to detect.

Agreed. Dropping predictors this way also changes the model, so the
remaining coefficients no longer estimate the same quantities.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/