
# Re: st: comparing coefficients across 2 models

From: "JVerkuilen (Gmail)"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: comparing coefficients across 2 models
Date: Mon, 26 Nov 2012 11:43:41 -0500

```
On Mon, Nov 26, 2012 at 5:34 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>Moreover, multicollinearity
>(high VIFs) is not a problem, it is just a description of an
>unfortunate state of the world, or at least, your data.

It is not a numerical problem like it was in the days before the
ubiquity of the QR decomposition. However, it muddies the estimated
relationships and inflates the standard errors of the regression
coefficients, especially if the X variables are set up in a
non-optimal way.

>Having said that, centering is usually an excellent idea; it often
>helps interpretability of results and reduces numerical problems, so
>that is a win-win situation.

Agreed, almost always worth it to center all continuous predictors.
I'd even argue that centering categorical predictors is often useful,
though that's not done as often. This is especially true for variables
not primarily of substantive interest.

>Orthogonalising variables can reduce
>numerical problems, but often makes it harder to interpret the
>results, so that is a trade-off you need to make.

There are good reasons to orthogonalize and bad ones. Let me throw out
a clear pro example: You have a time variable and want to fit
polynomial terms up to quartic. You would always be better off turning
this into a set of orthogonal polynomials. (You would be even better
off fitting some other flexible basis, such as regression splines,
which are usually specified orthogonally.)

The clear con example would be to use principal components on an X
matrix of purely observed variables suffering from collinearity. This
just leads to uninterpretable models that probably don't generalize
out of sample.

>Estimating separate
>models as a "solution" to solve multicollinearity is a horrible idea;
>it does not make the multicollinearity go away, it just makes it harder
>to detect.
>to detect.

This ends up being a different model too.
```
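The centering and orthogonal-polynomial points in the reply above can be illustrated with a small sketch. It is not from the original thread: the 10-wave time variable and the helper names `corr` and `orthogonal_polynomials` are made up for illustration, and it uses plain Gram-Schmidt in Python rather than Stata (where `orthpoly` would probably be the usual tool).

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    return dot(du, dv) / math.sqrt(dot(du, du) * dot(dv, dv))

def orthogonal_polynomials(t, degree):
    """Orthonormalize the columns 1, t, ..., t^degree by Gram-Schmidt.

    Returns degree + 1 columns; the first is the (scaled) constant term.
    """
    basis = []
    for p in range(degree + 1):
        col = [x ** p for x in t]
        # Remove the part of this power already explained by lower-order terms.
        for q in basis:
            c = dot(col, q)
            col = [a - c * b for a, b in zip(col, q)]
        norm = math.sqrt(dot(col, col))
        basis.append([a / norm for a in col])
    return basis

# Hypothetical time variable: 10 equally spaced measurement occasions.
t = list(range(1, 11))

# Raw time and its square are almost collinear (correlation about 0.97).
raw_corr = corr(t, [x ** 2 for x in t])

# Centering time first removes the linear-quadratic correlation here,
# because the centered values are symmetric around zero.
mean_t = sum(t) / len(t)
tc = [x - mean_t for x in t]
centered_corr = corr(tc, [x ** 2 for x in tc])

# Orthogonal polynomials up to quartic: every pair of columns is orthogonal.
basis = orthogonal_polynomials(t, 4)
max_off_diag = max(
    abs(dot(basis[i], basis[j])) for i in range(5) for j in range(i)
)

print(f"corr(t, t^2) raw:        {raw_corr:.4f}")
print(f"corr(t, t^2) centered:   {centered_corr:.4f}")
print(f"largest cross-product in orthogonal basis: {max_off_diag:.2e}")
```

The orthogonal columns span the same space as the raw powers, so the fitted values of a regression on them are unchanged; only the coefficients and their standard errors are expressed in a better-conditioned basis.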