# Re: st: Multicollinearity and Orthogonalization

 From: Maarten Buis | To: statalist@hsphsun2.harvard.edu | Subject: Re: st: Multicollinearity and Orthogonalization | Date: Wed, 20 Aug 2008 13:44:37 +0100 (BST)

```
--- Erasmo Giambona <e.giambona@gmail.com> wrote:
> One thing remains unclear to me. I thought (perhaps wrongly) that
> one problem with multicollinearity is that if y1 and y2 are highly
> correlated (e.g., > 0.9), then their coefficient estimates in
> regression can get "artificially" alternate signs (e.g., + and - or
> vice versa). To me is not clear yet whether you suggest that Stata
> would not suffer from this problem or whether I should orthogonalize
> in this case.

If your explanatory variables are highly correlated then their
coefficients will be estimated with low precision; in other words,
there will be large differences in estimated coefficients across
samples. This may mean that even the sign changes from sample to
sample. However, this is not a problem, as this is exactly what the
confidence interval is designed to warn you about. It will tell you
that if you were to draw a new sample you could just as well find a
coefficient with the opposite sign from the one in your current
sample. So, as in any other analysis, you will have to look at both
the point estimate and the confidence interval/standard error/
p-value. As long as you do that, you will draw the correct conclusion.

I wouldn't put too much emphasis on the differences across packages.
These differences are not in the statistics, but in the way the
packages interact with the computer. A statistical technique involves
computations, and while doing the computations a computer needs to
store numbers. You cannot store numbers with an infinite number of
digits, so you will have to round while computing. You can do that
smartly or less smartly, and when you do it less smartly your
computations can go wrong. Very high multicollinearity can be a
situation where doing the computations less smartly causes problems.
Fortunately Stata does these computations smartly. Most other
statistical packages are pretty good as well, and I have heard nothing
The packages I would really worry about are the ones that aren't
specifically designed for statistical analysis, like Microsoft Excel.

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
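
The point about sign changes and confidence intervals can be checked with a small simulation. What follows is only a sketch and not part of the original post: it is written in Python rather than Stata, and the sample size, the true coefficients (0.2 each), the correlation of 0.95, the seed, and the use of 1.96 as the critical value are all assumptions chosen for illustration. The helper names `fit_b1` and `simulate` are hypothetical. The sketch draws many samples, fits the regression in each, and counts how often the estimate of the first coefficient has the wrong sign, how often it has the wrong sign while its 95% confidence interval also excludes zero, and how often the interval covers the true value.

```python
import math
import random


def fit_b1(x1, x2, y):
    """OLS of y on a constant, x1 and x2; returns (b1, standard error of b1)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centering the variables removes the intercept from the normal equations.
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * c for a, c in zip(c1, cy))
    s2y = sum(b * c for b, c in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(c1, c2, cy))
    sigma2 = rss / (n - 3)  # three estimated parameters: constant, b1, b2
    return b1, math.sqrt(sigma2 * s22 / det)


def simulate(rho, reps=1000, n=200, beta1=0.2, beta2=0.2, seed=12345):
    """Repeatedly sample data with corr(x1, x2) = rho and summarise b1."""
    rng = random.Random(seed)
    wrong_sign = wrong_sign_significant = covered = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [rho * a + math.sqrt(1 - rho * rho) * rng.gauss(0, 1) for a in x1]
        y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        b1, se = fit_b1(x1, x2, y)
        # Normal approximation to the 95% interval (an assumption; n is large).
        lower, upper = b1 - 1.96 * se, b1 + 1.96 * se
        covered += lower <= beta1 <= upper
        wrong_sign += b1 < 0
        wrong_sign_significant += upper < 0
    return {
        "wrong_sign": wrong_sign / reps,
        "wrong_sign_significant": wrong_sign_significant / reps,
        "coverage": covered / reps,
    }


high = simulate(rho=0.95)  # strongly collinear regressors
low = simulate(rho=0.0)    # uncorrelated regressors
print("rho = 0.95:", high)
print("rho = 0.00:", low)
```

Under these assumptions one would expect the wrong sign to show up much more often with the collinear regressors, while the share of samples in which the wrong sign is also "significant" stays small and the interval coverage stays near its nominal level. That is the sense in which the confidence interval does its job even when the point estimate flips sign.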