Statalist



RE: st: Multicollinearity and Orthogonalization


From   jverkuilen <jverkuilen@gc.cuny.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Multicollinearity and Orthogonalization
Date   Wed, 20 Aug 2008 20:10:03 -0400

Maarten Buis <maartenbuis@yahoo.co.uk> wrote:

> If your coefficients are highly correlated then they will be measured
> with low precision, in other words there will be large differences in
> estimated coefficients across samples. This may mean that even the sign
> changes from sample to sample. However, this is not a problem, as this
> is exactly what the confidence interval is designed to warn you about.
> It will tell you that if you were to draw a new sample you could just
> as well find a coefficient that has the opposite sign as in your
> current sample. So like any other analysis you will have to look at
> both the point estimate and the confidence interval/standard error/
> p-value. As long as you do that, you will draw the correct conclusion
> from your data.

One way of looking at it is that for two predictors correlated at 0.9, there's not much value added by having both variables rather than just one. Orthogonalization will, essentially, average the two for one component and take what's left orthogonal to the average for the other... which here isn't much.
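To put a rough number on "isn't much", here is a minimal sketch. It assumes both predictors are standardized and that the orthogonalization behaves like a principal-component rotation, which is an assumption made for illustration and not necessarily what any particular package does. Under that assumption the two components have variances 1 + r and 1 - r. The function name `pc_variances` is made up for this example.

```python
def pc_variances(r):
    """Variances of the two orthogonal components of two standardized
    predictors with correlation r, i.e. the eigenvalues of [[1, r], [r, 1]].
    The first belongs to the (scaled) sum, the second to the (scaled) difference."""
    return 1.0 + r, 1.0 - r

r = 0.9  # correlation between the two predictors, as in the post
var_sum, var_diff = pc_variances(r)

# Fraction of the total variance carried by the "what's left" component
share_diff = var_diff / (var_sum + var_diff)

print(f"average-like component variance: {var_sum:.2f}")
print(f"leftover component variance:     {var_diff:.2f}")
print(f"leftover share of total:         {share_diff:.1%}")
```

With r = 0.9 the leftover component carries about 5% of the total variance, so the second variable adds little beyond the first.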


> I have heard nothing
> about SAS that would make me particularly worried about that package.

SAS is solid. 

Any program that naively solves the normal equations suffers from "condition squaring", which, surprisingly enough, squares the condition number. That is bad because the condition number indicates how severe the multicollinearity is: the higher the worse, with 1 being optimal. If you have low condition numbers, it won't matter, but that usually only happens in designed experiments, where you can almost always get a 1 through a balanced design. Any reasonable package should not solve the normal equations directly. The "Cadillac" solution is the QR decomposition.
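A small constructed example may make the squaring concrete. The sketch below uses a made-up 3 x 2 design matrix X with two nearly identical columns (this matrix and the helper names `gram` and `dot` are illustrative assumptions, not anything from a real package). Forming X'X in double precision loses the tiny difference between the columns, so the cross-product matrix comes out exactly singular, while a QR-style factorization applied to X itself still sees that the columns differ. The QR step here is plain Gram-Schmidt for brevity; production software would more likely use Householder reflections.

```python
import math

eps = 1e-8
# Two nearly collinear columns; X has full column rank.
X = [[1.0, 1.0],
     [eps, 0.0],
     [0.0, eps]]

def gram(X):
    """Cross-product matrix X'X, as a naive normal-equations solver would form it."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Condition numbers from the exact singular values sqrt(2 + eps^2) and eps
cond_X = math.sqrt(2.0 + eps**2) / eps   # roughly 1.4e8
cond_XtX = cond_X**2                     # roughly 2e16, beyond double precision

# Normal-equations route: 1 + eps^2 rounds to 1, so X'X becomes [[1, 1], [1, 1]]
G = gram(X)
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]

# QR route (Gram-Schmidt on the columns of X): R[1][1] stays nonzero
a1 = [row[0] for row in X]
a2 = [row[1] for row in X]
r11 = math.sqrt(dot(a1, a1))
q1 = [a / r11 for a in a1]
r12 = dot(q1, a2)
v = [b - r12 * q for b, q in zip(a2, q1)]
r22 = math.sqrt(dot(v, v))

print(f"cond(X)    ~ {cond_X:.3g}")
print(f"cond(X'X)  ~ {cond_XtX:.3g}")
print(f"det(X'X) computed in floating point: {det}")
print(f"R[1][1] from QR:                     {r22:.3g}")
```

The determinant of the computed X'X is zero, so the normal equations cannot be solved at all, whereas the second diagonal entry of R is small but clearly nonzero.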


> The packages I would really worry about are the ones that aren't
> specifically designed for statistical analysis, like Microsoft Excel.

Wise man. :)

JV
