
# st: Can multicollinearity problems be resolved by using residuals from another regression?

From: "A. Shaul" <3c5171@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: Can multicollinearity problems be resolved by using residuals from another regression?
Date: Fri, 9 Nov 2012 03:36:08 +0100

```
Dear Statalist,

I expect a non-linear effect of an exogenous variable, x1, on a
dependent variable, y. The variable x1 is affected by another
exogenous variable, x2. The variable x2 affects x1 directly and also y
directly. The variable x1 does not affect x2. I am only interested in
the partial effect of x1 on y while controlling for x2 --- or at least
while controlling for the part of the variation in x2 that affects y
directly.

I have the following regression equation:

(1)   y = b1*x1 + b2*(x1)^2 + b3*x2 + constant

Although I get the expected estimates of b1 and b2, they are
insignificant. They are, however, significant if I exclude x2. I
believe this is the result of collinearity between x1 and x2 because
x1 is affected by x2. I have tried to resolve the problem by first
running the regression

(2)   x2 = a1*x1 + constant

and then generating the variable x2_res consisting of the residuals
from regression (2). I have then modified regression model (1) by
substituting x2 with x2_res, i.e. I then estimate the model:

(3)   y = b1*x1 + b2*(x1)^2 + b3*x2_res + constant

The coefficients b1 and b2 are now significant. This is also the case
if I use a polynomial of degree n > 2 in x1 in model (2).
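
The steps above can be sketched in Stata roughly as follows. This is
only an illustration: it assumes the variables are stored as y, x1 and
x2, and x1sq is a name introduced here for the squared term.

    * squared term for the non-linear effect of x1
    generate x1sq = x1^2

    * model (1): x2 entered directly as a control
    regress y x1 x1sq x2

    * model (2): regress x2 on x1 and keep the residuals
    regress x2 x1
    predict x2_res, residuals

    * model (3): x2 replaced by its residuals from model (2)
    regress y x1 x1sq x2_res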

My thinking is that controlling for x2_res corresponds to controlling
for the part of the variation of x2 that is not affecting x1.

Does this make sense?

In order not to flood the list, I would like to thank you very much in