
Re: st: Multicollinearity test

From   Rosie Chen <>
Subject   Re: st: Multicollinearity test
Date   Fri, 5 Feb 2010 11:21:05 -0800 (PST)

Thanks, Maarten. The two situations you explained make sense to me. There are two other situations: (3) x1 and x2 are inter-related, but there is no clear direction to the relationship; that is, there is no theory to identify which factor causes which. (4) x1 and x2 are not related theoretically, although they are statistically correlated. 

My questions are: (1) How should we deal with the third situation? 
(2) The fourth situation should be a multicollinearity problem; what should we do if the findings from the correlation and VIF tests are not consistent? 
Suggestions from others on this listserv are welcome too. Thanks,
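[Editor's note on question (2): a pairwise correlation and a VIF summarize different regressions, so they need not agree. A minimal Python sketch (hypothetical numbers, not the poster's data) of why r > 0.5 is perfectly compatible with VIF < 2:]

```python
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing predictor j
# on all the *other* predictors. A pairwise correlation only looks at two
# variables at a time, so the two diagnostics can easily disagree.

def vif_from_r2(r2):
    """Variance inflation factor implied by the auxiliary R-squared."""
    return 1.0 / (1.0 - r2)

# If x1 were the only predictor correlated with x3, the auxiliary
# R-squared would just be the squared pairwise correlation:
r = 0.5                      # a "high" pairwise correlation
print(vif_from_r2(r ** 2))   # 1.333..., i.e. VIF well below 2
```

So a correlation above 0.5 on its own implies a VIF of only about 1.33; a VIF near common cutoffs (5 or 10) requires a much stronger joint relationship among the predictors.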


----- Original Message ----
From: Maarten buis <>
Sent: Fri, February 5, 2010 1:34:52 PM
Subject: Re: st: Multicollinearity test

--- On Fri, 5/2/10, Rosie Chen wrote:
>     I have two models: one is to run the model
> with x1, x2, and x3 as predictors, and the other is to take
> x3 out and run the model with x1 and x2 only at the third
> level. In the first model, only x2 is statistically
> significant, but in the second model both x1 and x2 are
> significant after x3 is taken away. The second model's results
> make more sense than those of the first. I ran a
> correlation test and found that x3 is highly correlated with
> x1 (r > 0.5 and p < 0.01). But the VIF test on the first
> model (a linear one-level model) does not show a
> multicollinearity problem for the x3 variable (VIF < 2). 
>    My question is: should I use the VIF test or the correlation
> test to identify the possible multicollinearity problem, if
> the two tests' results are not consistent, as indicated above? 

Neither. The real issue is whether you believe that x3 is an 
intervening or a confounding variable. Consider a simplified
version of your model: a dependent variable y is influenced 
by two variables x1 and x2, and you are mainly interested
in x1. 

x2 is a confounding variable when it also causes x1.
A classic example: if you try to explain the birth rate in
various areas by the number of storks in those areas, you will
find a positive effect unless you control for the degree of
urbanization (rural areas have both more storks and higher
birth rates than urban areas). The degree of urbanization is
thus a confounding variable, and you have to control for it
regardless of whether that results in a multicollinearity
problem or not. 
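[Editor's note: the storks example can be checked numerically. In this pure-Python sketch (all coefficients invented for illustration), urbanization drives both stork density and the birth rate while storks have no causal effect at all; the naive regression still finds a positive stork "effect", and controlling for urbanization removes it:]

```python
import random

random.seed(1)
n = 10_000

# Hypothetical data generating process: urbanization causes BOTH stork
# density and the birth rate; storks have no effect on births.
urban  = [random.random() for _ in range(n)]
storks = [5 - 4 * u + random.gauss(0, 1) for u in urban]
births = [30 - 15 * u + random.gauss(0, 2) for u in urban]

def slope(x, y):
    """Simple-regression slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def resid(x, z):
    """Residuals of x after regressing it on z (plus a constant)."""
    b = slope(z, x)
    mz, mx = sum(z) / len(z), sum(x) / len(x)
    return [xi - (mx + b * (zi - mz)) for xi, zi in zip(x, z)]

# Naive regression: storks appear to raise the birth rate (positive slope).
print(slope(storks, births))

# Controlling for urbanization (Frisch-Waugh: regress residuals on
# residuals): the spurious stork effect vanishes, slope is near zero.
print(slope(resid(storks, urban), resid(births, urban)))
```

The residual-on-residual regression gives the same coefficient on storks as a multiple regression of births on storks and urbanization, which is what Stata's -regress- would report.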

x2 is an intervening variable if it is caused by x1. A classic 
example (in sociology) is that parental status influences the 
education of the offspring, which in turn influences the 
offspring's status. If we want to know the effect of parental
status on the status of the offspring, then we want to include
this indirect effect through education: it is the part of the
effect of parental status that we can explain, which makes us
doubly sure that the effect really exists. If we controlled for
education, we would take away the part of the effect that we
understand and be left with only the unexplained part of the
effect of parental status. So in the case of an intervening
variable we do not want to control for it (unless we want to
decompose the total effect into direct and indirect effects),
and this, too, holds regardless of whether there is a
multicollinearity problem or not.
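[Editor's note: the parental-status example can be sketched the same way, with made-up coefficients: the regression without education recovers the total effect (direct plus indirect), while controlling for the intervening variable leaves only the direct part:]

```python
import random

random.seed(2)
n = 10_000

# Hypothetical mediation chain: parental status -> education -> offspring
# status, plus a direct path. All coefficients are invented:
# education = 0.6 * parent; status = 0.3 * parent + 0.5 * education,
# so the total effect of parent is 0.3 + 0.6 * 0.5 = 0.6.
parent = [random.gauss(0, 1) for _ in range(n)]
educ   = [0.6 * p + random.gauss(0, 1) for p in parent]
status = [0.3 * p + 0.5 * e + random.gauss(0, 1)
          for p, e in zip(parent, educ)]

def slope(x, y):
    """Simple-regression slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def resid(x, z):
    """Residuals of x after regressing it on z (plus a constant)."""
    b = slope(z, x)
    mz, mx = sum(z) / len(z), sum(x) / len(x)
    return [xi - (mx + b * (zi - mz)) for xi, zi in zip(x, z)]

# Without the intervening variable: the total effect (about 0.6).
print(slope(parent, status))

# Controlling for education: only the direct, "unexplained" part
# (about 0.3) remains.
print(slope(resid(parent, educ), resid(status, educ)))
```

This is the decomposition mentioned in parentheses above: total effect 0.6 = direct 0.3 + indirect 0.6 * 0.5 through education.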

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen


*   For searches and help try:

© Copyright 1996–2020 StataCorp LLC