
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Identical VIF values


From   Phil Clayton <philclayton@internode.on.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Identical VIF values
Date   Wed, 23 Feb 2011 17:35:07 +1100

The VIF is calculated by regressing each x variable on the other x variables:

VIF = 1/(1-R2) where R2 is the R-squared value for that x's regression on the other x variables

So when you only have 2 x variables, you're regressing x1 on x2 and x2 on x1. Both regressions have the same R-squared (the squared correlation between x1 and x2), and hence the same VIF.
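
To see this directly, here is a minimal sketch using the auto dataset that ships with Stata. That dataset is an assumption for illustration only; it is not the data from the question, and the scalar names vif_mpg and vif_weight are made up for this example.

```stata
* Sketch only: uses Stata's bundled auto data, not the poster's dataset
sysuse auto, clear

* Two-predictor model; estat vif should report the same VIF for both
regress price mpg weight
estat vif

* Auxiliary regression 1: mpg on weight
regress mpg weight
scalar vif_mpg = 1/(1 - e(r2))

* Auxiliary regression 2: weight on mpg
regress weight mpg
scalar vif_weight = 1/(1 - e(r2))

* Both auxiliary regressions have one regressor, so their R-squared
* values coincide and the two hand-computed VIFs should agree
display scalar(vif_mpg)
display scalar(vif_weight)
```

The same check works on the numbers quoted below: the pairwise correlation of -0.0833 squares to about 0.0069, which is the R-Squared that collin reports for both variables.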

Phil

On 23/02/2011, at 4:12 PM, Dave Wilson wrote:

> Hi List,
> 
> I'm working with a sample dataset and attempting to bend the VIF function (or the alternative, collin) to my will. Here's the situation:
> 
> The two IVs, yearsdg and c_market, run just fine through the regression:
> 
>> . regress y yearsdg c_market
>> 
>> ...
>> 
>> ------------------------------------------------------------------------------
>>            y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>      yearsdg |   979.4583   34.22053    28.62   0.000     912.2281    1046.689
>>     c_market |   39630.46   2131.883    18.59   0.000     35442.12    43818.79
>>        _cons |   35905.22   611.2754    58.74   0.000      34704.3    37106.14
>> ------------------------------------------------------------------------------
> 
> I should now be able to run the "estat vif" command. It runs successfully, but, oddly, I get identical VIF and tolerance values:
> 
>> . estat vif
>> 
>>     Variable |       VIF       1/VIF
>> -------------+----------------------
>>      yearsdg |      1.01    0.993055
>>     c_market |      1.01    0.993055
>> -------------+----------------------
>>     Mean VIF |      1.01
> 
> This doesn't make any sense to me. The two variables aren't highly correlated:
> 
>> . pwcorr yearsdg c_market
>> 
>>              |  yearsdg c_market
>> -------------+------------------
>>      yearsdg |   1.0000
>>     c_market |  -0.0833   1.0000
> 
> Odder still, if I add another variable to the regression model, the vif function runs just fine:
> 
>> . estat vif
>> 
>>     Variable |       VIF       1/VIF
>> -------------+----------------------
>>      yearsdg |      2.51    0.398715
>>     c_market |      1.01    0.989573
>>    yearsrank |      2.49    0.401212
>> -------------+----------------------
>>     Mean VIF |      2.00
> 
> This also reveals that the two-variable model causes yearsdg to "borrow" the vif and tolerance values from c_market (the values are really those of c_market). 
> 
> For final information, if I run the just-downloaded collin function, the results are the same:
> 
>> . collin yearsdg c_market
>> (obs=514)
>> 
>>  Collinearity Diagnostics
>> 
>>                        SQRT                   R-
>>  Variable      VIF     VIF    Tolerance    Squared
>> ----------------------------------------------------
>>   yearsdg      1.01    1.00    0.9931      0.0069
>>  c_market      1.01    1.00    0.9931      0.0069
>> ----------------------------------------------------
>>  Mean VIF      1.01
> 
> What is going on? Is there a reason the VIF function doesn't work with just two variables?
> 
> Thanks in advance for the help!
> 
> ---
> Dave
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



