Statalist



st: Re: Detecting collinearity during regression analysis


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Detecting collinearity during regression analysis
Date   Tue, 10 Feb 2009 20:43:27 +0100


My understanding of collinearity is that if two covariates are highly correlated, then one of them conveys little information over and above what the other has already contributed to the regression. Collinearity shows up as inflated standard errors rather than as small point estimates, so associating small point estimates with collinearity might be misleading. Watch the VIFs instead:

*********
clear*
set obs 10000
*DGP with weakly correlated covariates (r = .1)
corr2data x1 x2, corr(1, .1\ .1, 1) cstorage(full)
g y=1+2*x1+3*x2+rnormal()
reg y x1 x2
*small vifs
estat vif
*DGP with highly correlated covariates (r = .999)
clear*
set obs 10000
corr2data x1 x2, corr(1, .999\ .999, 1) cstorage(full)
g y=1+2*x1+3*x2+rnormal()
reg y x1 x2
*see the massive vifs
estat vif
**********
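
As a rough cross-check on the two runs above: with exactly two regressors, the textbook VIF for each one is 1/(1 - r^2), where r is the correlation between them. This formula is the standard definition and is not stated in the post itself, so treat the lines below as a sketch for checking the output of estat vif by hand.

```stata
* VIF with two regressors: 1/(1 - r^2), r = correlation between x1 and x2
* first DGP, r = .1
display 1/(1 - .1^2)
* second DGP, r = .999
display 1/(1 - .999^2)
```

Since corr2data reproduces the requested correlations in the sample, the first value should be close to 1 and the second close to 500, which is the contrast the example is meant to show.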

HTH
Martin
_______________________
----- Original Message -----
From: "Anon Mouse" <anon556656@live.ca>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, February 10, 2009 8:28 PM
Subject: st: Detecting collinearity during regression analysis


Hello and thank you in advance,

I have a question about detecting collinearity.

First, see my example:





***BEGIN***

sysuse auto, clear
xi: qreg mpg foreign i.make, nolog

***END***






Note that in this example I used quantile regression to estimate the effects of foreign and make on mpg.

I used the xi command to create dummy variables for make (note that this creates a very large number of dummies, as make is a string variable that takes a different value for each car, but I did this for the sake of the example).

Note that the coefficients are very small (e.g. values displayed as e-15), effectively zero.

In this example, does this indicate collinearity?  And why?
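
One way to see how many dummies i.make generates is to compare the number of observations with the number of distinct values of make. The lines below are a minimal sketch, assuming the auto dataset shipped with Stata; r(N) and r(r) are the results that tabulate leaves behind.

```stata
sysuse auto, clear
* one-way table of make, output suppressed
quietly tabulate make
* r(N) = observations used, r(r) = number of distinct values (rows)
display "observations: " r(N) "   distinct makes: " r(r)
```

If the two numbers are equal, the model has a dummy for every car, so foreign is an exact linear combination of the make dummies and there is nothing left over to estimate it from.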



The reason I used such a granular set of dummies for make is to highlight my example. In my real data, I have age and wage. When I use categories such as age/10 or wage/10000, I get "collinearity" (i.e. very small coefficients). When I collapse these age or wage categories into fewer categories (i.e. age or wage as binary variables, above or below a certain value), the problem of "collinearity" goes away.

Am I correct in my assumptions?

Thank you.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

