
Re: st: Collinearity in svy

From   Richard Williams <>
Subject   Re: st: Collinearity in svy
Date   Fri, 02 May 2008 11:13:09 -0500

At 08:21 AM 5/2/2008, Simon, Alan (CDC/CCHIS/NCHS) wrote:
The website essentially suggests using each variable as a dependent
variable in a separate regression  using all other variables as
independent variables, and then using the following command:

display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))

to calculate the variance inflation factor.  However, this only seems to
work if the dependent variable is continuous and the regression is OLS.

Is there a way to measure the variance inflation factor for categorical
variables in a complex survey design?  Or is there a better way to
approach this problem?

Personally, I see no problem with that. Multicollinearity is a problem with the right-hand side of the model, i.e. the Xs. It doesn't matter whether Y itself will be analyzed via OLS regression, logistic regression, or whatever. For example, in a non-svy setting, if y were a dichotomy that you planned to analyze via logistic regression, it would nonetheless be fine to do something like

regress y x1 x2 x3

You are not interested in the coefficients from the regression; you are just interested in the collinearity diagnostics that vif reports after it.
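
As a rough illustration of this approach, the sketch below uses Stata's bundled auto dataset. The dataset and variables are assumptions chosen for the example, not part of the original question. The outcome foreign is a 0/1 variable that would normally be modelled with logit; regress is run only so that the VIFs can be requested afterwards. In current Stata the postestimation command is spelled estat vif. The last two lines reproduce the hand calculation quoted above for a single predictor.

```stata
* Illustrative sketch only: the auto dataset and these variables are assumptions
sysuse auto, clear

* foreign is a dichotomy; the coefficients from this regression are ignored
regress foreign weight length mpg

* Variance inflation factors for the right-hand-side variables
estat vif

* The same quantity by hand for one predictor (weight), using the
* auxiliary regression of that predictor on the other Xs
regress weight length mpg
display "tolerance = " 1-e(r2) " VIF = " 1/(1-e(r2))
```

The VIF displayed by the last line should match the value estat vif reports for weight, since neither calculation involves the dependent variable.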

One caveat: I am not sure if the use of svy somehow invalidates or complicates the usual collinearity diagnostics. But at the same time, it is not like these diagnostics have to be accurate down to 12 decimal places. You usually just want to get a ballpark estimate of whether or not collinearity is a problem in your data.

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
