
Notice: On April 23, 2014, Statalist moved from an email list to a forum.


Re: st: collinearity in categorical variables

From   "JVerkuilen (Gmail)"
Subject   Re: st: collinearity in categorical variables
Date   Thu, 25 Apr 2013 14:29:15 -0400

On Thu, Apr 25, 2013 at 2:17 PM, Mitchell F. Berman wrote:
> Stata Users:
> We are working on a logistic regression model with both continuous and
> categorical independent variables.
> I'm familiar with collin to generate VIF and condition index.  But my
> impression and information on the internet suggests that collin is not
> appropriate for categorical variables.
> What would people use to evaluate collinearity (probably not the correct
> term in this case) for categorical variables.

I'm not 100% sure what you mean. Are you checking for linear
dependence issues among right hand side variables? If so, properly
coded categorical variables (e.g., dummies) are certainly reasonable
to check with collinearity diagnostics; nothing in the math precludes
it. You can have two problems with logistic regression, separation and
collinearity. Stata detects perfect separation and throws an error,
but near separation can be really awful to track down. It usually
shows up as ludicrously large standard errors on a regression
coefficient.

One common trick for collinearity is to fit the same model with regress
and then compute the VIFs. VIFs are functions of the X variables only,
so it doesn't matter that regress isn't the model you actually want to
run. It's not quite perfect, but it's a reasonable start.
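
For what it's worth, here is a minimal sketch of the regress-then-VIF
trick. It uses the auto dataset that ships with Stata purely as a
stand-in; the outcome and predictors (foreign, mpg, weight, i.rep78) are
chosen only for illustration and are not the model from the original
question. estat vif is the official postestimation command after regress.

```stata
* Illustration only: auto.dta stands in for your own data
sysuse auto, clear

* Fit the linear version of the model. The coefficients are not of
* interest here; the point is only to get diagnostics on the X variables.
regress foreign mpg weight i.rep78

* Variance inflation factors for each right-hand-side term,
* including the indicator variables generated by i.rep78
estat vif
```

A common rule of thumb treats a VIF above about 10 as worth a closer
look, though that is a convention rather than a formal test.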

The diagnostic information available after logit is quite nice.

JVVerkuilen, PhD

“He uses statistics as a drunken man uses lamp-posts – for support
rather than illumination.”--Andrew Lang
