Re: st: collinearity in categorical variables

Thu, 25 Apr 2013 14:29:15 -0400

On Thu, Apr 25, 2013 at 2:17 PM, Mitchell F. Berman <mfb1@columbia.edu> wrote:
> Stata Users:
>
> We are working on a logistic regression model with both continuous and
> categorical independent variables.
>
> I'm familiar with collin to generate VIF and condition index. But my
> impression and information on the internet suggests that collin is not
> appropriate for categorical variables.
>
> What would people use to evaluate collinearity (probably not the correct
> term in this case) for categorical variables?

I'm not 100% sure what you mean. Are you checking for linear dependence among the right-hand-side variables? If so, properly coded categorical variables (e.g., dummies) are certainly reasonable to check with collinearity diagnostics; nothing in the math precludes it.

You can have two problems with logistic regression: separation and collinearity. Stata detects perfect separation and reports it (as a note about dropped variables and observations, or as an error), but near separation can be really awful to track down. It usually shows up as ludicrously large standard errors on a regression coefficient.

One common trick for collinearity is to fit the same model with regress and then compute the VIFs. VIFs are functions of the X variables only, so the fact that regress isn't the model you want to run doesn't matter. It's not quite perfect but it's a reasonable start. The diagnostic information available after logit is quite nice.

Jay

--
JVVerkuilen, PhD
jvverkuilen@gmail.com

“He uses statistics as a drunken man uses lamp-posts – for support rather than illumination.”--Andrew Lang
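
For anyone wanting to try the regress-then-VIF trick described above, a minimal sketch might look like the following. The variable names (y, x1, x2, group) are hypothetical placeholders, not taken from the original question, and the i. factor-variable prefix assumes Stata 11 or later.

```stata
* Hypothetical variables: y is the 0/1 outcome, x1 and x2 are continuous,
* group is a categorical predictor. None of these names come from the thread.

* The model of interest
logit y x1 x2 i.group

* Refit the same right-hand side by OLS purely to get collinearity
* diagnostics; the coefficients from this fit are not of interest
regress y x1 x2 i.group
estat vif
```

If logit dropped observations or variables because of perfect prediction, the two fits may not use the same sample, so it is probably worth comparing the N reported by each command before reading much into the VIFs.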

