Statalist The Stata Listserver


Re: st: Wald Chi-Square in Logistic with Cluster Option

From   Richard Williams
Subject   Re: st: Wald Chi-Square in Logistic with Cluster Option
Date   Sat, 11 Mar 2006 19:06:13 -0500

At 05:25 PM 3/11/2006, you wrote:
> The good news is that, assuming your logistic model specifications are
> correct, then your Wald value is OK. It may be that some of your variables
> are highly collinear with each other, and it's that that's pushing it up a
> few notches: you can check this with Richard Williams' highly useful
> -collin- post-estimation command, downloadable from SSC.

Thanks to Clive for the kind words. Alas, much as I'd like to claim credit for -collin- (along with xtabond2 and several other programs!), the actual author is Phil Ender, and you need to get it from UCLA, not SSC. Just use -findit collin- to get a copy.
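
A minimal sketch of that check, assuming the predictors are named x1, x2 and x3 (placeholder names, not from the original thread):

```stata
* Search for -collin-; install it from the link that -findit- displays
findit collin

* Collinearity diagnostics for the model's independent variables
* (x1 x2 x3 are hypothetical variable names)
collin x1 x2 x3
```

-collin- takes a list of variables rather than running off the last estimation results, so list the same predictors that appear in the logistic model.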

> The bad news is that comparing two logistic regression models, even if
> they both have some independent variables in common, is _wrong_. For the
> full reasoning, you can check out a neat .pdf file from that man again
> Williams at

Not quite. The problem comes in comparing coefficients across models: e.g., you have x1, x2 and x3 in a model, you then add x4, x5 and x6, and you observe that the coefficients for x1, x2 and x3 are quite a bit different in the two models. This is a fairly common thing to do with OLS regression models but, for reasons explained in the handout, it can be highly deceptive when doing things like logistic regression. That doesn't mean you can't run a series of models and see whether adding or deleting variables significantly affects the fit of the model.
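
One way to run such a comparison of fit is sketched below. The names y, x1-x6 and id are placeholders, and both models are assumed to be fitted on the same observations:

```stata
* Smaller model: y is the binary outcome (hypothetical names throughout)
logit y x1 x2 x3
estimates store small

* Larger model adds x4, x5 and x6
logit y x1 x2 x3 x4 x5 x6
estimates store big

* Likelihood-ratio test: does adding x4-x6 significantly improve the fit?
lrtest small big

* With cluster() a likelihood-ratio test is not appropriate; a Wald test
* of the added terms after the larger model is the usual substitute.
logit y x1 x2 x3 x4 x5 x6, cluster(id)
test x4 x5 x6
```

Either test addresses whether the added variables matter as a group, which is a different question from whether the coefficients on x1, x2 and x3 changed between the two models.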

In the case of the original problem, I am not sure what is going on. The behavior seems bizarre to me; you add a variable, and the chi-square plummets by 80,000??? You add a different variable, and it plummets by over 111,000? I suspect it has something to do with the use of clustering, but I really don't know. Besides -collin-, I might do some simple descriptive stats, e.g. crosstab Y with some of the Xs. 23 Xs is a lot; perhaps the data are being spread too thin. Maybe add variables in small groups and see if there is some point at which the chi-square goes wild, and then see if there is something odd about the variable that causes it. I'd also probably cheat and try running it without the cluster option, and see if that produces more sensible results; I believe that would suggest that clustering was somehow part of the problem.
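
The checks suggested above might look roughly like this. The variable names and the cluster identifier id are assumptions, and the groups of predictors are arbitrary:

```stata
* Crosstab the outcome with a categorical predictor; look for empty or tiny cells
tabulate y x1

* Add predictors in small groups and watch the Wald chi2 in the output header
logistic y x1 x2 x3, cluster(id)
logistic y x1 x2 x3 x4 x5 x6, cluster(id)

* Refit the last model without clustering for comparison
logistic y x1 x2 x3 x4 x5 x6
```

Note that without the cluster option Stata reports a likelihood-ratio chi-square in the header rather than a Wald chi-square, so the two runs are not directly comparable number for number; the point is only to see whether the unclustered results look sensible.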

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW (personal):
WWW (department):