# Re: st: Wald Chi-Square in Logistic with Cluster Option

From: Richard Williams

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Wald Chi-Square in Logistic with Cluster Option

Date: Sat, 11 Mar 2006 19:06:13 -0500

At 05:25 PM 3/11/2006, you wrote:
> The good news is that, assuming your logistic model specifications are
> correct, then your Wald value is OK. It may be that some of your variables
> are highly collinear with each other, and it's that that's pushing it up a
> few notches: you can check this with Richard Williams' highly useful

Thanks to Clive for the kind words. Alas, much as I'd like to claim credit for -collin- (along with xtabond2 and several other programs!), the actual author is Phil Ender, and you need to get it from UCLA, not SSC. Just use -findit collin- to get a copy.
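For anyone following along, a minimal sketch of getting and running -collin- might look like the lines below. The variable names x1, x2 and x3 are hypothetical placeholders for your own predictors, and this assumes the program has been installed from the -findit- results.

```stata
* Locate collin (user-written, by Phil Ender at UCLA); install it from the
* search results that findit opens.
findit collin

* x1, x2, x3 are hypothetical names; substitute your own predictors.
* collin takes only the independent variables, not the outcome.
collin x1 x2 x3
```

The output includes variance inflation factors and tolerances for the listed variables, which is where strongly collinear predictors should show up.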

> The bad news is that comparing two logistic regression models, even if
> they both have some independent variables in common, is _wrong_. For the
> full reasoning, you can check out a neat .pdf file from that man again
> Williams at
>
> http://www.nd.edu/%7Erwilliam/xsoc694/x04.pdf

Not quite. The problem comes in comparing coefficients across models: e.g., you have x1, x2 and x3 in a model, you then add x4, x5 and x6, and you observe that the coefficients for x1, x2 and x3 are quite a bit different in the two models. This is a fairly common thing to do with OLS regression models but, for reasons explained in the handout, it can be highly deceptive when doing things like logistic regression. That doesn't mean, however, that you can't run a series of models and see whether adding or deleting variables significantly affects the fit of the model.
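To make the distinction concrete, here is a rough sketch of comparing the fit of nested models rather than their coefficients. The names y, x1 through x6 and id are hypothetical. The likelihood-ratio test shown first assumes no clustering; with the cluster option, a Wald test on the added variables is the usual substitute.

```stata
* Hypothetical names: y is the binary outcome, x1-x6 the predictors,
* id the cluster identifier.

* Without clustering: likelihood-ratio test of the added variables.
logit y x1 x2 x3
estimates store small
logit y x1 x2 x3 x4 x5 x6
estimates store big
lrtest small big

* With clustering: fit the larger model and use a Wald test of the
* added variables instead of lrtest.
logit y x1 x2 x3 x4 x5 x6, cluster(id)
test x4 x5 x6
```

Both tests ask whether x4, x5 and x6 jointly improve the model; neither one involves comparing the x1, x2 and x3 coefficients across the two fits.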

In the case of the original problem, I am not sure what is going on. The behavior seems bizarre to me; you add a variable, and the chi-square plummets by 80,000??? You add a different variable, and it plummets by over 111,000? I suspect it has something to do with the use of clustering, but I really don't know. Besides -collin-, I might do some simple descriptive stats, e.g. crosstab Y with some of the Xs. 23 Xs is a lot; perhaps the data are being spread too thin. Maybe add variables in small groups and see if there is some point at which the chi-square goes wild, and then see if there is something odd about the variable that causes it. I'd also probably cheat and try running it without the cluster option, and see if that produces more sensible results; I believe that would suggest that clustering was somehow part of the problem.
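The checks suggested above could be sketched roughly as follows. All variable names here (y, x1 through x6, id) are hypothetical, and the grouping of predictors is arbitrary; the point is only to watch the reported Wald chi-square from one run to the next.

```stata
* Hypothetical names: y is the binary outcome, x1-x6 stand in for some
* of the 23 predictors, id is the cluster identifier.

* Simple descriptives and a crosstab of the outcome with a predictor.
summarize y x1 x2 x3
tabulate y x1

* Add predictors in small groups and note where the chi-square jumps.
logit y x1 x2 x3, cluster(id)
logit y x1 x2 x3 x4 x5 x6, cluster(id)

* Refit the same model without clustering for comparison.
logit y x1 x2 x3 x4 x5 x6
```

If the chi-square only misbehaves in the clustered runs, that would point toward clustering as part of the problem.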

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc