The Stata listserver

RE: st: RE: the impute command

From   Richard Williams
Subject   RE: st: RE: the impute command
Date   Wed, 19 May 2004 14:59:24 -0500

At 06:12 PM 5/19/2004 +0100, Nick Cox wrote:
> Agreed. The "here" was carrying a lot of weight
> in my statement. In general, -logit- is surely
> better than -regress- for predicting 0-1 variables;
> in that context -round()- should be unproblematic.

My theory is that, after recoding the predictions to 0/1, the logit and regress approaches would produce virtually identical results, with the main differences occurring when the predicted probabilities are very close to .5. I can't prove this, mind you, but I did try a quick simulation of 1000 cases with 100 missing values on y, and the 0/1 predictions were the same in 99 of the 100 cases. If I had a lot of Xs with a lot of scattered missing data, my guess is that impute would be far easier to use and would produce results very similar to those from doing it the "right" way.
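
A minimal sketch of a simulation along these lines, written in Python rather than Stata so it is self-contained. The single X, the coefficients 0.3 and 1.0, the seed, and the function name simulate are all assumptions for illustration. It is not the simulation described above and it does not call -impute-; it only compares 0/1 predictions from a linear fit and a logit fit on the cases with y set to missing.

```python
import math
import random


def simulate(n=1000, n_missing=100, seed=2004):
    """Count how often linear and logit fits give the same 0/1 imputation."""
    rng = random.Random(seed)

    # Hypothetical data: one X, with y generated from a true logit model.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(0.3 + 1.0 * xi))) else 0
         for xi in x]

    # Treat the first n_missing cases as missing on y; fit on the rest.
    x_obs, y_obs = x[n_missing:], y[n_missing:]
    x_mis = x[:n_missing]

    # Linear probability model: closed-form OLS with one predictor.
    m = len(x_obs)
    xbar = sum(x_obs) / m
    ybar = sum(y_obs) / m
    sxx = sum((xi - xbar) ** 2 for xi in x_obs)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x_obs, y_obs))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    # Logit: Newton-Raphson on the log likelihood (intercept a0, slope a1).
    a0, a1 = 0.0, 0.0
    for _ in range(25):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x_obs, y_obs):
            p = 1.0 / (1.0 + math.exp(-(a0 + a1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a0 += (h11 * g0 - h01 * g1) / det
        a1 += (h00 * g1 - h01 * g0) / det

    # Round each prediction to 0/1: cut the linear fit at .5, and cut the
    # logit at a linear predictor of 0 (a predicted probability of .5).
    ols_hat = [1 if b0 + b1 * xi >= 0.5 else 0 for xi in x_mis]
    logit_hat = [1 if a0 + a1 * xi >= 0.0 else 0 for xi in x_mis]
    agree = sum(o == l for o, l in zip(ols_hat, logit_hat))
    return agree, n_missing


agree, total = simulate()
print(f"{agree} of {total} imputed 0/1 values agree")
```

With only one X, both fits reduce to a cutoff on X, so the two can disagree only for cases that fall between the two cutoffs. That is the "very close to .5" region. With several Xs the picture is less tidy, and this sketch says nothing about that case.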

From a look at impute.ado, it seems it might be very easy to have it use -logit- instead of -regress-, except that you'd have to add error checking to make sure the y var was a dichotomy, and it might take much longer to run. Not sure it would be worth it.

Of course, I'm still fairly skeptical as to whether logit or regress should be used at all, as opposed to just doing listwise deletion.

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
