The Stata listserver

Re: st: Averages for missing values

From   "Richard. Williams"
Subject   Re: st: Averages for missing values
Date   Thu, 08 Sep 2005 18:32:07 -0500

At 06:01 PM 9/8/2005, SamL wrote:
> These methods are frowned upon?  Which method in particular is the object
> of the frown?  I guess I am puzzled, because mean substitution comes in a
> few flavors.  There is substituting the mean and leaving it at that.  I
> assume this is widely understood as problematic.  And then there is
> substituting the mean and adding a dummy variable to the model that
> indicates whether the mean has been substituted.  Is there any problem
> with this second approach?

I taught and recommended the latter method for several years - and then retracted my support a few years ago! See Allison's book on Missing Data. For the Cliff's Notes version, see pp. 5-6 of

Allison gives some great examples showing that the procedure gives biased estimates and is worse than listwise deletion. In fact, in general he shows that most of the traditional methods for handling missing data are worse than listwise deletion. You should either use listwise deletion, or go with a more advanced method like multiple imputation.
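
To make the bias concrete, here is a toy simulation. It is only a sketch and not one of Allison's own examples: the data-generating process (two correlated predictors, true coefficients of 1, half of x2 missing completely at random), the sample size, the seed, and the function names `ols` and `simulate` are all assumptions chosen for illustration. It compares the x1 coefficient from listwise deletion with the one from mean substitution plus a missing-data dummy.

```python
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[0.0] * (k + 1) for _ in range(k)]
    for r, yi in zip(rows, y):
        for i in range(k):
            for j in range(k):
                a[i][j] += r[i] * r[j]
            a[i][k] += r[i] * yi
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(a[i][c]))
        a[c], a[p] = a[p], a[c]
        for i in range(k):
            if i != c:
                f = a[i][c] / a[c][c]
                for j in range(c, k + 1):
                    a[i][j] -= f * a[c][j]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=20000, rho=0.5, p_missing=0.5, seed=12345):
    """Return (listwise, dummy-adjusted) estimates of the x1 coefficient.

    Assumed model: y = 1*x1 + 1*x2 + e, with corr(x1, x2) = rho and
    x2 missing completely at random with probability p_missing.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = rho * x1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        y = 1.0 * x1 + 1.0 * x2 + rng.gauss(0, 1)
        missing = rng.random() < p_missing  # missingness ignores the data
        data.append((y, x1, x2, missing))

    # Listwise deletion: keep only cases where x2 was observed.
    complete = [d for d in data if not d[3]]
    b_listwise = ols([[1.0, d[1], d[2]] for d in complete],
                     [d[0] for d in complete])

    # Dummy-variable adjustment: fill missing x2 with the observed mean
    # and add an indicator for "x2 was missing".
    mean_x2 = sum(d[2] for d in complete) / len(complete)
    rows = [[1.0, x1, mean_x2 if m else x2, 1.0 if m else 0.0]
            for _, x1, x2, m in data]
    b_dummy = ols(rows, [d[0] for d in data])

    return b_listwise[1], b_dummy[1]


if __name__ == "__main__":
    listwise_b1, dummy_b1 = simulate()
    print("listwise deletion  b1:", round(listwise_b1, 3))
    print("dummy adjustment   b1:", round(dummy_b1, 3))
```

Under these assumed settings the listwise estimate should land close to the true value of 1, while the dummy-adjusted estimate comes out noticeably too large. The reason is that among the cases with x2 missing, x1 picks up part of the effect of the omitted x2, and the pooled model forces a single x1 coefficient on both groups.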

I really need to update that handout some time, because it is full of advice on how to implement methods that you probably shouldn't implement! Part of the problem though is that the better methods can be problematic or hard to implement too.

Richard Williams, Associate Professor
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu