# Re: st: R: Imputation vs substitution with mean

From: James Bernard <[email protected]>
To: [email protected]
Subject: Re: st: R: Imputation vs substitution with mean
Date: Fri, 18 Oct 2013 16:04:41 +0800

Thanks Carlo,

James

> Dear James,
> sample mean substitution is a very ill-advised approach (even if you have
> few missing values).
> Even in the best case, replacing missing values with the mean of the
> observed values will understate the variance, so standard errors and every
> subsequent statistic that depends on the variance will be biased.
> Last observation carried forward (LOCF) also belongs to the "don't do it"
> group.
> You can find wide coverage of this topic in:
> - Roderick J. A. Little, Donald B. Rubin. Statistical Analysis with Missing
> Data, 2nd Edition. Wiley Series in Probability and Statistics, 2002.
> - John W. Graham. Missing Data: Analysis and Design. Springer, 2012.
> There are other textbooks on dealing with missing data; a Google search can
> give you a comprehensive picture of what has been published so far.
> Multiple imputation is the way to go; I would point you to -[MI] intro
> substantive- in the Stata .pdf manual (as usual, a very good first step).
>
> I do hope this helps.
> Best regards,
> Carlo
>
> Hi All,
>
> I have a longitudinal dataset with a count dependent variable.
> My independent variables are continuous and categorical.
>
> My independent variables have many missing values. I want to substitute the
> missing values.
>
> One way is to go through what I found out to be "imputation".
>
> I heard that an easier approach is to substitute the missing values with
> the group mean or population mean. Is this very different from the
> imputation done by the -mi- command in Stata?
>
>
> James
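
Carlo's point that mean substitution shrinks the variance can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python, not Stata code and not anything from the thread: the data values and variable names are hypothetical. Filling each gap with the observed mean leaves the sum of squared deviations unchanged while the sample size grows, so the sample variance and the standard error of the mean both come out too small.

```python
import math
import statistics

# Hypothetical data for illustration only; None marks a missing value.
x = [2.0, 4.0, None, 6.0, 8.0, None, 10.0, None]

# Complete-case summary: use only the observed values.
observed = [v for v in x if v is not None]
mean_obs = statistics.mean(observed)
var_obs = statistics.variance(observed)        # sample variance, n - 1 denominator
se_obs = math.sqrt(var_obs / len(observed))    # standard error of the mean

# Mean substitution: replace every missing value with the observed mean.
filled = [mean_obs if v is None else v for v in x]
var_filled = statistics.variance(filled)
se_filled = math.sqrt(var_filled / len(filled))

# The filled-in values add nothing to the sum of squared deviations,
# but they inflate n, so the variance and the standard error both shrink.
print(f"observed only : n={len(observed)} var={var_obs:.3f} se={se_obs:.3f}")
print(f"mean-filled   : n={len(filled)} var={var_filled:.3f} se={se_filled:.3f}")
```

The mean itself is unchanged, which is why the approach looks harmless; the damage is in the spuriously narrow standard errors. Multiple imputation avoids this by drawing imputed values with random variation and combining results across several completed datasets.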
