
Re: st: RE: Re: Missing values test

From   Maarten buis <[email protected]>
To   [email protected]
Subject   Re: st: RE: Re: Missing values test
Date   Sun, 2 Dec 2007 17:58:44 +0000 (GMT)

--- Maarten buis wrote:
> > Often three types of missing data are distinguished in this
> > literature: Missing Completely At Random (MCAR), Missing At Random
> > (MAR), and Not Missing At Random (NMAR). Multiple Imputation is
> > based on the MAR assumption.
> >
> > MCAR assumes that every individual has the same probability of
> > getting a missing value, i.e. the probability of missingness is not
> > influenced by any variable. This assumption can be investigated
> > for the observed data, in a way suggested by Nick. If you have MCAR
> > or if you can show that the probability of missingness does not
> > depend on your dependent variable, then the safe thing to do is
> > use the observed cases, as those will give unbiased estimates with
> > correct inference.
> >
> > MAR assumes that the probability of missingness may differ from
> > person to person, but these differences are only caused by observed
> > variables. In order to show that MAR holds you need to show
> > that the unobserved values of the missing variables do not
> > influence the probability of missingness, which is self-defeating:
> > if you had those unobserved values those values wouldn't be
> > missing. So this assumption is fundamentally untestable.
> >
> > NMAR assumes that the probability of missingness is influenced by
> > both observed and unobserved information. For instance, say that
> > persons with a very high or very low income are less inclined to
> > reveal their income in a questionnaire.

--- David Airey <[email protected]> wrote:
> I have trouble understanding the translation of these three missing  
> situations into when it is useful to impute.

True, that is hard. 

If you do not have many missing values on your dependent variable
(explained variable, or y), then you can create a dummy for missingness
on the independent variables (explanatory variables, or x-s) and see if
that dummy is related to y. If the dummy is not related to y, then no
imputation is needed: you will get correct estimates and inference if
you use only the fully observed observations.
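
The check described above can be sketched on simulated data. This is
only an illustration under stated assumptions, not a recipe from the
thread: it is written in Python rather than Stata, the data are
simulated from y = 1 + 2*x + e, and the function names (simulate,
missingness_check, complete_case_slope) and the missingness
probabilities are all made up for the example. It builds the
missingness dummy implicitly by splitting y into "x missing" and
"x observed", compares mean y between the two groups, and then fits
the regression on the complete cases only.

```python
import random
import statistics


def simulate(n, depends_on_y, seed=12345):
    """Simulate y = 1 + 2*x + e and delete some x values.

    If depends_on_y is False, x goes missing with a fixed probability 0.3.
    If depends_on_y is True, x goes missing with probability 0.9 when y is
    above 1 and with probability 0.1 otherwise.
    A missing x is stored as None.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 1 + 2 * x + rng.gauss(0, 1)
        if depends_on_y:
            p_miss = 0.9 if y > 1 else 0.1
        else:
            p_miss = 0.3
        x_obs = None if rng.random() < p_miss else x
        rows.append((x_obs, y))
    return rows


def missingness_check(rows):
    """Compare y between rows with missing x and rows with observed x.

    Returns the difference in mean y (missing minus observed) and a
    Welch-type t statistic for that difference.
    """
    y_miss = [y for x, y in rows if x is None]
    y_obs = [y for x, y in rows if x is not None]
    diff = statistics.mean(y_miss) - statistics.mean(y_obs)
    se = (statistics.variance(y_miss) / len(y_miss)
          + statistics.variance(y_obs) / len(y_obs)) ** 0.5
    return diff, diff / se


def complete_case_slope(rows):
    """OLS slope of y on x using only the rows where x is observed."""
    xs = [x for x, y in rows if x is not None]
    ys = [y for x, y in rows if x is not None]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    for depends_on_y in (False, True):
        rows = simulate(20000, depends_on_y)
        diff, t = missingness_check(rows)
        slope = complete_case_slope(rows)
        print(f"missingness depends on y: {depends_on_y}")
        print(f"  difference in mean y = {diff:.3f}, t = {t:.2f}")
        print(f"  complete-case slope  = {slope:.3f} (true slope is 2)")
```

When missingness does not depend on y, the two groups should have
similar mean y and the complete-case slope should be close to the true
value of 2. When missingness depends on y, the dummy is clearly related
to y and the complete-case slope is pulled away from 2.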

If the dummy is related to y, then you are dependent on theory alone.
If you think, and can convince your readers, that the probability of
missingness depends only on those of your variables that do not contain
missing values, then you can do Multiple Imputation (you are convinced
that the MAR assumption is satisfied). Notice that this list of
variables without missing values is likely to differ across individuals
in your data, making this a pretty weird assumption.

If none of these situations applies, then you are in trouble.

-- Maarten

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715
