Re: st: RE: Re: Missing values test

From   David Airey <david.airey@Vanderbilt.Edu>
Subject   Re: st: RE: Re: Missing values test
Date   Sun, 2 Dec 2007 11:39:57 -0600


I have trouble understanding how these three missing-data situations translate into when it is useful to impute.


On Dec 2, 2007, at 11:35 AM, Maarten Buis wrote:

--- Nick Cox wrote:
Missingness can always be represented by a dummy. So
the structure of missing data can always be explored by
logit regression with missingness on something as response
w.r.t. various predictors, which may well include missingness
on some other things as dummy predictors.

The problem here is that now you are talking about what is known in the
missing data literature as the Missing Completely At Random (MCAR)
assumption. Often three types of missing data are distinguished in this
literature: Missing Completely At Random (MCAR), Missing At Random
(MAR), and Not Missing At Random (NMAR). Multiple Imputation is based
on the MAR assumption.

MCAR assumes that every individual has the same probability of getting a
missing value, i.e. the probability of missingness is not influenced by
any variable. This assumption can be investigated for the observed
data, in the way suggested by Nick. If you have MCAR or if you can show
that the probability of missingness does not depend on your dependent
variable, then the safe thing to do is just use the observed cases, as
those will give unbiased estimates with correct inference.
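
A minimal sketch of the kind of check Nick describes, written in Python
rather than Stata and run on made-up data. The variable names (female,
income) and all values are hypothetical. With a single binary predictor,
the slope of a logit regression of the missingness dummy on that
predictor equals the log odds ratio of the 2x2 table, so it can be
computed by hand here. A value far from zero would suggest that
missingness depends on the predictor, which speaks against MCAR.

```python
import math

# Hypothetical toy data: (female, income) pairs; None marks a missing income.
rows = [
    (0, 30), (0, None), (0, 45), (0, 52), (0, 38), (0, 41),
    (1, None), (1, 29), (1, None), (1, 47), (1, None), (1, 35),
]

def missingness_log_odds_ratio(rows):
    """Log odds ratio of missingness for group 1 versus group 0.

    With one binary predictor this equals the slope from a logit
    regression of the missingness dummy on that predictor.
    Assumes every group has at least one observed and one missing value.
    """
    # counts[g] = [number observed, number missing] in group g
    counts = {0: [0, 0], 1: [0, 0]}
    for group, value in rows:
        counts[group][1 if value is None else 0] += 1
    # odds of being missing within each group
    odds = {g: counts[g][1] / counts[g][0] for g in counts}
    return math.log(odds[1] / odds[0])

log_or = missingness_log_odds_ratio(rows)
print(round(log_or, 3))  # 1.609, i.e. log(5): odds of missingness are 5 times higher in group 1
```

With several predictors, or continuous ones, one would fit the logit
model with a statistics package instead of a hand-built table.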

MAR assumes that the probability of missingness may differ from person
to person, but these differences are only caused by observed variables.
In order to show that MAR holds you need to show that the
unobserved values of the missing variables do not influence the
probability of missingness, which is self-defeating: if you had those
unobserved values those values wouldn't be missing. So this assumption
is fundamentally untestable.
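
To make the MAR case concrete, here is a small simulation sketch in
Python. Everything in it is an assumption chosen for illustration: a
binary observed covariate x, a variable y whose mean depends on x, and
missingness probabilities of 0.6 and 0.1 that depend on x only. Under
that setup the complete-case mean of y is biased, but the observed
cases are still representative within each level of x, which is the
property that methods relying on MAR exploit.

```python
import random

random.seed(12345)  # arbitrary seed so the run is reproducible

n = 20000
data = []
for _ in range(n):
    x = random.random() < 0.5                   # observed binary covariate
    y = random.gauss(10.0 if x else 0.0, 1.0)   # true value of y
    # MAR: the chance that y is missing depends only on the observed x
    p_missing = 0.6 if x else 0.1
    observed = random.random() >= p_missing
    data.append((x, y, observed))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Only available in a simulation: the mean of y before values went missing
true_mean = mean(y for _, y, _ in data)

# Mean of y using only the cases where y is observed
complete_case_mean = mean(y for _, y, obs in data if obs)

# Mean of y within each level of x from observed cases only,
# weighted by the share of each x level in the full sample
adjusted_mean = 0.0
for level in (False, True):
    share = mean(1.0 if x == level else 0.0 for x, _, _ in data)
    stratum_mean = mean(y for x, y, obs in data if obs and x == level)
    adjusted_mean += share * stratum_mean

print(true_mean, complete_case_mean, adjusted_mean)
```

In this setup the complete-case mean falls well below the true mean,
while the mean adjusted for x comes close to it. The simulation knows
that missingness depends on x alone; with real data that is exactly the
part that cannot be checked.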

NMAR assumes that the probability of missingness is influenced by both
observed and unobserved information. For instance say that persons with
a very high or very low income are less inclined to reveal their income
in a questionnaire.
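
The income example can be sketched as a simulation too. This is a
Python illustration under invented assumptions: incomes drawn from a
normal distribution with mean 50 and standard deviation 15, and a
non-response probability of 0.7 for incomes more than 20 away from the
mean versus 0.1 otherwise. None of these numbers come from real data.

```python
import random
import statistics

random.seed(2007)  # arbitrary seed so the run is reproducible

n = 20000
incomes = [random.gauss(50.0, 15.0) for _ in range(n)]

reported = []
for income in incomes:
    # NMAR: the chance of non-response depends on the income value itself,
    # with very high and very low incomes less likely to be reported
    p_missing = 0.7 if abs(income - 50.0) > 20.0 else 0.1
    if random.random() >= p_missing:
        reported.append(income)

true_sd = statistics.pstdev(incomes)       # spread of all incomes
reported_sd = statistics.pstdev(reported)  # spread among reported incomes only

print(round(true_sd, 1), round(reported_sd, 1))
```

Because the tails are under-reported, the spread of the reported
incomes understates the spread of all incomes, and nothing in the
reported values alone reveals that this has happened.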

-- Maarten

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

David C. Airey, Ph.D.
Pharmacology Research Assistant Professor
Center for Human Genetics Research Member

Department of Pharmacology
School of Medicine
Vanderbilt University
Rm 8158A Bldg MR3
465 21st Avenue South
Nashville, TN 37232-8548

TEL   (615) 936-1510
FAX   (615) 936-3747
