
RE: st: RE: Re: Missing values test

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	RE: st: RE: Re: Missing values test
Date	Sun, 2 Dec 2007 17:49:22 -0000

I am concerned with the structure of missingness, not how to fix it.  

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Richard Williams
Sent: 02 December 2007 17:41
To: [email protected]; [email protected]
Subject: Re: st: RE: Re: Missing values test

At 12:11 PM 12/2/2007, Nick Cox wrote:
>I've not done it myself, and this may well be obvious to those who know
>the literature, but surely more can be said.
>
>Missingness can always be represented by a dummy. So the structure of 
>missing data can always be explored by logit regression with 
>missingness on something as response w.r.t. various predictors, which 
>may well include missingness on some other things as dummy predictors.
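A minimal sketch of the idea quoted above, in plain Python rather than Stata. The variable names (income, age) and the records are hypothetical. It builds a missingness dummy for each variable and then fits a logit of one dummy on the other. With a single 0/1 predictor the maximum likelihood fit has a closed form: the intercept is the log odds of missingness when the predictor is 0, and the slope is the log odds ratio from the 2x2 table. With continuous or multiple predictors an iterative fit would be needed instead.

```python
import math

def missing_dummy(values):
    """Return 1 where a value is missing (None), else 0."""
    return [1 if v is None else 0 for v in values]

def logit_on_binary_predictor(response, predictor):
    """Closed-form logit fit of a 0/1 response on a single 0/1 predictor.

    Returns (intercept, slope) on the log-odds scale. This shortcut is
    valid only for one binary predictor (a saturated model).
    """
    counts = {(r, p): 0 for r in (0, 1) for p in (0, 1)}
    for r, p in zip(response, predictor):
        counts[(r, p)] += 1
    if min(counts.values()) == 0:
        raise ValueError("a zero cell makes the log odds undefined")
    odds_p0 = counts[(1, 0)] / counts[(0, 0)]  # odds of response=1 when predictor=0
    odds_p1 = counts[(1, 1)] / counts[(0, 1)]  # odds of response=1 when predictor=1
    return math.log(odds_p0), math.log(odds_p1 / odds_p0)

# Hypothetical records; None marks a missing value.
income = [52, None, 61, None, 48, 55, None, 70, None, 66, 58, None]
age = [34, None, 45, 29, None, 51, None, 38, None, 41, 47, None]

miss_income = missing_dummy(income)  # response: is income missing?
miss_age = missing_dummy(age)        # predictor: is age missing?

intercept, slope = logit_on_binary_predictor(miss_income, miss_age)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

A positive slope here says that income is more likely to be missing when age is also missing, which is a statement about the structure of missingness and not about how to repair it.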

I believe Nick is talking about using the missing-data (MD) dummy as your
dependent variable.  In addition, there have been proposals about using MD
dummies as independent variables, which I'll now comment on since I have
given partially incorrect responses in the past!

Cohen and Cohen proposed several years ago that you plug in the mean for
missing data and then add a MD dummy variable indicator.  Allison
discusses this technique in his green Sage book, "Missing Data".
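As a rough illustration of the construction just described (not code from Cohen and Cohen or from Allison), the sketch below fills missing values with the observed mean and builds the accompanying MD dummy. The function name and the data are hypothetical, and None stands in for a missing value.

```python
def dummy_variable_adjustment(values):
    """Mean-fill missing entries and return (filled, md_dummy).

    filled   : the original values with None replaced by the observed mean
    md_dummy : 1 where the original value was missing, else 0
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to take a mean over")
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    md_dummy = [1 if v is None else 0 for v in values]
    return filled, md_dummy

# Hypothetical predictor with two missing values.
x = [2.0, None, 4.0, 6.0, None]
x_filled, x_missing = dummy_variable_adjustment(x)
print(x_filled)
print(x_missing)
```

Both x_filled and x_missing would then enter the regression as predictors. The sketch shows only how the two variables are built; it says nothing about whether the resulting estimates are any good.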

When data exist in reality but their value is unknown (e.g. because of
nonresponse), Allison calls this technique "remarkably simple and
intuitively appealing." But unfortunately, "the method generally
produces biased estimates of the coefficients."  He says that listwise
deletion is better.

HOWEVER, as Richard Campbell recently pointed out to me, buried in the
footnotes of Allison's book is the following:

"While the dummy variable adjustment method is clearly unacceptable when
data are truly missing, it may still be appropriate in cases where the
unobserved value simply does not exist.  For example, married
respondents may be asked to rate the quality of their marriage, but that
question has no meaning for unmarried respondents.  Suppose we assume
that there is one linear equation for married couples and another
equation for unmarried couples.  The married equation is identical to
the unmarried equation except that it has (a) a term corresponding to
the effect of marital quality on the dependent variable and (b) a
different intercept.  It's easy to show that the dummy variable
adjustment method produces optimal estimates in this situation."
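The footnote's claim can be checked numerically. The sketch below is plain Python with made-up data and a small hand-rolled OLS solver, so all names and numbers are assumptions for illustration. It fits the pooled regression of the outcome on a constant, the mean-filled marital quality, and the MD dummy, and compares it with a regression run on married respondents only. Because the dummy gives the unmarried group its own free constant, the pooled intercept and slope should match the married-only fit, and the pooled prediction for unmarried respondents should equal their mean outcome.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: marital quality does not exist (None) for the unmarried.
quality = [1.0, 2.0, 3.0, 4.0, 5.0, None, None, None, None]
outcome = [3.1, 4.9, 7.2, 8.8, 11.0, 2.0, 2.5, 3.0, 2.5]

observed = [q for q in quality if q is not None]
mean_q = sum(observed) / len(observed)

# Pooled fit: constant, mean-filled quality, MD dummy (1 = unmarried).
pooled_rows = [[1.0, mean_q if q is None else q, 1.0 if q is None else 0.0]
               for q in quality]
b0, b1, b2 = ols(pooled_rows, outcome)

# Married-only fit: constant and quality.
married = [(q, y) for q, y in zip(quality, outcome) if q is not None]
a0, a1 = ols([[1.0, q] for q, _ in married], [y for _, y in married])

unmarried_y = [y for q, y in zip(quality, outcome) if q is None]
unmarried_mean = sum(unmarried_y) / len(unmarried_y)

print(f"pooled:       intercept={b0:.4f} slope={b1:.4f} dummy={b2:.4f}")
print(f"married only: intercept={a0:.4f} slope={a1:.4f}")
print(f"unmarried: fitted={b0 + b1 * mean_q + b2:.4f} mean={unmarried_mean:.4f}")
```

The match does not depend on filling in the mean specifically: any constant fill value gives the same slope and married intercept, with the dummy coefficient absorbing the difference.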

