
Re: st: RE: Re: Missing values test

From	"Rodrigo A. Alfaro" <[email protected]>
To	<[email protected]>
Subject	Re: st: RE: Re: Missing values test
Date	Mon, 3 Dec 2007 22:26:14 -0300

Following up on this discussion, with an additional question.

I want to highlight one of the references mentioned here: Allison, P. (2001), "Missing Data"
http://www.amazon.com/Missing-Quantitative-Applications-Social-Sciences/dp/0761916725. The book has a very good introduction to MAR, MCAR and so on. In particular, the first 50 pages provide general background on the topic.

Paul himself wrote a SAS macro, available on his webpage (http://www.ssc.upenn.edu/~allison/), that uses the EM/IP algorithm based on the multivariate normal to generate multiple imputations. I do not have SAS, but I understand that the macro is written in the SAS/IML language, which I gather is very similar to Mata in Stata.

Now the questions: (1) Has anyone translated that SAS/IML code into Mata? (2) Is anyone planning to do it? I would like to contact that person, because I need the code and I want to attempt the translation.

Rodrigo.

----- Original Message -----
From: "Maarten buis" <[email protected]>
To: <[email protected]>
Sent: Sunday, December 02, 2007 3:23 PM
Subject: Re: st: RE: Re: Missing values test

--- Richard Williams <[email protected]> wrote:

Cohen and Cohen proposed several years ago that you plug in the mean
for missing data and then add a MD dummy variable indicator.  Allison
discusses this technique in his green Sage book, "Missing Data".
When data exist in reality but their value is unknown (e.g. because
of nonresponse), Allison calls this technique "remarkably simple and
intuitively appealing." But unfortunately, "the method generally
produces biased estimates of the coefficients."  He says that
listwise deletion is better.

The logic behind the advice against this dummy method is simple:

Say we have one explained variable (y) and two explanatory variables
(x1 and x2) and the following regression equation is correct:

y = b0 + b1 x1 + b2 x2 + e

Now assume some of the values of x2 are missing, and that we applied
this dummy method. So, we replace those missing values with the mean
of the observed values (m2). Call this new variable x2'. We also add a
dummy (D) which is 1 when x2 is missing and 0 otherwise. So we get the
following regression equation:

y = b0 + b1 x1 + b2 x2' + b3 D + e
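
As an illustration of the construction just described, here is a minimal Python sketch that builds x2' and D from a variable with missing values. The function name dummy_adjust and the use of None to mark a missing value are assumptions made for this example; they do not come from the thread.

```python
from statistics import mean

def dummy_adjust(x2):
    """Return (x2_prime, D): x2 with missing values set to the observed
    mean m2, and an indicator D that is 1 where x2 was missing."""
    observed = [v for v in x2 if v is not None]
    m2 = mean(observed)  # mean of the non-missing values only
    x2_prime = [m2 if v is None else v for v in x2]
    D = [1 if v is None else 0 for v in x2]
    return x2_prime, D

x2 = [1.0, None, 3.0, None, 5.0]
x2_prime, D = dummy_adjust(x2)
print(x2_prime)  # [1.0, 3.0, 3.0, 3.0, 5.0]
print(D)         # [0, 1, 0, 1, 0]
```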

This equation looks fine for those individuals without missing values:

y = b0 + b1 x1 + b2 x2 + b3 0 + e
 = b0 + b1 x1 + b2 x2 + e

But for those individuals with missing data the equation looks wrong:

y = b0 + b1 x1 + b2 m2 + b3 1 + e
 = b0' + b1 x1 + e
(where b0' = b0 + b2 m2 + b3)

So for these two sets of individuals different regression models are
estimated: one including a control for x2 and one not. Moreover,
notice that the effect of x1 is constrained to be the same in both
equations. So this effect is some mixture between the effect controlled
for x2 and not controlled for x2. This should in most cases worry you.
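
The mixing of the two effects can be checked with a small simulation. The sketch below is in plain Python, and every setting in it is an assumption chosen for illustration: the true coefficients (b0 = 1, b1 = 2, b2 = 3), the correlation between x1 and x2, the 50% missingness completely at random, and the helper name ols. It compares the estimate of b1 from the dummy method with the one from listwise deletion.

```python
import random

def ols(X, y):
    """OLS coefficients via the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

rng = random.Random(12345)
n = 5000
b0, b1, b2 = 1.0, 2.0, 3.0  # assumed true coefficients

x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.7 * v + rng.gauss(0, 1) for v in x2]  # x1 is correlated with x2
y = [b0 + b1 * a + b2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
D = [1 if rng.random() < 0.5 else 0 for _ in range(n)]  # 1 = x2 missing (MCAR)

# Dummy method: fill in the observed mean m2 and add D as a regressor
m2 = sum(v for v, d in zip(x2, D) if d == 0) / D.count(0)
x2p = [m2 if d else v for v, d in zip(x2, D)]
dummy_fit = ols([[1.0, a, b, d] for a, b, d in zip(x1, x2p, D)], y)

# Listwise deletion: keep only the cases where x2 is observed
keep = [i for i in range(n) if D[i] == 0]
listwise_fit = ols([[1.0, x1[i], x2[i]] for i in keep], [y[i] for i in keep])

print("b1, dummy method:     ", round(dummy_fit[1], 2))
print("b1, listwise deletion:", round(listwise_fit[1], 2))
```

Under these assumptions the listwise estimate should land close to the true value of 2, while the dummy-method estimate should be pulled towards the larger effect of x1 that is not controlled for x2.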

HOWEVER, as Richard Campbell recently pointed out to me, buried in
the footnotes of Allison's book is the following:

"While the dummy variable adjustment method is clearly unacceptable
when data are truly missing, it may still be appropriate in cases
where the unobserved value simply does not exist.  For example,
married respondents may be asked to rate the quality of their
marriage, but that question has no meaning for unmarried
respondents.  Suppose we assume that there is one linear equation for
married couples and another equation for unmarried couples.  The
married equation is identical to the unmarried equation except that
it has (a) a term corresponding to the effect of marital quality on
the dependent variable and b) a different intercept.  It's easy to
show that the dummy variable adjustment method produces optimal
estimates in this situation."

This footnote can be understood with the logic just laid out.
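
To make the footnote concrete, the sketch below simulates the case where the value does not exist rather than being unobserved. All names and numbers are assumptions for illustration: marital quality q exists only for married respondents, the two groups share the effect of x1 (b1 = 2) but have different intercepts, and b2 = 3 is the effect of q among the married. Here the dummy method fits exactly the model that generated the data.

```python
import random

def ols(X, y):
    """OLS coefficients via the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

rng = random.Random(2007)
n = 4000
b1, b2 = 2.0, 3.0  # assumed true effects of x1 and of marital quality

married = [rng.random() < 0.6 for _ in range(n)]
x1 = [rng.gauss(0, 1) for _ in range(n)]
# q is defined only for married respondents; None means "does not exist"
q = [0.5 * a + rng.gauss(0, 1) if mar else None for a, mar in zip(x1, married)]
# Same slope on x1 in both groups, different intercepts, q enters only if married
y = [(1.0 + b1 * a + b2 * qi if mar else 4.0 + b1 * a) + rng.gauss(0, 1)
     for a, qi, mar in zip(x1, q, married)]

# Dummy method: mean-fill q for the unmarried and add the indicator D
observed = [v for v in q if v is not None]
m = sum(observed) / len(observed)
qp = [m if v is None else v for v in q]
D = [0 if mar else 1 for mar in married]
fit = ols([[1.0, a, b, d] for a, b, d in zip(x1, qp, D)], y)

print("b1 estimate:", round(fit[1], 2))
print("b2 estimate:", round(fit[2], 2))
```

In this setup both estimates should come out close to their assumed true values, because the coefficient on D only absorbs the difference in intercepts between the two groups.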

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

