Re: st: strange and differing results for mi vs. ice mlogit

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: strange and differing results for mi vs. ice mlogit
Date: Mon, 18 Oct 2010 16:32:07 +0100 (BST)

```--- On Mon, 18/10/10, Mary E. Mackesy-Amiti wrote:
> information, add an "unknown" category to the occupation variable.

I guess that part of your message got "eaten by the monster
that lives on the statalist server and eats the first line of every
statalist post".

I interpret your partial message as follows: Why not avoid multiple
imputation and add an extra category "unknown occupation" instead?
This is a very intuitive, but unfortunately often also very wrong,
suggestion.

Consider the following example: We are interested in the effect
of x on y while controlling for occupation. We have two occupation
categories, high and low. We follow your suggestion and add a
category unknown for those with missing values on occupation.
Next we create two dummies, one for high occupation and one for
the unknowns (so the reference category is low).

The following happens for complete observations:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*high + b3*0
y = b0 + b1*x + b2*high

So b1 is the effect of x while controlling for occupation.

The following happens for observations with missing values on
occupation:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*0 + b3*1
y = b0* + b1*x              (b0* = b0 + b3)

So b1 is now the effect of x while _not_ controlling for
occupation.

To make things worse, our model constrains the two b1s to be
equal, so the estimate becomes some unknown mixture of the
effect of x while controlling and while not controlling for
occupation. So adding this category leaves us worse off.

There is one exception: this approach does make sense when a
missing value is itself a substantively meaningful value. For
example, say our observations are women and the missing values
are the homemakers. Mary's solution would in effect be
equivalent to adding the unpaid "occupation" homemaker to our
occupation variable, which in many instances would make perfect
sense.

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

```
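
The argument in the message can be checked with a small simulation. What follows is only a minimal sketch, written in plain Python rather than Stata so that it runs without any extra packages. Everything in it is an assumption made for illustration and not part of the original post: the function names `ols` and `simulate`, the true coefficients (b0 = 1, b1 = 1, b2 = 2), the 50% share of high occupation, and the 50% of occupation values set to missing completely at random. Because x is correlated with occupation, the slope of x when occupation is not controlled for is larger than 1, so the "unknown" category approach should pull the estimate of b1 away from its true value, while the complete cases alone should recover it.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=20000, seed=1):
    """Return (b1 from complete cases, b1 from the 'unknown' category model).

    Assumed data-generating process (illustrative, not from the post):
    y = 1 + 1*x + 2*high + noise, with x correlated with high.
    """
    rng = random.Random(seed)
    complete, y_complete = [], []
    indicator, y_all = [], []
    for _ in range(n):
        high = 1 if rng.random() < 0.5 else 0
        x = high + rng.gauss(0, 1)            # x depends on occupation
        y = 1 + 1.0 * x + 2.0 * high + rng.gauss(0, 1)
        missing = rng.random() < 0.5          # missing completely at random
        if missing:
            # Columns: constant, x, high, unknown
            indicator.append([1, x, 0, 1])
        else:
            indicator.append([1, x, high, 0])
            complete.append([1, x, high])     # constant, x, high
            y_complete.append(y)
        y_all.append(y)
    b_complete = ols(complete, y_complete)
    b_indicator = ols(indicator, y_all)
    return b_complete[1], b_indicator[1]


b1_complete, b1_indicator = simulate()
print("b1, complete cases only:    %.3f" % b1_complete)
print("b1, with 'unknown' category: %.3f" % b1_indicator)
```

Under these assumptions the second estimate lands between the true effect of 1 and the slope that ignores occupation altogether, which is the "unknown mixture" described in the message. The size of the distortion depends on how strongly x and occupation are related and on the share of missing values, so changing those assumed values changes the result.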