


Re: st: strange and differing results for mi vs. ice mlogit

From   Maarten Buis
Subject   Re: st: strange and differing results for mi vs. ice mlogit
Date   Mon, 18 Oct 2010 16:32:07 +0100 (BST)

--- On Mon, 18/10/10, Mary E. Mackesy-Amiti wrote:
> information, add an "unknown" category to the occupation variable.

I guess that part of your message got "eaten by the monster 
that lives on the statalist server and eats the first line of every
statalist post".

I interpret your partial message as follows: why not avoid multiple
imputation and add an extra category "unknown occupation" instead?
This is a very intuitive, but unfortunately often also a very wrong
approach.

Consider the following example: We are interested in the effect
of x on y while controlling for occupation. We have two occupation
categories, high and low. We follow your suggestion and add a
category unknown for those with missing values on occupation.
Next we create two dummies, one for high occupation and one for
the unknowns (so the reference category is low).

The following happens for complete observations:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*high + b3*0
y = b0 + b1*x + b2*high 

So b1 is the effect of x while controlling for occupation.

The following happens for observations with missing values on
occupation:
y = b0 + b1*x + b2*high + b3*unknown
y = b0 + b1*x + b2*0 + b3*1
y = b0* + b1*x              (b0* = b0 + b3)

So b1 is now the effect of x while _not_ controlling for
occupation.

To make things worse, our model constrains the two b1s to be
equal, so the estimated b1 becomes some sort of unknown mixture
of the effect of x while controlling and while not controlling
for occupation. So by adding this category we have made things
worse.
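
The argument above can be checked with a small simulation. What follows
is only a sketch under assumptions that are mine and not part of the
original example: the true effect of x is 1, the effect of high
occupation is 2, x is on average 1 higher in the high group, and
occupation is missing completely at random for half of the
observations. The function names ols and simulate are hypothetical, and
the code is written in plain Python rather than Stata so that it runs
without any extra packages.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def simulate(n=20000, seed=12345):
    """Return the estimated b1 from three models (all settings are assumptions)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        high = 1 if rng.random() < 0.5 else 0        # true occupation
        x = high + rng.gauss(0, 1)                   # x depends on occupation
        y = 1.0 * x + 2.0 * high + rng.gauss(0, 1)   # true b1 = 1, b2 = 2
        missing = rng.random() < 0.5                 # occupation unobserved
        data.append((x, y, high, missing))

    # 1. Complete cases only: y = b0 + b1*x + b2*high
    cc = [(x, y, h) for x, y, h, m in data if not m]
    b_cc = ols([[1, x, h] for x, _, h in cc], [y for _, y, _ in cc])[1]

    # 2. "Unknown" category: y = b0 + b1*x + b2*high + b3*unknown,
    #    where high is set to 0 whenever occupation is missing
    b_ind = ols([[1, x, 0 if m else h, 1 if m else 0] for x, _, h, m in data],
                [y for _, y, _, _ in data])[1]

    # 3. Observations with missing occupation only: y = b0* + b1*x
    ms = [(x, y) for x, y, _, m in data if m]
    b_unadj = ols([[1, x] for x, _ in ms], [y for _, y in ms])[1]

    return b_cc, b_ind, b_unadj

b_cc, b_ind, b_unadj = simulate()
print("complete cases, controlling for occupation:", round(b_cc, 3))
print("all cases, with an 'unknown' category:     ", round(b_ind, 3))
print("missing cases, not controlling:            ", round(b_unadj, 3))
```

Under these assumptions the complete-case estimate should be close to
the true value of 1, while the estimate from the model with the
"unknown" category should fall somewhere between that and the
uncontrolled estimate, which is the mixture described above.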

There is one exception: this approach does make sense when a
missing value is itself a substantively meaningful value. For
example, say our observations are women and the missing values
are the homemakers. Mary's solution would in effect be
equivalent to adding the unpaid "occupation" homemaker to our
occupation variable, which in many instances would make perfect
sense.

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen

