Statalist



st: RE: RE: mean, mode or median for missing values


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: mean, mode or median for missing values
Date   Wed, 3 Feb 2010 17:48:46 -0000

There is another take, orthogonal to several good points. 

You could impute in all sorts of different ways and then compare the
results with each other and with those from the incomplete data. 

The ideal outcome is clearly that you reach the same conclusion each
time, in which case the choice of method does not matter. I can't tell
you what to conclude if you get very different results, except that the
missing data then stand in the way of firm conclusions.
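
A minimal sketch of that comparison, written in Python rather than Stata only to keep it self-contained. The responses are made up for illustration, and `impute` is a hypothetical helper, not anything from Stata or from this thread. It fills the gaps by mean, median and mode and sets each result beside the incomplete (complete-case) data.

```python
import statistics

# Hypothetical 5-point Likert responses; None marks a missing value.
responses = [1, 2, 2, 3, None, 4, 4, 4, 5, None, 5, 3]

# The incomplete data: only the values actually observed.
observed = [r for r in responses if r is not None]


def impute(values, fill):
    """Replace every None with a single fill value."""
    return [fill if v is None else v for v in values]


# One fill value per univariate method, each computed from the observed values.
fills = {
    "mean": statistics.mean(observed),
    "median": statistics.median(observed),
    "mode": statistics.mode(observed),
}

# Collect (mean, sample SD) for the incomplete data and for each filled version.
summary = {
    "complete cases": (statistics.mean(observed), statistics.stdev(observed))
}
for name, fill in fills.items():
    filled = impute(responses, fill)
    summary[name] = (statistics.mean(filled), statistics.stdev(filled))

for name, (m, s) in summary.items():
    print(f"{name:>14}: mean={m:.3f} sd={s:.3f}")
```

With real data the same side-by-side layout would apply to whatever estimate the analysis is actually about (a regression coefficient, say), not just the mean and SD used here.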

For "likert" read "Likert", passim. 

Nick 
n.j.cox@durham.ac.uk 

Verkuilen, Jay

>My variable is measured by likert scales, and there are a few
missing values. I am thinking of using mean, mode or median to
substitute the missing values. Which is better given the ordinal
nature of the measurement?<

None of the above. Univariate location imputation (mean, median or
mode) seriously distorts relationships with the rest of the dataset and
also pushes the estimate of scale downwards. If you have ordinal data
like that, it is often the case that using multiple imputation (MI) and
simply treating your discrete data as interval works well enough (you
may need to do some transformation, e.g., logging or square rooting
counts). If you need
things to be back in the discrete scaling, simply round off fractions to
the nearest integer and/or truncate to push things back into the sample
space. It's not perfect but it's a lot better than univariate mean
imputation. 
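
Two of the points above can be illustrated with a small Python sketch on toy data. It is not an implementation of MI. The paired items `x` and `y`, the helper names `pearson` and `to_sample_space`, and the 1 to 5 response range are all assumptions made for the example. The first part shows mean imputation weakening the correlation with another variable. The second shows rounding and truncating fractional imputed values back onto the discrete scale.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def to_sample_space(value, low=1, high=5):
    """Round half up to the nearest integer, then truncate into [low, high]."""
    return min(high, max(low, math.floor(value + 0.5)))


# Hypothetical paired Likert items; None marks a missing response on x.
x = [1, 2, None, 3, 4, None, 5, 5]
y = [1, 2, 1, 3, 4, 5, 4, 5]

# Correlation using only the pairs where x was observed.
pairs = [(a, b) for a, b in zip(x, y) if a is not None]
r_complete = pearson([a for a, _ in pairs], [b for _, b in pairs])

# Correlation after filling the gaps in x with the observed mean of x.
x_mean = sum(a for a, _ in pairs) / len(pairs)
x_filled = [x_mean if a is None else a for a in x]
r_mean_imputed = pearson(x_filled, y)

print(f"r, complete pairs: {r_complete:.3f}")
print(f"r, mean-imputed:   {r_mean_imputed:.3f}")

# Hypothetical fractional values, standing in for imputations made on an
# interval scale, pushed back onto the 1..5 response scale.
raw_imputations = [0.2, 2.5, 3.4, 5.7]
print([to_sample_space(v) for v in raw_imputations])  # [1, 3, 3, 5]
```

`math.floor(value + 0.5)` is used instead of the built-in `round` because Python's `round` sends ties to the nearest even integer, which would turn 2.5 into 2.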

