Statalist



Re: st: [renamed] Factor Analysis and Missing Data


From   Maarten Buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: [renamed] Factor Analysis and Missing Data
Date   Tue, 2 Feb 2010 01:02:41 -0800 (PST)

> Jet wrote:
> > I have 20 items for factor analysis, but some items
> > have missing values. <snip> So, there are three
> > options. <snip>
> >       1. Use the item mean of the nonmissing cases to
> > substitute the missing value, and then conduct the
> > factor analysis, and calculate the factor score.

I would not use that. The reason is discussed in this
post: <http://www.stata.com/statalist/archive/2007-12/msg00504.html>
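One well-known problem with mean substitution, which may or may not be
the main point of the linked post, is that it shrinks the variance of
the filled-in item and distorts its correlations with the other items,
and those correlations are what a factor analysis works from. The
sketch below illustrates this on made-up data; the names item1 and
item2 and the pattern of missing values are assumptions for
illustration only, not Jet's data.

```python
from math import sqrt
from statistics import mean, variance


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical items: item2 is exactly 2 * item1, so the true
# correlation is 1. Three values of item2 are missing (None).
item1 = list(range(1, 11))
item2 = [2 * x for x in item1]
for i in (0, 3, 7):
    item2[i] = None

# Complete cases only.
pairs = [(x, y) for x, y in zip(item1, item2) if y is not None]
obs1 = [x for x, _ in pairs]
obs2 = [y for _, y in pairs]

# Mean substitution: replace each missing value by the observed mean.
fill = mean(obs2)
item2_filled = [fill if y is None else y for y in item2]

var_observed = variance(obs2)
var_filled = variance(item2_filled)
r_complete = pearson(obs1, obs2)
r_filled = pearson(item1, item2_filled)

print("variance of item2, observed cases:", round(var_observed, 2))
print("variance of item2, mean-substituted:", round(var_filled, 2))
print("correlation, complete cases:", round(r_complete, 2))
print("correlation, mean-substituted:", round(r_filled, 2))
```

In this toy example the variance of item2 falls from about 34.67 to
about 23.11 after mean substitution, and its correlation with item1
falls from 1 to about 0.79, even though the two items are perfectly
related in the observed cases.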

> >       2. Use multiple imputations to impute the value for
> > each item with missing values. (But what variables
> > should be used for the imputation?)

You should at least use all the items in the imputation model, 
and do the factor analysis on the imputed data. If your data 
contain variables that you believe influence, or are influenced 
by, the latent factors, then including those in the imputation 
model can also help. If the final aim of your analysis is to 
study the relationship between the latent factors and other 
variables, then those other variables should also be included 
in the imputation model.


> >       3. Use the items without missing values to
> > calculate factor score, and then impute the missing factor
> > score. (But what variables should be used for imputing the
> > factor score?)

I wouldn't do that. The idea of multiple imputation is to 
extrapolate patterns of association to the missing data, so I 
would stick close to the real observed variables in the 
imputation model.

--- On Tue, 2/2/10, Michael Norman Mitchell wrote:
>  <snip> consider Mplus, which can use
> the EM algorithm to obtain estimates on all observations
> without the need for imputation.

This is not a bad idea. The two approaches should give similar 
results, as they are based on very similar ideas (see, for 
instance, Schafer 1997). The problem with these missing data 
models is that it is very easy to make mistakes and hard to 
find them, so having two models that should give the same 
results can be useful for spotting mistakes.

Hope this helps,
Maarten

J.L. Schafer (1997) Analysis of Incomplete Multivariate Data.
Boca Raton: Chapman & Hall/CRC.

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------


      
