st: missing data for LCA, cross-sectional complex survey data

From	"Cabrera, Peter" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: missing data for LCA, cross-sectional complex survey data
Date	Wed, 7 Jul 2010 04:40:27 +0000

Hello,
 
First-time poster here. I am relatively new to Stata and am still learning about the various approaches to handling missing data. My questions pertain mostly to statistics rather than coding. I am using Stata version 11.1.
 
I performed a "baseline" exploratory LCA using Mplus version 5.2. Mplus handled the missing data on the latent class indicators using FIML, and I obtained a 3-class solution. However, there was substantial missingness on the polytomous covariates I wanted to include in the model (ranging from 5% to 20%, MAR). Following the guidelines in Applied Survey Data Analysis by Heeringa, West & Berglund, I used Royston's MI ICE command in Stata 11.1 to impute the missing data. The imputation model included all of the analysis variables as well as auxiliary variables, as recommended by Heeringa et al., the UCLA Statistical Computing website, and almost every other source I encountered, including the Stata Journal. (I have also ordered 2 texts on Amazon devoted entirely to handling missing data and I am eagerly awaiting their arrival. If anyone can recommend something akin to the Complete Idiot's Guide to Missing Data, I would be forever in your debt.) Next, I imported the imputed data sets (M=10) into Mplus. I then ran the baseline latent class model again using the imputed data, but this time I obtained a 3-class solution with wildly different proportions for each latent class.
 
I have since updated to Mplus version 6, and I am receiving virtually the same baseline 3-class solution as I obtained using MI ICE in Stata 11.1. If I grok what I have been reading, MI is generally superior to FIML approaches to handling missing data. But I still shouldn't be obtaining such vastly different results for the baseline LC model, no? My questions are:
 
(1) Given the differences in the baseline LC solutions when using FIML vs MI, is it safe to assume that I must have seriously screwed up when I specified my missing data models?
(2) Is there an alternate universe where it might be acceptable to use MI only for the covariates in the LC model while using FIML to handle missingness on the LC indicators? (I suspect the answer is a resounding "no" and that by even posing such a question I deserve nothing but scorn and derision. As a non-statistician, I figured it couldn't hurt to ask, however. Please be gentle.)
 
Thank you for your consideration. 
 
Sincerely,

P. Cabrera
