Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Jacqueline Jodl" <jmj2138@tc.columbia.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: mi of time variant measures in longitudinal data using mvn and chained approach
Date: Mon, 29 Jul 2013 09:56:21 -0400

Dear Statalist,

I am really struggling with imputing my longitudinal dataset. My fundamental problem is not time INVARIANT measures; it is time VARIANT measures.

A good example of successfully imputed time INVARIANT measures is the pair of cognitive test scores from middle school, PIATmath and PIATread. I have 4188 respondents, and I am missing between 500 and 600 values for each score. The mvn approach imputed them with no issues:

. mi impute mvn PIATmath PIATread = TANF highgrade highgrademom race male mombirthage birthorder,
>  add(5)

Performing EM optimization:
note: 496 observations omitted from EM estimation because of all imputation variables missing
  observed log likelihood = -21249.41 at iteration 5

Performing MCMC data augmentation ...

Multivariate imputation                     Imputations =        5
Multivariate normal regression                    added =        5
Imputed: m=1 through m=5                        updated =        0

Prior: uniform                               Iterations =      500
                                                burn-in =      100
                                                between =      100

------------------------------------------------------------------
                   |               Observations per m
                   |----------------------------------------------
          Variable |   Complete   Incomplete   Imputed |     Total
-------------------+-----------------------------------+----------
          PIATmath |       3674          514       514 |      4188
          PIATread |       3618          570       570 |      4188
------------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled-in observations.)

.
end of do-file

A good example of a time VARIANT measure that I have yet to impute successfully is the TIPI score (a personality measure) for extraversion. My data are in wide form. (As stated above, I have a total of 4188 respondents.) This measure was included in three of my nine waves of data: 2006, 2008 and 2010. The question is ONLY asked of respondents who are over age 19.

In 2006 I have 3646 observations with no missing data (the difference between 4188 and 3646 represents valid skips). In 2008 the question was asked again, but only of those who were missed in 2006; in 2008 there are 194 observations, with 7 missing values (none of these respondents overlap with 2006). In 2010 I have 3026 observations (for most of these respondents this is their second observation), with 44 missing values.

I checked tabs/codebook before and after recoding for soft and hard missing data to make sure the recoding syntax worked. Again, my data are in wide format. I tried both the mvn and the chained approach.

THIS IS THE ERROR I RECEIVED FOR mvn:

. mi impute mvn tipiextra_2008 tipicrit_2008 tipiselfdis_2008 tipianx_2008 tipiopen_2008
>  tipires_2008 tipisym_2008 tipidisorgan_2008 tipicalm_2008 tipiconv_2008 ///
>  tipiextra_2010 tipicrit_2010 tipiselfdis_2010 tipianx_2010 tipiopen_2010 tipires_2010
>  tipisym_2010 tipidisorgan_2010 tipicalm_2010 tipiconv_2010 ///
>  = tipiextra_2006 tipicrit_2006 tipiselfdis_2006 tipianx_2006 tipiopen_2006 tipires_2006
>  tipisym_2006 tipidisorgan_2006 tipicalm_2006 tipiconv_2006 TANF highgrade highgrademom race
>  male mombirthage birthorder, add(5)
note: variables tipianx_2010 tipires_2010 tipicalm_2010 contain no soft missing (.) values;
      imputing nothing
no observations
                 stata():  3598  Stata returned error
        _Mis_Est::init():     -  function returned error
        _DA_Norm::init():     -  function returned error
                 <istmt>:     -  function returned error
r(3598);

end of do-file

THIS IS THE ERROR I RECEIVED WITH CHAINED:

Performing chained iterations ...
tipianx_2010: missing imputed values produced
    This may occur when imputation variables are used as independent variables or when
    independent variables contain missing values. You can specify option force if you wish to
    proceed anyway.
r(498);

end of do-file

I CHECKED AND RECHECKED TO MAKE SURE NONE OF MY INDEPENDENT VARIABLES CONTAIN MISSING VALUES. I THINK THE PROBLEM IS THAT MI WANTS TO IMPUTE UP TO 4188 OBSERVATIONS FOR TIME VARIANT MEASURES BECAUSE MY SAMPLE CONTAINS 4188 RESPONDENTS. 4188 IS THE RIGHT NUMBER FOR TIME INVARIANT MEASURES, BUT FOR TIME VARIANT MEASURES THE NUMBER OF VALID OBSERVATIONS VARIES BY WAVE AND BY MEASURE.

Any advice would be greatly appreciated.

A completely despondent doctoral student,
Jackie Jodl

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
