Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: mi of time variant measures in longitudinal data using mvn and chained approach

| | |
|---|---|
| From | "Jacqueline Jodl" |
| To | |
| Subject | st: mi of time variant measures in longitudinal data using mvn and chained approach |
| Date | Mon, 29 Jul 2013 09:56:21 -0400 |

```Dear Statalist,

I am struggling to impute my longitudinal dataset.

My fundamental problem is not time INVARIANT measures; it is time variant
measures.

A good example of successfully imputed time INVARIANT measures is the pair
of cognitive test scores from middle school, PIATmath and PIATread. I have
4188 respondents and am missing between 500 and 600 values for each score.
The mvn approach imputed them with no issues.

mombirthage birthorder,

Performing EM optimization:
note: 496 observations omitted from EM estimation because of all imputation
variables missing
observed log likelihood =  -21249.41 at iteration 5

Performing MCMC data augmentation ...
Multivariate imputation                     Imputations =        5
Multivariate normal regression                    added =        5
Imputed: m=1 through m=5                        updated =        0

Prior: uniform                               Iterations =      500
                                                burn-in =      100
                                                between =      100

------------------------------------------------------------------
                   |               Observations per m
                   |----------------------------------------------
          Variable |   Complete   Incomplete   Imputed |     Total
-------------------+-----------------------------------+----------
          PIATmath |       3674          514       514 |      4188
          PIATread |       3618          570       570 |      4188
------------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
 of the number of filled-in observations.)
.
end of do-file

A good example of a time VARIANT measure that I have yet to impute
successfully is the TIPI score (personality measure) for extraversion. My
data is in wide form.  (As stated above, I have a total of 4188 respondents.)

This measure was included in three of my nine waves of data: 2006, 2008 and
2010.  The question is ONLY asked of respondents who are over age 19.

In 2006 I have 3646 observations with no missing data (the difference
between 4188 and 3646 represents valid skips). In 2008 the question was
asked again, but only of those who were missed in 2006; in 2008 there are
194 observations with 7 missing values (none of these respondents overlap
with 2006).  In 2010 I have 3026 observations (for most of these respondents
this is their second observation) with 44 missing values.

I checked tabs/codebook before and after recoding for soft and hard missing
data to make sure the recoding syntax worked.

Again, my data is in wide format.  I tried both the mvn and the chained
approaches.

THIS IS THE ERROR I RECEIVED FOR mvn:

. mi impute mvn tipiextra_2008 tipicrit_2008 tipiselfdis_2008 tipianx_2008 tipiopen_2008 tipires_2008 tipisym_2008 tipidisorgan_2008 tipicalm_2008 tipiconv_2008 ///
> tipiextra_2010 tipicrit_2010 tipiselfdis_2010 tipianx_2010 tipiopen_2010 tipires_2010 tipisym_2010 tipidisorgan_2010 tipicalm_2010 tipiconv_2010 ///
> = tipiextra_2006 tipicrit_2006 tipiselfdis_2006 tipianx_2006 tipiopen_2006 tipires_2006 tipisym_2006 tipidisorgan_2006 tipicalm_2006 tipiconv_2006 TANF highgrade
note: variables tipianx_2010 tipires_2010 tipicalm_2010 contain no soft missing (.) values;
      imputing nothing
no observations
stata():  3598  Stata returned error
_Mis_Est::init():     -  function returned error
_DA_Norm::init():     -  function returned error
<istmt>:     -  function returned error
r(3598);
end of do-file

THIS IS THE ERROR I RECEIVED WITH CHAINED:

Performing chained iterations ...
tipianx_2010: missing imputed values produced
    This may occur when imputation variables are used as independent variables
    or when independent variables contain missing values.  You can specify
    option force if you wish to proceed anyway.
r(498);
end of do-file

I CHECKED AND RECHECKED TO MAKE SURE NONE OF MY INDEPENDENT VARIABLES
CONTAIN MISSING VALUES.

I THINK THE PROBLEM IS THAT MI WANTS TO IMPUTE TO 4188 OBSERVATIONS FOR TIME
VARIANT OBSERVATIONS BECAUSE MY SAMPLE SIZE CONTAINS 4188 RESPONDENTS.  SO
4188 IS THE NUMBER TO USE FOR TIME INVARIANT OBSERVATIONS, BUT FOR TIME
VARIANT OBSERVATIONS, IT VARIES BY WAVE AND BY MEASURE.

Any advice would be greatly appreciated.

A completely despondent doctoral student,
Jackie Jodl

```
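
The post describes checking for soft and hard missing values and for missing values among the independent variables. The block below is a minimal diagnostic sketch of that kind of check in Stata, not a fix and not the poster's own syntax. It uses the variable names from the post; the wildcards assume no other variables in the dataset match those patterns, and `n_rhs_missing` and `n_lhs_observed` are hypothetical names introduced only for illustration.

```stata
* Diagnostic sketch only. Variable names are taken from the post; the
* wildcards assume nothing else in the dataset matches these patterns.

* 1. Soft (.) versus extended (.a-.z) missing counts for each variable.
*    In the output, "Obs=." counts soft missing and "Obs>." counts
*    extended (hard) missing; option all also lists complete variables.
misstable summarize tipi*_2006 tipi*_2008 tipi*_2010 TANF highgrade, all

* 2. How many respondents have every right-hand-side variable observed?
*    rowmiss() counts soft and extended missing values alike.
*    (n_rhs_missing is a hypothetical helper variable.)
egen byte n_rhs_missing = rowmiss(tipi*_2006 TANF highgrade)
count if n_rhs_missing == 0

* 3. Of those, how many also have at least one observed 2008 or 2010 item?
*    (n_lhs_observed is a hypothetical helper variable.)
egen byte n_lhs_observed = rownonmiss(tipi*_2008 tipi*_2010)
count if n_rhs_missing == 0 & n_lhs_observed > 0
```

A zero or very small count in step 3 would be consistent with the "no observations" message from mvn, since the post states that the 2008 respondents do not overlap with 2006. That reading is an assumption, though, and the post does not confirm it.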