# Re: st: Multiple imputation for longitudinal data

From: Stas Kolenikov
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Multiple imputation for longitudinal data
Date: Thu, 2 Dec 2010 17:39:53 -0600

```You have monotone missing data, and you would most likely be better
off using methods designed for monotone missing data rather than
bluntly relying on multiple imputation. Check Little and Rubin's book
on missing data, chapter 7 (in the 2nd edition).
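
* A minimal sketch (untested on these data) of checking the monotone
* pattern and imputing with a monotone-aware method in Stata 11.
* Assumptions: a hypothetical outcome y that is complete at visit 1,
* and hypothetical time-constant, fully observed covariates age and
* sex; only visits 1-3 are shown. Note that mi impute monotone is
* still imputation, not the factored-likelihood approach described
* by Little and Rubin.
keep pid visit y age sex
keep if visit <= 3
* One row per subject: y becomes y1, y2, y3
reshape wide y, i(pid) j(visit)
* Confirm that the missing-value pattern is nested (monotone)
misstable nested y2 y3
mi set wide
mi register imputed y2 y3
mi register regular y1 age sex
* Sequential regression imputation in monotone order
mi impute monotone (regress) y2 y3 = y1 age sex, add(20)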

On Thu, Dec 2, 2010 at 5:11 PM, Eduardo Nunez <enunezb@gmail.com> wrote:
> Dear Statalisters,
>
> I have Stata 11.1 (MP - Parallel Edition).
>
> I am interested in performing multiple imputation on longitudinal
> data (several variables, each with 1-15% missing values),
> where subjects are the cluster units with few observations over time.
> See below the data structure:
>
> xtdes, pattern(1000)
>
>     pid:  1, 2, ..., 1438                                   n =       1432
>   visit:  1, 2, ..., 12                                     T =         12
>           Delta(visit) = 1 unit
>           Span(visit)  = 12 periods
>           (pid*visit uniquely identifies each observation)
>
> Distribution of T_i:   min      5%     25%       50%       75%     95%     max
>                         1       1       1         2         3       6      12
>
>     Freq.  Percent    Cum. |  Pattern
>  ---------------------------+--------------
>      650     45.39   45.39 |  1...........
>      359     25.07   70.46 |  11..........
>      202     14.11   84.57 |  111.........
>       91      6.35   90.92 |  1111........
>       52      3.63   94.55 |  11111.......
>       44      3.07   97.63 |  111111......
>       11      0.77   98.39 |  1111111.....
>        9      0.63   99.02 |  11111111....
>        6      0.42   99.44 |  111111111...
>        4      0.28   99.72 |  1111111111..
>        3      0.21   99.93 |  11111111111.
>        1      0.07  100.00 |  111111111111
>  ---------------------------+--------------
>     1432    100.00         |  XXXXXXXXXXXX
>
> The article included in the Stata FAQ ("How can I account for clustering
> when creating imputations with mi impute?") suggests using a
> "multivariate normal model to impute all clusters simultaneously"
> (strategy 3), although it mentions that this is best suited to
> balanced repeated-measures data.
>
> Clearly, my data are not balanced. Moreover, the percentage of missing
> data increases as patient follow-up gets farther from baseline.
>
> Is there any other method suited to this type of longitudinal data?
> If not, how stringent is the limitation of not being balanced?
>
> Please, any help is welcome!
>
>
> Eduardo
