# Re: st: multiply imputed values for outcome

From: "Austin Nichols"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: multiply imputed values for outcome
Date: Tue, 11 Jul 2006 09:45:40 -0400

```
Maarten and Leslie--
I was *guessing* that Leslie was using test score data output from an
IRT model where five plausible values of an underlying ability
parameter (called theta, essentially like a fixed effect from a logit
estimated on panel data, but assumed to be normally distributed in the
population) are used to create five plausible values for number of
items correct on a complete test when each person is taking a version
of the test that contains a subset of the questions.  Perhaps this is
no different than multiple imputation in the case of missing data
(where all of the values of the outcome variable are imputed, for
every individual), but perhaps the (assumed) extra information about
the distribution of the values leads one to a different estimation
strategy--I would be happy to hear from better-informed list members
on this subject!
--Austin

On 7/11/06, Maarten Buis <maartenbuis@yahoo.co.uk> wrote:
```
```--- "Leslie R Hinkson asked:
> I have a data set that has multiply imputed values (5) for the
> outcome variable. I have previously used HLM software to conduct my
> analysis but I was told that with the new GLM features in Stata 9
> that it should be possible to do the same in Stata. Unfortunately, I
> haven't found that way yet. Any thoughts?

> If you used HLM, you may want -xtmixed- or -gllamm- (-ssc install
> gllamm-) but I don't know about the "five plausible values".

The trick with multiple imputation is that you impute multiple plausible
values for each missing value, thus creating multiple "complete"
datasets. Next you estimate your model on each "complete" dataset, just
as if it were a real complete dataset. In your case you would
then have five sets of estimates. The final point estimates are the
means over these five sets of estimates, and the final standard error is
built from two components: the mean variance (squared standard error) of
each estimate and the variance between sets. The formulas can be found
at: http://www.stat.psu.edu/~jls/mifaq.html#howto . J. B. Carlin, N.
Li, P. Greenwood, & C. Coffey have written tools for analyzing multiply
imputed datasets that implement these formulas (-findit mifit-), but
I don't think they include -xtmixed- or -gllamm-. However, these
formulas are pretty simple and can be implemented by hand if need be.

--- "Leslie R Hinkson also asked:
> Also, is it possible to conduct standard linear regression analysis
> with multiple plausible values for the dependent variable using
> Stata 9?

-mifit- can handle standard linear regression analysis.

HTH,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z214

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

___________________________________________________________
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
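The combining step Maarten describes (average the point estimates, then build the standard error from the within- and between-imputation variance) is short enough to sketch by hand. The block below is a minimal illustration in Python rather than Stata, and it is not taken from -mifit-. The function name `combine_estimates` and the example numbers are hypothetical. It assumes the usual form of Rubin's rules, in which the between-imputation variance is inflated by (1 + 1/m); the post itself only says "the variance between sets", so check the linked FAQ before relying on it.

```python
import math


def combine_estimates(estimates, std_errors):
    """Combine one coefficient across m imputed datasets (Rubin's rules).

    estimates  -- the m point estimates, one per "complete" dataset
    std_errors -- the m standard errors, in the same order
    Returns (pooled point estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates and matching standard errors")

    # Final point estimate: mean over the m sets of estimates.
    pooled = sum(estimates) / m

    # Within-imputation variance: mean of the squared standard errors.
    within = sum(se ** 2 for se in std_errors) / m

    # Between-imputation variance: sample variance of the m estimates.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)

    # Total variance, with the (1 + 1/m) correction assumed above.
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient and standard error from five plausible-value runs.
pooled, pooled_se = combine_estimates(
    [1.0, 2.0, 3.0, 2.0, 2.0],
    [0.5, 0.5, 0.5, 0.5, 0.5],
)
```

In practice one would run the model (for example -regress- or -xtmixed-) once per plausible value, collect each coefficient and its standard error, and apply the same calculation coefficient by coefficient.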