Thanks for the references, Rodrigo. The easiest way to install -mim- might be to type, in Stata:

net from
net install mira

As with Brian's earlier post, your message did not capture the attention of the users. As Steve pointed out, you have to work with multiple datasets generated by some multiple imputation procedure. I changed the subject line to capture the attention of more users; many of them do not work in education, but they do work with the statistical procedures that you need.

About your question: Rubin (1987) provides simple rules for combining the results from the different datasets. There is very good information on those formulas and more at ~jls/mifaq.html. You will note that it is "good" to know the procedure that generated your datasets. Schafer (1997) suggests consistency between the imputation model and the analysis model. For example, if the imputation model controlled for family characteristics, your analysis model should not omit those variables.
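For reference, here is a sketch of Rubin's combining rules as they are usually stated. The notation is my own assumption, not taken from the sources above: M is the number of imputed datasets, \hat{Q}_m is the estimate (for example a regression coefficient) from dataset m, and U_m is its squared standard error.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{M}\sum_{m=1}^{M} \hat{Q}_m
  && \text{(combined point estimate)} \\
\bar{U} &= \frac{1}{M}\sum_{m=1}^{M} U_m
  && \text{(within-imputation variance)} \\
B &= \frac{1}{M-1}\sum_{m=1}^{M} \left(\hat{Q}_m - \bar{Q}\right)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \left(1 + \frac{1}{M}\right) B
  && \text{(total variance)} \\
\nu &= (M-1)\left(1 + \frac{\bar{U}}{\left(1 + \frac{1}{M}\right) B}\right)^{2}
  && \text{(degrees of freedom for the $t$ reference distribution)}
\end{align*}
```

The combined standard error is the square root of T. With the five plausible values in NAEP, M would be 5.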

About code: The -mim- command is simple to use, but I think it requires all the data to be in one dataset. For huge datasets it is better to run the procedure on each dataset and combine the results afterwards. I wrote a small program to run any kind of regression in that way. You will note that it is easy to implement Rubin's rules; you could get some of the results by using loops over the datasets!
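To show how little arithmetic the combining step needs, here is a minimal sketch in Python rather than Stata. It is not the program mentioned above. The function name combine_rubin and the numbers in the usage lines are made up for illustration; the inputs are assumed to be one coefficient and its squared standard error from the same regression run separately on each imputed dataset.

```python
import math


def combine_rubin(estimates, variances):
    """Combine per-dataset estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors
    (hypothetical helper, written for illustration only)
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two datasets and matching lengths")

    # Combined point estimate: the mean of the per-dataset estimates.
    q_bar = sum(estimates) / m
    # Within-imputation variance: the mean of the squared standard errors.
    u_bar = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance, with the (1 + 1/m) correction for a finite m.
    t = u_bar + (1 + 1 / m) * b

    # Degrees of freedom for the t reference distribution;
    # infinite when the estimates do not vary across datasets.
    if b > 0:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    else:
        df = math.inf

    return {
        "estimate": q_bar,
        "within_var": u_bar,
        "between_var": b,
        "total_var": t,
        "se": math.sqrt(t),
        "df": df,
    }


# Made-up coefficients and squared standard errors from five datasets.
result = combine_rubin([0.52, 0.48, 0.50, 0.55, 0.45], [0.01] * 5)
print(result["estimate"], result["se"], result["df"])
```

The same steps translate directly to a loop in Stata: run the regression on each dataset, store the coefficient and its squared standard error, and apply the formulas after the loop.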

I hope this helps, and that other members can provide more information on the topic.

Rubin, D.B. (1987) Multiple Imputation for Nonresponse in Surveys. J. Wiley & Sons, New York.
Schafer, J.L. (1997) Analysis of Incomplete Multivariate Data. Chapman & Hall, London.

I'm not familiar with NAEP, but a search on 'imputation' may lead you to a Stata solution. The -mim- command prefix looks most promising.

I would be interested in hearing from any list member who has used
Stata for NAEP student-level analyses.

The NAEP (National Assessment of Educational Progress) student data
are collected during a large national assessment conducted about every
two years in the U.S. NAEP items are constructed using IRT methods,
and the item selection is based on a matrix sample. The NAEP data
contain five plausible values for each response.

I found an earlier request from Brian Jacobs
statalist/archive/2003-06/msg00505.html but no substantive replies to
his post. I am hoping that the increased availability of NAEP data
since 2003 may be associated with an increase in the number of
researchers using Stata to analyze these data.

Many thanks.


