
# st: MI and z-standardisation

| From | To | Subject | Date |
|---|---|---|---|
| "J.B. Kirkbride" | statalist@hsphsun2.harvard.edu | st: MI and z-standardisation | 14 Apr 2011 12:53:38 +0100 |

Dear Stata Users,

I would appreciate your guidance on the following topic regarding multiple imputation (MI) and z-standardisation. I am currently learning MI using the excellent Stata help resources, but have an issue I can't find much support for.

I have a small dataset of 54 subjects, 4 of whom have missing data on a variable that measures social capital in their neighbourhood; let's call this variable "sc". It is a continuous variable with an approximately normal distribution. I wish to use this variable as a predictor in the substantive analysis (eventually, a Cox regression), using MI to estimate the missing values. The best way to include it in such an analysis is as a z-standardised variable with a mean of 0 and an sd of 1, to make the parameter estimates more interpretable.

I have followed the MI commands and can obtain MI estimates for sc. My question is as follows:

I am unclear how, when, or whether to perform the z-transformation on the multiply imputed data. I have considered two options:

1. Prior to MI, generate "zsc" using the "egen zsc=std(sc)" command and then run the appropriate MI commands, including "mi impute" on "zsc", to obtain direct estimates of the missing zsc values under an MI scenario.

2. Estimate the missing values of "sc" using "mi impute" and then transform the variable after imputation using the command "mi passive: egen zsc=std(sc)". (As an aside, I am assuming here that this is the correct way to specify "zsc", as it is a function of "sc"; your input would be welcome.)

Either way, when I check the summary distribution of zsc for the mth imputation ("mi xeq 0 1 20: summ zsc"), I do not quite get back a zsc variable with a mean of 0 and an sd of 1. This is to be expected, as the imputed values are just that, though the summaries for each imputation are reasonably close to these values (i.e. mean ~ -.03, sd ~ .99).
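
For reference, the two options and the check described above can be sketched as a do-file. This is only a minimal sketch on simulated data, not the actual dataset: the auxiliary predictor `x1`, the simulated values, the seed, the `mlong` style, the use of `mi impute regress`, and the choice of 20 imputations (suggested by the "mi xeq 0 1 20" check) are all assumptions made for illustration.

```stata
* Simulated stand-in data: 54 subjects, 4 with missing sc (values are made up)
clear
set seed 2011
set obs 54
* x1 is a hypothetical auxiliary predictor for the imputation model
generate x1 = rnormal()
generate sc = 10 + 2*x1 + rnormal()
replace sc = . in 1/4

* Option 1: standardise the observed values first, then impute zsc directly
preserve
egen zsc = std(sc)
mi set mlong
mi register imputed zsc
mi impute regress zsc x1, add(20) rseed(2011)
* Summarise zsc in the original data (m=0) and in imputations 1 and 20
mi xeq 0 1 20: summarize zsc
restore

* Option 2: impute sc, then derive zsc as a passive variable
mi set mlong
mi register imputed sc
mi impute regress sc x1, add(20) rseed(2011)
mi passive: egen zsc = std(sc)
mi xeq 0 1 20: summarize zsc
```

The `preserve`/`restore` pair is there only so that both options can be run from the same starting data in one do-file. Note that under option 1, `egen zsc = std(sc)` is computed from the 50 observed values only, before any imputation takes place.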

So my questions are really:

A. Can I still use the zsc variable in my substantive analysis and make the assumption it still has a mean of 0 / sd of 1?

B. Is either method (1 vs 2) preferable?

C. Is there another, preferable, way of achieving z-standardisation before/after MI?

D. Should I be using z-standardisations at all with MI?

Many thanks in advance for your help with this matter.

Best wishes,
James