



st: MI and z-standardisation


From   "J.B. Kirkbride" <jbk25@cam.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   st: MI and z-standardisation
Date   14 Apr 2011 12:53:38 +0100

Dear Stata Users

I would appreciate your guidance on the following topic regarding multiple imputation (MI) and z-standardisation. I am currently learning MI using the excellent Stata help resources, but I have run into an issue for which I can't find much guidance.
I have a small dataset of 54 subjects, 4 of whom have missing data on a
variable that measures social capital in their neighbourhood; let's call
this variable "sc". It is a continuous variable with an approximately normal
distribution. I wish to use this variable as a predictor in the substantive
analysis (eventually, a Cox regression), using MI to estimate the missing
values. I would like to include it as a z-standardised variable with a mean
of 0 and sd of 1, to make the parameter estimates more interpretable.
I have followed the MI commands and can obtain MI estimates for sc. My 
question is as follows:
I am unclear how/when/if to perform z-transformation on the multiply 
imputed data. I have considered two options:
1. Prior to MI, generate "zsc" using the "egen zsc=std(sc)" command and 
then run the appropriate MI commands, including "mi impute" on "zsc" to 
obtain direct estimates of the missing zsc values under an MI scenario.
2. Estimate missing values of "sc" using "mi impute" and then transform the
variable after imputation using the command "mi passive: egen zsc=std(sc)".
(As an aside, I am assuming here that this is the correct way to specify
"zsc", as it is a function of "sc"; your input would be welcome.)
Either way, when I check the summary distribution of zsc in individual
imputations ("mi xeq 0 1 20: summ zsc"), I do not get back exactly a mean of
0 and sd of 1. This is to be expected, as the imputed values are just that,
though the summaries for each imputation are reasonably close to these
values (i.e. mean ~ -.03, sd ~ .99).
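
For concreteness, the two options could be compared in a do-file roughly as follows. This is only a sketch on simulated data, not my actual analysis: "x1" is a hypothetical auxiliary predictor, the simulated "sc" merely stands in for the real variable, and the flong style, the seed, the 20 imputations and the names "zsc1" and "zsc2" are all assumptions made for illustration.

```stata
* Simulated stand-in data (assumption): 54 subjects, 4 with missing sc
clear
set seed 2011
set obs 54
generate x1 = rnormal()               // hypothetical auxiliary predictor
generate sc = 10 + 2*x1 + rnormal()   // simulated "social capital" score
replace sc = . in 1/4                 // make 4 of the 54 values missing

* Option 1: standardise the observed values first, then impute zsc1 directly
egen zsc1 = std(sc)

mi set flong
mi register imputed sc zsc1
mi impute regress zsc1 x1, add(20) rseed(2011)

* Option 2: impute sc itself (filling the imputations created above),
* then derive the z-score as a passive variable
mi impute regress sc x1, replace rseed(2011)
mi passive: egen zsc2 = std(sc)

* Compare both versions in the original data (m=0) and in m=1 and m=20
mi xeq 0 1 20: summarize zsc1 zsc2
```

The last command shows, for both versions side by side, how far the mean and sd in each completed dataset are from 0 and 1.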
So my questions are really:

A. Can I still use the zsc variable in my substantive analysis and make the assumption it still has a mean of 0 / sd of 1?
B. Is either method (1 vs 2) preferable?

C. Is there another, preferable, way of achieving z-standardisation before/after MI?
D. Should I be using z-standardisations at all with MI? 

Many thanks in advance for your help with this matter. 

Best wishes 

James 

