From | Richard Goldstein <richgold@ix.netcom.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Interactions and multiple-imputation |
Date | Wed, 23 Mar 2011 21:07:36 -0400 |
take a look at White, IR, Royston, P and Wood, AM (2011), "Multiple
imputation using chained equations: issues and guidance for practice,"
_Statistics in Medicine_, 30: 377-399

Rich

On 3/23/11 8:34 PM, Nic wrote:
> Hello all,
>
> In Alan C. Acock’s “A Gentle Introduction to Stata” (2010:367), it is
> recommended to create interaction terms in the original dataset before
> doing the multiple-imputation stage. That’s how I’ve proceeded thus far,
> but I’m curious if I should in fact be doing so. I'll explain why below.
>
> My survey dataset contains multiple measures of the same construct. For
> example, 5 questions are used to measure the extent of childhood
> physical abuse. In my non-multiply-imputed dataset I have created a
> single "physical abuse" scale that is the average of the 5 component
> variables. I have a small number of cases in which all 5 component
> variables are missing. I have other cases in which the respondent
> answered some but not all of the 5 component questions. For these cases
> it seems as though I should be imputing the missing values for the
> component variables and *then* creating the final scale by averaging the
> complete sets of 5 questions. Otherwise, I will end up with some cases
> in which the scale is completed but is based on averaging less than the
> 5 component questions and will not receive the benefit of imputation.
>
> However, my interaction terms are the products of these types of scales
> (like "physical abuse"). And as I mentioned at the beginning of this
> email, the best advice according to Acock is to create interaction terms
> in the original dataset and then impute the missing interaction terms.
>
> So I cannot do it both ways. I can either:
> 1. Create my interaction terms in the original dataset based on
> component variables which may themselves be comprised of missing values
> and then impute the missing interaction terms.
> or
> 2. Impute missing values in the original dataset with no scales or
> interaction terms created. Then, with the multiply-imputed dataset,
> create scales and then create interaction terms.
>
> Option 2 seems to make more sense to me, but I thought it was a good
> idea to post here before I defy the advice found in Acock's book. I also
> suspect that the proper solution may be more complex than I realise.
>
> With thanks,
> Nic
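
The ordering Nic calls option 2 (impute the component items, then build the scale, then build the interaction) can be sketched in a few lines. This is only an illustration under stated assumptions, not a recommendation between the two options: the data, the `MODERATOR` variable, and all function names are hypothetical, and the "imputation" is a crude random draw from the observed values of each item, standing in for a real chained-equations model. It also shows the pre-imputation scale from the post, which averages however many items were answered.

```python
import random
from statistics import mean

# Hypothetical data: five abuse items per respondent (None = missing)
# plus one fully observed moderator used to build the interaction.
ITEMS = [
    [1, 2, 2, 1, 3],
    [4, None, 5, 4, None],            # some items answered
    [2, 3, None, 2, 2],
    [None, None, None, None, None],   # all five items missing
]
MODERATOR = [0.5, 1.0, 1.5, 2.0]


def available_item_mean(row):
    """Scale built before imputation: averages whatever was answered."""
    answered = [v for v in row if v is not None]
    return mean(answered) if answered else None


def impute_items(rows, m, seed):
    """Return m completed copies of rows.

    Assumption: each missing item is replaced by a random draw from the
    observed values of the same item. This is a placeholder, not a
    chained-equations imputation model.
    """
    rng = random.Random(seed)
    observed = [[r[j] for r in rows if r[j] is not None]
                for j in range(len(rows[0]))]
    return [
        [[v if v is not None else rng.choice(observed[j])
          for j, v in enumerate(r)] for r in rows]
        for _ in range(m)
    ]


def derive(completed_rows, moderator):
    """Option 2: build the scale and the interaction after imputation."""
    out = []
    for row, z in zip(completed_rows, moderator):
        s = mean(row)  # always based on all five items
        out.append({"scale": s, "interaction": s * z})
    return out


completed = impute_items(ITEMS, m=3, seed=2011)
derived = [derive(d, MODERATOR) for d in completed]

for i, d in enumerate(derived, start=1):
    print(f"imputation {i}:", [round(r["scale"], 2) for r in d])
print("available-item means:", [available_item_mean(r) for r in ITEMS])
```

In this sketch the fully observed respondent gets the same scale in every completed dataset, while the respondents with missing items get a scale that varies across imputations. The analysis model would then be fitted to each completed dataset and the estimates combined in the usual multiple-imputation way.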