
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.



st: st: Using MVN for Multiple Missing Ordinal Variables


From   "Clifton Chow" <clifton_chow@post.harvard.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: st: Using MVN for Multiple Missing Ordinal Variables
Date   Sun, 08 May 2011 15:45:32 -0500

Thanks to Nick, Mosi & Daniel for your responses:

I should have clarified that the ordinal data come from a battery of health-related surveys administered at 9 sites to approximately 150 individuals per site. They all ask respondents to rate various conditions on Likert scales (thanks for the reminder on the origin of the term), some of which range from "strongly disagree" to "strongly agree," others from "Never" to "Always."

My goal is to impute the missing values on these items in order to feed them into a program that calculates mean-difference effect sizes for a meta-analysis, to determine the difference (if any) between randomly assigned treatment groups and controls. I have pored over Allison and have examined the pattern of missingness as best I could. Unfortunately, there is low correlation between missingness and the demographic or clinical variables, and very little nesting among the variables with missing values. I had considered a univariate approach through -mi impute ologit-, but that is potentially 450 models across 9 sets of instruments, and with little or no correlation with the demographic/clinical variables, it does not seem very useful. I am hardly an expert on MI, so I welcome your suggestions, and I will take another look at Allison's discussion of imputing categorical variables.

Thanks also for the user-written -mi mvncat- .ado.

---------------------------------------------------------------------------------------------------------------------------------------------------

I would reverse this question. On what grounds could multivariate
normal possibly seem right for ordinal data? Why are you considering
such distributions at all? Presumably you have some information on the
distributions of your ordinal variables. If they seem like
integer-rounded variants of normals, then MVN might be the best thing
you can come up with, but it hardly seems appropriate or attractive
otherwise.

Likert scales are named for Rensis Likert, so capitalisation is recommended.

Nick

--------------------------------------------------------------------------------------------------------------------------------------------

First of all, I am anything but an expert on multiple imputation,
although I am interested in it. So I hope there will also be answers
from more experienced people.

As far as I know, MVN is designed for continuous variables. However,
Allison (2002:38-40) describes an ad hoc solution for imputing
categorical variables. I have an .ado file that assigns "final values"
to such imputed variables (-findit mi mvncat-).

I can think of other possibilities however.

If your variables are not "strictly" categorical (as race is in
nlsw88.dta, -sysuse nlsw88-), you might want to consider treating them
as if they were continuous. For example, a variable with levels from
1 "strongly disagree" to 7 "strongly agree" might well be treated as
continuous (as you would do when using it in a regression), in which
case -mi impute mvn- is fine without any "ad hoc" corrections afterward.
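
To make the "treat them as continuous" option concrete, here is a minimal sketch using official -mi- commands. The variable names (q1-q3 for the Likert items, age for a complete covariate), the number of imputations, and the seed are all hypothetical placeholders, not anything taken from the data described in this thread.

```stata
* Hypothetical names: q1-q3 are 7-point items with missing values,
* age is a fully observed covariate. Adjust to your own data.
mi set wide
mi register imputed q1 q2 q3
mi register regular age

* Impute the items jointly as if they were continuous (multivariate normal);
* add(20) creates 20 imputations, rseed() makes the run reproducible.
mi impute mvn q1 q2 q3 = age, add(20) rseed(12345)
```

Imputed values from this approach will generally not be integers; whether to round them or leave them as they are is the "ad hoc" question discussed above.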

You might also want to use Royston's (e.g. 2005) -ice- (-findit ice-,
also see -help mi ice-) to impute missing values using chained
equations. This allows you to impute missing values using ordinal
regression models. From what I understand, chained equations perform
pretty well in simulations, but they lack a firm theoretical
foundation.
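
A rough sketch of what a chained-equations call might look like with the user-written -ice- follows. The variable names are hypothetical, and the option names (cmd(), m(), seed(), saving()) are as I understand them from the -ice- help file; they may differ across versions, so please check -help ice- for the version you have installed.

```stata
* Hypothetical names: q1-q3 are ordinal items with missing values,
* age is a complete covariate. -ice- is user-written (-findit ice-).
* cmd() requests ordered logit for the listed items;
* m(20) requests 20 imputations, stored in the file named in saving().
ice q1 q2 q3 age, cmd(q1 q2 q3: ologit) m(20) seed(12345) saving(imputed, replace)
```

Because each item gets its own ordered logit model, the imputed values stay within the observed categories, so no rounding step is needed afterward.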

I did not fully understand what kind of descriptive statistics you
want to calculate and report. Concerning "descriptive" statistics in
general, as far as I know, there is no need to adjust the variance. If
you are only interested in point estimates (not their standard errors,
and therefore not in statistical inference), you just average the
respective point estimates across the imputed datasets. This is what
Yulia Marchenko's -mibeta- (-findit mibeta-) does, reporting R-squared
measures for -mi estimate: regress-
(see http://www.stata.com/support/faqs/stat/mi_combine.html).
I have an .ado that reports summary statistics for the dataset by
combining results from -summarize- (-findit misum-), where I do not
adjust the (sample) variance.
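
Written out, the averaging rule is the point-estimate part of Rubin's combining rules. The notation below is introduced here for illustration only: M is the number of imputations, and \hat{Q}_m is the statistic (a mean, say) computed in the m-th imputed dataset.

```latex
% Combined point estimate: the simple average over the M imputed datasets
\bar{Q} = \frac{1}{M} \sum_{m=1}^{M} \hat{Q}_m

% Needed only for inference (standard errors, tests), not for description:
% total variance = within-imputation variance + inflated between-imputation variance
T = \bar{U} + \left(1 + \frac{1}{M}\right) B,
\qquad
\bar{U} = \frac{1}{M} \sum_{m=1}^{M} U_m,
\qquad
B = \frac{1}{M-1} \sum_{m=1}^{M} \left(\hat{Q}_m - \bar{Q}\right)^2
```

Here U_m denotes the estimated sampling variance of \hat{Q}_m in the m-th dataset. If only \bar{Q} is reported, the second line is not used.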

Best
Daniel

References

Allison, Paul D. (2002) Missing Data. Thousand Oaks, CA:  Sage Publications.

Royston, P. (2005) Multiple imputation of missing values: update.
Stata Journal 5 (2), 188-201.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*

