

From: weddings@stata.com (Wesley D. Eddings, StataCorp)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: A correlation matrix after multiple imputation
Date: Mon, 26 Jul 2010 17:14:05 -0500

On Friday 23 July 2010, Alan Acock <acock@mac.com> asked if there is an easy
way to obtain a multiple-imputation estimate of a correlation matrix in Stata:

> Is there an easy way to obtain the 20 correlation matrices, one for each of
> the 20 imputed datasets and then somehow pulling these?

There is no automatic way of doing this, but with some programming effort,
Alan can use -mi estimate- to obtain an MI estimate of the correlation matrix;
see the code at the end of this post.

However, from a statistical standpoint, it is not clear whether averaging
completed-data sample estimates of the correlation matrix across imputed data
is the best approach to account for missing data when computing a correlation
matrix.

One alternative is to consider reporting an EM estimate of the covariance (or
correlation) matrix adjusted for missing data. Such an estimate can be
obtained from -mi impute mvn-, because -mi impute mvn- uses the EM algorithm
to get starting values of the parameters for the MCMC procedure. The EM
estimates of the coefficients and the variance-covariance matrix are saved
after -mi impute mvn- in the -r(Beta_em)- and -r(Sigma_em)- matrices,
respectively. To obtain EM estimates only, without producing imputations,
specify the -emonly- option with -mi impute mvn-; see the example below.

-- Wes
-- Yulia
weddings@stata.com
ymarchenko@stata.com

=================== EXAMPLES ================================================

Here is how you can obtain an EM estimate of the correlation matrix accounting
for missing data:

/****************** begin do file ******************/
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, emonly
mat Sigma = r(Sigma_em)  /* save EM estimate of the variance-covariance (VC) matrix */
_getcovcorr Sigma, corr shape(full)  /* convert VC to a correlation matrix */
mat C = r(C)
matlist C
/*************** end do file *****************/

Here is how you can obtain an MI estimate of the correlation matrix:

/***** begin MI correlation ******************/
cap program drop ecorr
program ecorr, eclass
        version 11
        syntax [varlist] [if] [in] [aw fw] [, * ]
        if (`"`weight'"'!="") {
                local wgt `weight'`exp'
        }
        marksample touse
        correlate `varlist' `if' `in' `wgt', `options'
        tempname b V
        mata: st_matrix("`b'", vech(st_matrix("r(C)"))')
        local p = colsof(`b')
        mat `V' = J(`p',`p',0)
        local cols: colnames `b'
        mat rownames `V' = `cols'
        eret post `b' `V' [`wgt'] , obs(`=r(N)') esample(`touse')
        eret local cmd ecorr
        eret local title "Lower-diagonal correlation matrix"
        eret local vars "`varlist'"
end

cap program drop micorr
program micorr, rclass
        tempname esthold
        _estimates hold `esthold', nullok restore
        qui mi estimate, cmdok: ecorr `0'
        tempname C_mi
        mata: st_matrix("`C_mi'", invvech(st_matrix("e(b_mi)")'))
        mat colnames `C_mi' = `e(vars)'
        mat rownames `C_mi' = `e(vars)'
        di
        di as txt "Multiple-imputation estimate of the correlation matrix"
        di as txt "(obs=" string(e(N_mi),"%9.0g") ")"
        matlist `C_mi'
        return clear
        ret matrix C_mi = `C_mi'
end

sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, add(20)
micorr mpg weight
/***** end MI correlation ********************/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
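
The MI point estimate in the example above is, under Rubin's combination rules, the simple average of the completed-data estimates. As a rough cross-check, that average can be sketched by hand. This is only an illustration, not part of the original post: it assumes the wide mi style used above, in which imputation m of mpg is stored in the variable _m_mpg, and the matrix names Csum and Cavg are made up for this sketch. weight has no missing values here, so the original variable is used directly.

```stata
* Sketch: average the 20 completed-data correlation matrices by hand.
* Assumes wide-style mi data, where imputation m of mpg is held in _m_mpg.
sysuse auto, clear
set seed 12345
replace mpg = . if runiform()>0.9
mi set wide
mi register imputed mpg weight
mi impute mvn mpg weight, add(20)

matrix Csum = J(2, 2, 0)                 // running sum of correlation matrices
forvalues m = 1/20 {
        quietly correlate _`m'_mpg weight    // completed-data correlation, imputation m
        matrix Csum = Csum + r(C)
}
matrix Cavg = Csum/20                    // average across the 20 imputations
matrix rownames Cavg = mpg weight
matrix colnames Cavg = mpg weight
matlist Cavg
```

If the sketch and -micorr- are run on the same imputations, Cavg is intended to reproduce the matrix that -micorr- returns in r(C_mi), which makes it a convenient sanity check on the wrapper programs.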
