



RE: st: Calculation of covariance matrix for unbalanced sample?


From   Cameron McIntosh <cnm100@hotmail.com>
To   STATA LIST <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Calculation of covariance matrix for unbalanced sample?
Date   Thu, 3 Nov 2011 09:19:20 -0400

I guess I would have been disappointed in Nick if my one-liner sans source material hadn't received that comment :) I don't know anything about the precise nature of the missingness in this case, but I might suggest one paper on an EM-ridge regression approach (I don't know of a Stata implementation, just MATLAB), which *may* provide a more reliable end product:
Cole, S.R., Platt, R.W., Schisterman, E.F., Chu, H., Westreich, D., Richardson, D., & Poole, C. (2010). Illustrating bias due to conditioning on a collider. International Journal of Epidemiology, 39(2), 417-420. http://ije.oxfordjournals.org/content/39/2/417.full.pdf+html
http://www.gps.caltech.edu/~tapio/imputation/
I think one might also use mi:
http://www.stata.com/stata11/mi.html
or chained equations:
Royston, P. (2009). Multiple imputation of missing values: Further update of ice, with an emphasis on categorical variables. The Stata Journal 9(3), 466–477. http://ideas.repec.org/c/boc/bocode/s446602.html
http://www.stata-journal.com/article.html?article=st0067_4
White, I.R., Royston, P., & Wood, A.M. (2011). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377–399.
I would be curious to see the differences in the finished product between these and the "unbalanced" suggestion. 
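For what it's worth, here is a rough sketch of what the -mi- route might look like. It is only an illustration under assumptions that are not from the thread: Stata 11 or later, the auto data (where only rep78 has missing values), a univariate linear-regression imputation model, and arbitrary add() and rseed() values.

```stata
* Sketch only: auto data, where rep78 is the only variable with missing values
sysuse auto, clear

* Declare the mi style and the variable to be imputed
mi set mlong
mi register imputed rep78

* Impute rep78 from the complete variables; 5 imputations, arbitrary seed
mi impute regress rep78 weight price mpg, add(5) rseed(2011)

* Covariance matrix computed within the first imputed data set
mi xeq 1: correlate weight price mpg rep78, covariance
```

Repeating the last command over the other imputations would give a feel for how much the covariances move from one imputed data set to the next.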
Cam

----------------------------------------
> From: n.j.cox@durham.ac.uk
> To: statalist@hsphsun2.harvard.edu
> Date: Thu, 3 Nov 2011 12:50:48 +0000
> Subject: RE: st: Calculation of covariance matrix for unbalanced sample?
>
> I don't think it's anything formal. I'd just say "unbalanced".
>
> I suppose that I should just add that I know Stephen Jenkins very well and that I know he won't need a miniature warning on the fact that such a covariance matrix is a dodgy beast unlikely to be fit for further analysis and with unreliable eigenproperties and that he's fully capable of explaining that to his colleague, should the colleague need such a warning.
>
> Where's the reading list? I was fully expecting a dozen references.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Cameron McIntosh
>
> Nick, Stas
> Just curious. What's the estimation method being applied below: EM, FIML, MI...?
>
> From: n.j.cox@durham.ac.uk
>
> > -makematrix- (SJ) can do this. But it's better to use Stas' custom code, which is more direct.
>
> Stas Kolenikov
>
> > I don't think there's any. I vaguely remember a discussion some time
> > back on the list about this. Here's the basic outline from scratch:
> >
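> > program define pwcovmat, rclass
> >     syntax varlist
> >     unab vars : `varlist'
> >     local p : word count `vars'
> >     tempname Cov
> >     matrix `Cov' = J(`p',`p',.)
> >     matrix rownames `Cov' = `vars'
> >     matrix colnames `Cov' = `vars'
> >     forvalues i=1/`p' {
> >         forvalues j=`i'/`p' {
> >             local x : word `i' of `vars'
> >             local y : word `j' of `vars'
> >             if `i' == `j' {
> >                 // diagonal: variance from all nonmissing cases of x
> >                 quietly summarize `x'
> >                 matrix `Cov'[`i',`i'] = r(Var)
> >             }
> >             else {
> >                 // off-diagonal: covariance from cases observed on both x and y
> >                 quietly corr `x' `y', cov
> >                 matrix `Cov'[`i',`j'] = r(cov_12)
> >                 matrix `Cov'[`j',`i'] = r(cov_12)
> >             }
> >         }
> >     }
> >     return matrix Cov = `Cov'
> > end // of pwcovmat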
> > sysuse auto
> > corr weight price mpg, cov
> > corr weight price mpg rep, cov
> > pwcovmat weight price mpg rep
> > matrix list r(Cov)
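Nick's warning about unreliable eigenproperties can be checked directly. The sketch below is only an illustration: the 3 x 3 matrix A is made up for the purpose (it is not from any data set), and it shows the kind of internally inconsistent pattern that can arise when each entry comes from a different subsample. The same check could be applied to a pairwise matrix after copying it with something like -matrix A = r(Cov)-.

```stata
* Made-up symmetric matrix with unit diagonal (illustrative values only)
matrix A = (1, .9, -.9 \ .9, 1, .9 \ -.9, .9, 1)

* Eigenvectors go to X; eigenvalues go to the row vector L, largest first
matrix symeigen X L = A
matrix list L

* A negative smallest eigenvalue means A is not positive semidefinite,
* so it cannot be the covariance matrix of any complete data set
scalar min_ev = L[1, colsof(L)]
display "smallest eigenvalue = " min_ev
```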
>
> On Thu, Nov 3, 2011 at 6:00 AM, <S.Jenkins@lse.ac.uk> wrote:
>
> > > A colleague has data on a relatively large number of variables. His
> > > sample is unbalanced in the sense that each variable has some missing
> > > values. He wishes to calculate the covariance matrix for his data but
> > > without the listwise deletion of cases that is imposed by -correlate,
> > > covariance- or -matrix accum-.
> > >
> > > My first thought was that he could use -pwcorr- and loop over his
> > > variables, and build up his matrix from the saved results. But I thought
> > > there must be an easier or more straightforward way -- but Googling and
> > > -findit- have not suggested any.  I guess there is a relatively easy
> > > Mata solution, but I am currently unfamiliar with that route.
> > >
> > > Suggestions using Stata or Mata please
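On the Mata route mentioned above, a minimal sketch might look like the following. The function name pwcov and the use of the auto data are assumptions made for illustration; nothing here is from the thread. Each entry is computed from the rows that are nonmissing on that pair of columns, so different entries can rest on different subsamples.

```stata
sysuse auto, clear

mata:
// pwcov is a hypothetical helper name, used here for illustration only
real matrix pwcov(real matrix X)
{
    real scalar p, i, j
    real matrix C, XY, V

    p = cols(X)
    C = J(p, p, .)
    for (i = 1; i <= p; i++) {
        for (j = i; j <= p; j++) {
            // keep only the rows observed on both column i and column j
            XY = X[., (i, j)]
            XY = select(XY, rowmissing(XY) :== 0)
            if (rows(XY) > 1) {
                // 2 x 2 variance matrix of the pair; [1,2] is the covariance
                V = variance(XY)
                C[i, j] = V[1, 2]
                C[j, i] = V[1, 2]
            }
        }
    }
    return(C)
}

C = pwcov(st_data(., "weight price mpg rep78"))
C
end
```

When i equals j, the two selected columns are identical, so the [1,2] element is simply the variance of that variable over its own nonmissing cases.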
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
 		 	   		  

