# Re: st: RE: RE: summarize correlation matrix

 From Nick Cox
To statalist@hsphsun2.harvard.edu
Subject Re: st: RE: RE: summarize correlation matrix
Date Sat, 19 May 2012 07:03:29 +0100

```
Here's how to do it with -corrci- once installed from SJ:

sysuse auto, clear
ds, has(type numeric)
corrci `r(varlist)', saving(easierthisway)
use easierthisway, clear
su corr, detail

Nick

On Fri, May 18, 2012 at 6:59 PM, Cohen, Elan wrote:

> Thank you all for your wise words and cautious notes.  Unfortunately, I'm just the programmer in this case.
>
> For anyone interested, here's how I converted a correlation matrix into a Stata variable.
>
>
> sysuse auto, clear
> ds, has(type numeric)
> keep `r(varlist)'
> corr
> mat C = r(C)
> clear
> svmat C
> // Keep only the strict lower triangle (blank out the diagonal and upper half)
> forv i=1/`c(k)' {
>        qui replace C`i' = . in 1/`i'
> }
> g i = _n
> reshape long C, i(i) j(junk)
> drop i junk
> drop if mi(C)
> su C, d

Nick Cox

> -corrci- (SJ) makes this easy. -corrci- was written primarily to support confidence interval calculation for correlations but has an option to save a correlation matrix to a dataset. The analyses allowed by -corrci- raise the question of whether you should be summarizing correlations on a z scale, not an r scale.
>
> See
>
> SJ-10-4 pr0041_1  . . . . . . . . . . . . . . . . . Software update for corrci
>        (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
>        Q4/10   SJ 10(4):691
>        update to fix corrci so that it always saves r-class results
>
> SJ-8-3  pr0041  .  Speaking Stata: Corr. with confidence, Fisher's z revisited
>        (help corrci, corrcii if installed) . . . . . . . . . . . .  N. J. Cox
>        Q3/08   SJ 8(3):413--439
>        reviews Fisher's z transformation and its inverse, the
>        hyperbolic tangent, and reviews their use in inference
>        with correlations
>
> The suggestion is that you read the paper in SJ 8-3 (which is accessible at http://www.stata-journal.com/sjpdf.html?articlenum=pr0041)
>
> but download the updated software from the files for SJ 10-4.
>
> All that said, I am not sure quite what useful information some of these numbers give. I can think of problems in which showing the overall distribution of the correlations is of some descriptive value, which is precisely why I provided that option, but the standard deviation? Also, if correlations are generally weak, then either you are summarizing noise or you are asking the wrong questions about relationships!

Cohen, Elan

> I have a large correlation matrix after running -corr- on my dataset.  I'd now like to summarize that matrix, i.e. get the mean and sd and plot a histogram of the p*(p-1)/2 correlation values (where p is the number of variables).  I'm not quite savvy enough in Mata, and it's proving difficult in standard Stata programming.  I'd greatly appreciate an answer using either.

```
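
The thread raises the question of whether correlations should be summarized on a z scale rather than an r scale, and the original request also asked for a histogram. The following is a minimal sketch of one way that might look, not code from either poster. It assumes -corrci- is installed from the SJ 10-4 files and that the saved dataset holds the correlations in a variable named corr, as in the -corrci- example above. The file name corrs_z and the variable name z are hypothetical.

```stata
* Sketch only: assumes -corrci- is installed and that its saved dataset
* contains a variable named corr (as used in the thread).
* The file name corrs_z and the variable z are hypothetical.
sysuse auto, clear
ds, has(type numeric)
corrci `r(varlist)', saving(corrs_z)
use corrs_z, clear

* Distribution of the pairwise correlations on the r scale
histogram corr

* Fisher's z transformation: z = atanh(r)
generate double z = atanh(corr)
summarize z, detail

* Back-transform the mean of z to the r scale with tanh()
display "mean correlation via z scale = " tanh(r(mean))
```

Averaging on the z scale and back-transforming will generally give a slightly different number from the plain mean of corr; which summary is more useful depends on the problem, as the caution about summarizing correlations at all still applies.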