# Re: st: RE: correlate by group and collapse

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: RE: correlate by group and collapse
Date: Fri, 2 Dec 2011 06:52:56 +0000

The syntax should be corr(var1 var2) and otherwise as stated.
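
A minimal sketch of the corrected call, assuming the `_gcorr.ado` file quoted below has been saved somewhere on the adopath and that the data contain `group`, `var1` and `var2` as in the original question (`corr12` and `cov12` are illustrative names only):

```stata
* Sketch only: requires the user-written _gcorr.ado quoted below.
* The egen function name goes before the parenthesized variable pair.
bysort group: egen corr12 = corr(var1 var2)

* Same call with the covariance option.
bysort group: egen cov12 = corr(var1 var2), covariance

* Both variables are constant within group, so collapse's default mean
* keeps one unchanged value per group.
collapse corr12 cov12, by(group)
```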

Nick
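
If installing an extra ado file is not wanted, official Stata's `statsby` may serve the same purpose as the egen-plus-collapse route. The following is a sketch under the assumption of a Stata release that supports the prefix form of `statsby`; `rho12` is a hypothetical variable name:

```stata
* Replaces the data in memory with one observation per group,
* holding the correlation that correlate leaves in r(rho).
statsby rho12=r(rho), by(group) clear: correlate var1 var2
list group rho12
```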

On 2 Dec 2011, at 05:04, Rui Zeng <rzeng@wisc.edu> wrote:


```
Dear Statalisters,

I see the following code and I want to use the _gcorr.ado file to get
the covariance by group. However, in the following syntax there is no
place to input the command, and I do not

The syntax is:


[by varlist:] egen newvar = var1 var2 [if exp] [in exp] [ ,covariance ]




Dear Statalisters,

I want to collapse my dataset by a group variable and retain the
correlation coefficient of two variables.  In other
words, I'd like to be able to do something like:
. collapse (correlation) var1 var2, by(group)
or maybe:
. by group: egen corr12=corr(var1 var2)
. collapse corr12, by(group)

However, collapse doesn't have correlation among its stats (it only
allows a selection of univariate statistics) and egen doesn't have a
corr function.
I know I can do:
. by group: correlate var1 var2
- but I want to save the results and do further analysis on them rather
than just displaying them.

The best I've come up with is (supposing I have 100 groups):
. gen corr12=.
. for num 1/100, noheader: qui correlate var1 var2 if group==X \
qui replace corr12=r(rho) if group==X
. collapse corr12, by(group)

This seems kind of clumsy though, and it took me a while to work out
that I needed _noheader_ and _quietly_ to stop my screen filling with
output. It also becomes quite lengthy if I want several pairwise
correlations. Is there a better way?

I think I'd like egen to have a _corr_ and/or a _cov_ function - I
would have thought it would be of wider interest than the calculation
of U.S. marginal income tax rates, which is already implemented as egen
function mtr! I've checked the extensions to egen in the STB package
_egenodd_ and tried a couple of _findit_'s, but I didn't find anything
suitable.


I've attached below a program to do this with egen.  Save the whole
thing as "_gcorr.ado" (that is, DO NOT separate out the GenCorr part as
a separate file).

The syntax is:

[by varlist:] egen newvar = var1 var2 [if exp] [in exp] [ , covariance ]

The ", covariance" option generates covariances; otherwise it does
correlations.

Nick Winter

**************************************
*! NJGW 10jul2002

*! syntax: [by varlist:] egen newvar = var1 var2 [if exp] [in exp] [ , covariance ]
*! computes correlation (or covariance) between var1 and var2, optionally by: varlist
*!    and stores the result in newvar.
program define _gcorr
    version 7

    * egen passes the storage type, the new variable name and "=",
    * followed by the function arguments
    gettoken type 0 : 0
    gettoken g    0 : 0
    gettoken eqs  0 : 0
    syntax varlist(min=2 max=2) [if] [in] [, BY(string) Covariance ]

    * turn the by() option back into a by prefix for GenCorr
    if `"`by'"'!="" {
        local by `"by `by':"'
    }

    quietly {
        gen `type' `g' = .
        `by' GenCorr `varlist' `if' `in', thevar(`g') `covariance'
    }
    capture label var `g' "Correlation `varlist'"
end

program define GenCorr, byable(recall)
    syntax varlist [if] [in] , thevar(string) [ covariance ]
    marksample touse
    * choose which saved result of correlate to store
    if "`covariance'"=="" {
        local stat "r(rho)"
    }
    else {
        local stat "r(cov_12)"
    }
    cap corr `varlist' if `touse' , `covariance'
    if !_rc {
        qui replace `thevar'=``stat'' if `touse'
    }
end

```