Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

RE: Re: st: RE: Selecting correlations with highest absolute value

From: Joe Canner <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: Re: st: RE: Selecting correlations with highest absolute value
Date: Wed, 9 Oct 2013 23:40:20 +0000

```
Dara,

Red Owl beat me to the answer I was going to give.  If you have a good reason to use -pwcorr- instead of -corr-, then you might need something more complicated in which you loop over all your variables, accumulating pairwise correlations.

foreach x of varlist tbmale...etc {
foreach y of varlist tbmale...etc {
corr `x' `y'
matrix corrvector = nullmat(corrvector) \ vec(r(C))
}
}
matvsort corrvector sortedvector
matrix list sortedvector

I don't have the ability to test this at the moment and I don't have matrix syntax memorized, so this might need some tweaking, particularly the matrix command inside the loops.  I'm also not sure if you will need to initialize -corrvector- before starting the loops.  Let us know if you have any problems and I'm sure someone can help.

Joe
________________________________________
From: [email protected] [[email protected]] on behalf of Dara Shifrer [[email protected]]
Sent: Wednesday, October 09, 2013 7:04 PM
To: [email protected]
Subject: Fwd: Re: st: RE: Selecting correlations with highest absolute value

Joe, thank you very much for your quick response to my quest to find the
most highly correlated pairs of variables. I  think I understand what
your code does (finds correlations, linearly transforms the correlation
matrix into a column vector, sorts this matrix, and then lists the
sorted columns of correlations) but I'm not sure why it isn't working
for me (see code below). I haven't used Stata's matrix commands before
and may be missing something obvious. Thanks for any additional help
anyone can provide! Dara

pwcorr tbmale tdedc3 tbrace td9tchr td9slry tb9yrsh tb9yrsnh td10tchr td10slry ///
tb10yrsh tb10yrsnh td11tchr td11slry tb11yrsh tb11yrsnh ///
tp10pswm ta10a2w skd10size skd10blck skd10hisp skd10pvty skd10lep ///
skd10biesl skd10gt skd10sped skd11size skd11blck skd11hisp skd11pvty skd11lep ///
skd11biesl skd11gt skd11sped skd12size skd12blck skd12hisp skd12pvty skd12lep ///
skd12biesl skd12gt skd12sped ta11elgb5 ta11ctgr ta11grd ta11chrt ta11sclvl ///
ta11a1rg ta11a2rg ta11a2lrg ta11a2mrg ta11a2m9rg ta11a2m10rg ta11a2m11rg ///
ta11a2rrg ta11a2r9rg ta11a2r10rg ta11a2r11rg ta11a2srg ta11a2s10rg ta11a2s11rg ///
ta11a2ssrg ta11a2ss10rg ta11a2ss11rg ///
ta11a3rg ta11a3arg ta11a3arrg ta11a3amrg ta11a3aparg ta11a3aperg ta11a3brg ta11a3crg ///
trt12rtn tp12pswm tka12tme tka12tmebl tka12tms tka12tmsbl tka12tre tka12trebl tka12trs ///
tka12trsbl tka12talg1 tka12talg1bl tka12tbio tka12tbiobl tka12te1r ///
tka12te1rbl tka12te1w tka12te1wbl tka12twgeo tka12twgeobl ///
tka12smegn tka12smsgn tka12sregn tka12srsgn tka12slegn tka12slsgn ///
tka12ssegn tka12shegn tka12shsgn

.... lots of correlations excluded...

             | tka~megn tka~msgn tka~regn tka~rsgn tka~legn tka~lsgn tka~segn
-------------+---------------------------------------------------------------
  tka12smegn |   1.0000
  tka12smsgn |   0.1390   1.0000
  tka12sregn |   0.6082   0.1509   1.0000
  tka12srsgn |   0.1211   0.5660   0.1929   1.0000
  tka12slegn |   0.5454  -0.0638   0.5637   0.1009   1.0000
  tka12slsgn |   0.2572   0.5671   0.2427   0.5295   0.2006   1.0000
  tka12ssegn |   0.4479  -0.1376   0.3819  -0.1273   0.4028  -0.1095   1.0000
  tka12shegn |   0.4143   0.0340   0.4330  -0.2011   0.4543  -0.2584   0.5530
  tka12shsgn |   0.5705   0.4077   0.3127   0.6170   0.2309   0.4094   0.2407

             | tka~hegn tka~hsgn
-------------+------------------
  tka12shegn |   1.0000
  tka12shsgn |   0.0918   1.0000

. matrix corrvector=vec(r(C))

. matvsort corrvector sortedvector

. matrix list sortedvector

sortedvector[4,1]
c1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1

Postdoctoral Fellow, Houston Education Research Consortium
Kinder Institute for Urban Research
Rice University
[email protected]

On 10/8/2013 1:39 PM, Joe Canner wrote:
> Dara,
>
> Here's one quick-n-dirty possibility. (It requires installing -matvsort- from SSC.)
>
> . corr varlist
> . matrix corrvector=vec(r(C))
> . matvsort corrvector sortedvector
> . matrix list sortedvector
>
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
>
>
> -----Original Message-----
> From:[email protected]  [mailto:[email protected]] On Behalf Of Dara Shifrer
> Sent: Tuesday, October 08, 2013 3:16 PM
> To:[email protected]
> Subject: st: Selecting correlations with highest absolute value
>
>
> In SAS, I was able to quickly determine which pairs of variables were
> most highly correlated using the 'best' option with the 'proc corr'
> command ("BEST=n prints n correlation coefficients for
> each variable. Correlations are ordered from highest to lowest in
> absolute value."). After extensive searching, I have not been able to
> locate a Stata command that does something similar.
>
> If this is not possible in Stata, maybe Stata experts have suggestions
> for my ultimate purpose: constructing equations to facilitate a smoother
> and faster running of Stata's 'ice' command.
>
> Any help would be greatly appreciated,
> Dara Shifrer
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
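The looping idea in Joe's reply (one -corr- call per pair, so each correlation uses all observations available for that pair, as -pwcorr- does) can also be sketched without -matvsort- by posting each result to a dataset and sorting it. This is only an illustration under stated assumptions: the `auto` dataset and the five variables in `vars` are stand-ins for the real variable list, the names `var1`, `var2`, `rho`, and `absrho` are made up for the example, and each pair is visited once (i < j) instead of looping over every x and y.

```stata
* Sketch: rank variable pairs by absolute correlation using base Stata only.
* ASSUMPTION: auto.dta and this short varlist are placeholders for real data.
sysuse auto, clear
local vars price mpg weight length turn

* Results go to a temporary dataset: one row per pair of variables.
tempname memhold
tempfile pairs
postfile `memhold' str32 var1 str32 var2 double rho using `pairs'

local n : word count `vars'
forvalues i = 1/`=`n'-1' {
    local x : word `i' of `vars'
    forvalues j = `=`i'+1'/`n' {
        local y : word `j' of `vars'
        * With two variables, -corr- drops only cases missing on x or y.
        quietly corr `x' `y'
        post `memhold' ("`x'") ("`y'") (r(rho))
    }
}
postclose `memhold'

* Load the pairs and sort from largest to smallest absolute correlation.
use `pairs', clear
generate double absrho = abs(rho)
gsort -absrho
list var1 var2 rho in 1/5
```

Keeping the pairs in a dataset rather than a matrix means the variable names travel with each correlation, and the usual -gsort-, -list-, and -keep if- tools apply. Note that `use` replaces the data in memory, so wrap the block in -preserve- / -restore- if the original data are still needed afterwards.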