
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: RE: Selecting correlations with highest absolute value


From   Dara Shifrer <Dara.Shifrer@rice.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Selecting correlations with highest absolute value
Date   Thu, 10 Oct 2013 13:19:16 -0600

Joe and Red Owl,

The code is working now that I'm using corr rather than pwcorr (and excluding some variables applicable to too few cases to be included) - thanks very much for your help!!!

Dara

Postdoctoral Fellow, Houston Education Research Consortium
Kinder Institute for Urban Research
Rice University
Dara.Shifrer@rice.edu

On 10/9/2013 5:40 PM, Joe Canner wrote:
Dara,

Red Owl beat me to the answer I was going to give.  If you have a good reason to use -pwcorr- instead of -corr-, then you might need something more complicated in which you loop over all your variables, accumulating pairwise correlations.

foreach x of varlist tbmale...etc {
   foreach y of varlist tbmale...etc {
     corr `x' `y'
     matrix corrvector=corrvector \ vec(r(C))
   }
}
matvsort corrvector sortedvector
matrix list sortedvector

I don't have the ability to test this at the moment and I don't have matrix syntax memorized, so this might need some tweaking, particularly the matrix command inside the loops.  I'm also not sure if you will need to initialize -corrvector- before starting the loops.  Let us know if you have any problems and I'm sure someone can help.
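
A minimal sketch of the pairwise loop described above, not tested against the data in this thread. It assumes Stata's shipped auto dataset as a stand-in, and the names handle, pairs, done, absrho, and n are illustrative only. Running -correlate- on two variables at a time drops only the observations missing on that pair, and r(rho) and r(N) then hold that pair's correlation and sample size.

```stata
* Sketch only: auto.dta and these variables are stand-ins (rep78 has missing values).
sysuse auto, clear
local vars price mpg rep78 weight

* Results file with one row per distinct pair of variables.
tempname handle
tempfile pairs
postfile `handle' str32 var1 str32 var2 double rho long n using `pairs'

* Correlate each variable with the variables already visited,
* so every pair is computed once and no variable is paired with itself.
local done
foreach x of local vars {
    foreach y of local done {
        quietly correlate `x' `y'
        post `handle' ("`x'") ("`y'") (r(rho)) (r(N))
    }
    local done `done' `x'
}
postclose `handle'

* Rank the pairs from largest to smallest absolute correlation.
use `pairs', clear
generate double absrho = abs(rho)
gsort -absrho
list var1 var2 rho n, noobs
```

Keeping n next to each correlation shows how many cases each pairwise estimate rests on.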

Joe
________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Dara Shifrer [Dara.Shifrer@rice.edu]
Sent: Wednesday, October 09, 2013 7:04 PM
To: statalist@hsphsun2.harvard.edu
Subject: Fwd: Re: st: RE: Selecting correlations with highest absolute value

Joe, thank you very much for your quick response to my quest to find the
most highly correlated pairs of variables. I think I understand what
your code does (finds correlations, linearly transforms the correlation
matrix into a column vector, sorts this matrix, and then lists the
sorted columns of correlations) but I'm not sure why it isn't working
for me (see code below). I haven't used Stata's matrix commands before
and may be missing something obvious. Thanks for any additional help
anyone can provide! Dara

pwcorr tbmale tdedc3 tbrace td9tchr td9slry tb9yrsh tb9yrsnh td10tchr td10slry ///
tb10yrsh tb10yrsnh td11tchr td11slry tb11yrsh tb11yrsnh ///
tp10pswm ta10a2w skd10size skd10blck skd10hisp skd10pvty skd10lep ///
skd10biesl skd10gt skd10sped skd11size skd11blck skd11hisp skd11pvty skd11lep ///
skd11biesl skd11gt skd11sped skd12size skd12blck skd12hisp skd12pvty skd12lep ///
skd12biesl skd12gt skd12sped ta11elgb5 ta11ctgr ta11grd ta11chrt ta11sclvl ///
ta11a1rg ta11a2rg ta11a2lrg ta11a2mrg ta11a2m9rg ta11a2m10rg ta11a2m11rg ///
ta11a2rrg ta11a2r9rg ta11a2r10rg ta11a2r11rg ta11a2srg ta11a2s10rg ta11a2s11rg ///
ta11a2ssrg ta11a2ss10rg ta11a2ss11rg ///
ta11a3rg ta11a3arg ta11a3arrg ta11a3amrg ta11a3aparg ta11a3aperg ta11a3brg ta11a3crg ///
trt12rtn tp12pswm tka12tme tka12tmebl tka12tms tka12tmsbl tka12tre tka12trebl tka12trs ///
tka12trsbl tka12talg1 tka12talg1bl tka12tbio tka12tbiobl tka12te1r ///
tka12te1rbl tka12te1w tka12te1wbl tka12twgeo tka12twgeobl ///
tka12smegn tka12smsgn tka12sregn tka12srsgn tka12slegn tka12slsgn ///
tka12ssegn tka12shegn tka12shsgn

.... lots of correlations excluded...

             | tka~megn tka~msgn tka~regn tka~rsgn tka~legn tka~lsgn tka~segn
-------------+---------------------------------------------------------------
  tka12smegn |   1.0000
  tka12smsgn |   0.1390   1.0000
  tka12sregn |   0.6082   0.1509   1.0000
  tka12srsgn |   0.1211   0.5660   0.1929   1.0000
  tka12slegn |   0.5454  -0.0638   0.5637   0.1009   1.0000
  tka12slsgn |   0.2572   0.5671   0.2427   0.5295   0.2006   1.0000
  tka12ssegn |   0.4479  -0.1376   0.3819  -0.1273   0.4028  -0.1095   1.0000
  tka12shegn |   0.4143   0.0340   0.4330  -0.2011   0.4543  -0.2584   0.5530
  tka12shsgn |   0.5705   0.4077   0.3127   0.6170   0.2309   0.4094   0.2407

             | tka~hegn tka~hsgn
-------------+------------------
  tka12shegn |   1.0000
  tka12shsgn |   0.0918   1.0000

. matrix corrvector=vec(r(C))

. matvsort corrvector sortedvector

. matrix list sortedvector

sortedvector[4,1]
                         c1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1
tka12shsgn:tka12shsgn   1


Postdoctoral Fellow, Houston Education Research Consortium
Kinder Institute for Urban Research
Rice University
Dara.Shifrer@rice.edu

On 10/8/2013 1:39 PM, Joe Canner wrote:
Dara,

Here's one quick-n-dirty possibility. (It requires installing -matvsort- from SSC.)

. corr varlist
. matrix corrvector=vec(r(C))
. matvsort corrvector sortedvector
. matrix list sortedvector
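
Note that vec(r(C)) also carries the diagonal 1s and lists every pair twice, and sorting the values themselves does not rank them by absolute value. A hedged sketch of one way around that, which walks the lower triangle of r(C) and sorts on abs(); the auto dataset and the names handle, pairs, and absrho are stand-ins, not anything from this thread.

```stata
* Sketch only: auto.dta and these four variables are stand-ins.
sysuse auto, clear
local vars price mpg weight length
quietly correlate `vars'
matrix C = r(C)
local k : word count `vars'

* Collect each distinct pair once (lower triangle, no diagonal).
tempname handle
tempfile pairs
postfile `handle' str32 var1 str32 var2 double rho using `pairs'
forvalues i = 2/`k' {
    local vi : word `i' of `vars'
    local last = `i' - 1
    forvalues j = 1/`last' {
        local vj : word `j' of `vars'
        post `handle' ("`vi'") ("`vj'") (C[`i', `j'])
    }
}
postclose `handle'

* Rank the pairs from largest to smallest absolute correlation.
use `pairs', clear
generate double absrho = abs(rho)
gsort -absrho
list var1 var2 rho, noobs
```

Because -correlate- uses only observations complete on every listed variable, all pairs here share one sample.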

Regards,
Joe Canner
Johns Hopkins University School of Medicine


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Dara Shifrer
Sent: Tuesday, October 08, 2013 3:16 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Selecting correlations with highest absolute value


In SAS, I was able to quickly determine which pairs of variables were
most highly correlated using the 'best' option with the 'proc corr'
command ("BEST=n: prints n correlation coefficients for
each variable. Correlations are ordered from highest to lowest in
absolute value."). After extensive searching, I have not been able to
locate a Stata command that does something similar.

If this is not possible in Stata, maybe Stata experts have suggestions
for my ultimate purpose: constructing equations to facilitate a smoother
and faster running of Stata's 'ice' command.

Any help would be greatly appreciated,
Dara Shifrer



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index