
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: -matsusort- added to -matvsort- package on SSC
Date: Fri, 24 Sep 2004 16:26:43 +0100

Roger Harbord's posting on -matvsort- had me scuttling to look and see what on Earth it did. That reminded me indirectly of a utility I have long wanted, on and off, but have never seen or got round to writing: a program to sort the rows and/or columns of a matrix according to some summary of the elements in the rows and/or columns.

Here's a very simple example. Suppose we look at the car size variables in the auto data:

. sysuse auto
(1978 Automobile Data)

. corr head trunk weight length displacement
(obs=74)

             | headroom    trunk   weight   length displa~t
-------------+---------------------------------------------
    headroom |   1.0000
       trunk |   0.6620   1.0000
      weight |   0.4835   0.6722   1.0000
      length |   0.5163   0.7266   0.9460   1.0000
displacement |   0.4745   0.6086   0.8949   0.8351   1.0000

For display, we might want to reorder that matrix, for example to get clusters of high correlations and low correlations together, as far as possible. (We might also want fewer than 4 d.p.) The reordering could be achieved by detailed inspection and re-typing the variable names in a different order, but an automated solution is also desirable, especially for much bigger problems.

A first step is to get the correlations into a matrix in the sense of Stata's -matrix- commands. There are several ways to do that. One is -matcorr- from STB-56:

. matcorr head trunk weight length displacement , matrix(corr)
(obs=74)

< same matrix, naturally >

Then -matsusort- (now added to the -matvsort- package on SSC, thanks to Kit Baum) sorts the rows according to their means. That is,

for each row {
    calculate the mean of the row elements
}
sort the rows according to the order of their means

The -decrease- option sorts in decreasing rather than increasing order, and the -columns- option sorts columns rather than rows. Here the first call sorts the rows of corr into a new matrix scorr, and the second sorts the columns of scorr in place:

. matsusort corr scorr, dec

. matsusort scorr scorr, col dec

We now have more control, e.g. over format, from -matrix list-:

. mat li scorr , format(%9.3f)

symmetric scorr[5,5]
                length        weight  displacement     trunk  headroom
      length     1.000
      weight     0.946         1.000
displacement     0.835         0.895         1.000
       trunk     0.727         0.672         0.609     1.000
    headroom     0.516         0.483         0.474     0.662     1.000

. mat li scorr , format(%9.3f) nohalf

symmetric scorr[5,5]
                length        weight  displacement     trunk  headroom
      length     1.000         0.946         0.835     0.727     0.516
      weight     0.946         1.000         0.895     0.672     0.483
displacement     0.835         0.895         1.000     0.609     0.474
       trunk     0.727         0.672         0.609     1.000     0.662
    headroom     0.516         0.483         0.474     0.662     1.000

Sorting by means is just the default. There is a handle allowing you to sort according to _any_ summary measure produced by -summarize-. It's unlikely that anyone would choose to sort by kurtosis, but the generality is cheap.

You can get this bundled with the other stuff previously in -matvsort- by

. ssc inst matvsort

or

. ssc inst matvsort, replace

Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
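
For readers who want to see the row-mean sort spelled out, here is a minimal sketch of the pseudocode from the post. It is written in Python only so that it can be run anywhere; it is not the -matsusort- code, and the names `sort_by_summary`, `transpose`, `summary` and `decrease` are made up for this illustration. The correlations are the ones printed by -corr- above.

```python
from statistics import mean

names = ["headroom", "trunk", "weight", "length", "displacement"]

# Lower triangle exactly as printed by -corr- in the post (4 d.p.).
lower = [
    [1.0000],
    [0.6620, 1.0000],
    [0.4835, 0.6722, 1.0000],
    [0.5163, 0.7266, 0.9460, 1.0000],
    [0.4745, 0.6086, 0.8949, 0.8351, 1.0000],
]

n = len(names)
# Fill in the full symmetric matrix from the lower triangle.
corr = [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def sort_by_summary(matrix, labels, summary=mean, decrease=False):
    """Reorder the rows of `matrix`, and `labels` with them, by a summary of each row.

    `summary` is a hypothetical stand-in for the post's idea of sorting on
    any summary measure; the default is the mean, as in the post.
    """
    order = sorted(range(len(matrix)),
                   key=lambda i: summary(matrix[i]),
                   reverse=decrease)
    return [matrix[i] for i in order], [labels[i] for i in order]


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


# Step 1: sort rows by decreasing row mean.
rows_sorted, row_names = sort_by_summary(corr, names, decrease=True)

# Step 2: sort columns by decreasing column mean, done here by
# sorting the rows of the transpose and transposing back.
cols_sorted, col_names = sort_by_summary(transpose(rows_sorted), names, decrease=True)
scorr = transpose(cols_sorted)

print(" " * 13 + " ".join(f"{name[:6]:>6}" for name in col_names))
for name, row in zip(row_names, scorr):
    print(f"{name:>12} " + " ".join(f"{x:6.3f}" for x in row))
```

Because the matrix is symmetric, the row and column orders come out the same (length, weight, displacement, trunk, headroom), which agrees with the -matrix list- output in the post. Passing a different function as `summary`, for example `max` or `statistics.median`, changes the sort criterion without touching the rest.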

