
From: Stas Kolenikov <skolenik@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: difference between "Spearman" and "pwcorr / correlate"
Date: Thu, 8 Oct 2009 14:39:04 -0500

On Thu, Oct 8, 2009 at 11:33 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> There's a tacit criterion here, that techniques must have simple verbal
> interpretations. I am as much in favour of simple verbal interpretations
> as the next person -- nay, on average, more so -- but while they're a
> bonus when available insisting on them would deprive you of much that is
> indispensable.
>
> What's the simple verbal interpretation of (say) eigenvectors or an SVD?

Eigenproblems are very visual. The eigenvalues tell you by how much a vector changes its length, and the eigenvectors give the specific directions in which that change is exact: the vector stretches without any rotation. For the eigenproblem of a covariance matrix, the eigenvalues give the "radii" of the rugby ball/American football formed by the cloud of points in multivariate space (strictly, the semi-axis lengths are proportional to the square roots of the eigenvalues), and the eigenvectors are again the directions that give the orientation of that rugby ball relative to the "official" axes. SVDs can be explained by the -biplot-s, although with greater effort.

I usually want to know what I am estimating. Then I can eyeball something along the lines of "the difference between the unknown population distribution function and the sample distribution is such and such, and hence by an appropriate version of the influence function expansions and/or the delta-method, the difference between the unknown parameter and the estimate at hand must be of such and such order."

Thanks to Roger, I now have a better clue of what I am estimating with Spearman correlation. And there are probably a dozen other rank-type correlations that would make at least as much sense as the (linear) correlation of the cdfs.

One other comparison can be made regarding the computational requirements. Spearman's rho is O( n log(n) ) due to sorting, while Kendall's tau is O( n^2 ) when computed by straightforward pairwise comparisons. Of course, Pearson's product-moment correlation is O( n ): it's just manipulation of sums.
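To make the three operation counts concrete, here is a minimal sketch in plain Python (not Stata, and not how -correlate-, -spearman- or -ktau- are actually implemented). The function names are made up for illustration. It assumes complete data with non-constant variables, uses average ranks for ties in Spearman's rho, and computes the tau-a variant of Kendall's tau, which makes no adjustment for ties.

```python
import math

def pearson(x, y):
    """Pearson correlation from running sums: one O(n) pass over the data."""
    n = len(x)
    sx = sy = sxx = syy = sxy = 0.0
    for a, b in zip(x, y):
        sx += a
        sy += b
        sxx += a * a
        syy += b * b
        sxy += a * b
    cov = sxy - sx * sy / n
    vx = sxx - sx * sx / n
    vy = syy - sy * sy / n
    return cov / math.sqrt(vx * vy)

def ranks(v):
    """Average ranks (ties share the mean rank); the sort dominates at O(n log n)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of observations tied with position i.
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a by brute force: all n(n-1)/2 pairs, hence O(n^2)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1      # concordant pair
            elif prod < 0:
                s -= 1      # discordant pair
    return s / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]
y = [v * v for v in x]      # monotone but nonlinear in x
print(pearson(x, y), spearman(x, y), kendall_tau_a(x, y))
```

With a monotone but nonlinear relationship like this one, the two rank-based measures equal 1 while Pearson's correlation falls short of 1. The double loop in kendall_tau_a is what makes it the slow one as n grows.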
Differences in timing between Pearson and Spearman only show up at sample sizes where -sort- takes a noticeable amount of time, whereas Kendall's tau is already slow with more than 100 observations.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
