
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: difference between "Spearman" and "pwcorr / correlate"
Date: Fri, 9 Oct 2009 11:41:37 +0100

My point needs rephrasing. I draw a distinction between verbal definitions or characterisations on the one hand and verbal analogies on the other. The difference lies in whether you can take the verbal statements and reconstruct the formula or method from them; with mere analogies you can't do that. However, Pearson correlations are pretty much defined by their square being the fraction of variance explained by the corresponding regression, modulo sign of course. In contrast, if I explain Spearman correlation in verbal terms as a measure of monotonicity, that does not imply the particular formula used.

Nick
n.j.cox@durham.ac.uk

Stas Kolenikov

On Thu, Oct 8, 2009 at 11:33 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> There's a tacit criterion here, that techniques must have simple verbal
> interpretations. I am as much in favour of simple verbal interpretations
> as the next person -- nay, on average, more so -- but while they're a
> bonus when available insisting on them would deprive you of much that is
> indispensable.
>
> What's the simple verbal interpretation of (say) eigenvectors or an SVD?

The eigenproblems are very visual. The eigenvalues tell you by how much a vector changes its length, and the eigenvectors give the specific directions in which that change is exact: the vector stretches without any rotation. If we talk about an eigenproblem for a covariance matrix, then the eigenvalues are (strictly, their square roots are) the "radii" of a rugby ball/American football formed by the points in multivariate space, and the eigenvectors are again directions that give the orientation of that rugby ball relative to the "official" axes. SVDs can be explained by the -biplot-s, although with greater effort.

I usually want to know what I am estimating. Then I can eyeball something along the lines of "the difference between the unknown population distribution function and the sample distribution is such and such, and hence by an appropriate version of the influence function expansions and/or the delta-method, the difference between the unknown parameter and the estimate at hand must be of such and such order." Thanks to Roger, I now have a better clue of what I am estimating with Spearman correlation. And there are probably a dozen other rank-type correlations that would make at least as much sense as (linear) correlation of the cdfs.

One other comparison can be made regarding the computational requirements. Spearman's rho is O( n log(n) ) due to sorting, while Kendall's tau is O( n^2 ) for the pairwise comparisons. Of course Pearson's moment correlation is O( n ); it's just manipulation of sums. One would only see differences in timing between Pearson and Spearman with sample sizes such that -sort- takes a noticeable amount of time, while Kendall's tau is slow with more than 100 observations.
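
The computational comparison above can be made concrete with a minimal sketch. It is written in plain Python rather than Stata, and the function names (`pearson`, `midranks`, `spearman`, `kendall_tau_a`) are illustrative choices, not anything from Stata or from the posts. It assumes Spearman's rho is computed as the Pearson correlation of midranks, and it computes Kendall's tau-a by brute force with no tie correction; faster algorithms for tau exist, but the brute-force loop mirrors the pairwise-comparison description.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation from sums and sums of products: O(n)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    cov = sxy - sx * sy / n
    return cov / sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))


def midranks(v):
    """Ranks 1..n, tied values sharing their average rank.

    The sort dominates the cost, so this step is O(n log n).
    """
    order = sorted(range(len(v)), key=v.__getitem__)
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the midranks."""
    return pearson(midranks(x), midranks(y))


def kendall_tau_a(x, y):
    """Kendall's tau-a over all n(n-1)/2 pairs: O(n^2), no tie correction."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tied
    return s / (n * (n - 1) / 2)
```

For a monotone but nonlinear relation such as y = x^2 on positive x, both rank measures reach their maximum while the Pearson correlation stays below 1, which is the "measure of monotonicity" reading in the discussion above.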


