# RE: st: difference between "Spearman" and "pwcorr / correlate"

 From: "Nick Cox"; Subject: RE: st: difference between "Spearman" and "pwcorr / correlate"; Date: Thu, 8 Oct 2009 16:13:08 +0100

```
(a) is on all fours with "the problem with pink is that it isn't blue".
That is, (a) amounts to saying that the problem with Spearman's rank is
that it's not Kendall's tau. True, but the reverse is equally true.

That aside, I think most users of rank correlation would be happy
indeed to note that the two should give similar results in practice. For
example, given the property emphasised earlier in the thread that

Spearman(x, y) = Pearson(rank(x), rank(y))

one of many possibilities for Spearman correlations is that they offer a
route to a robustified PCA. (You can be sure that the eigenproperties
are OK.)

Nick
n.j.cox@durham.ac.uk

Newson, Roger B

There IS an interpretation of the Spearman correlation for continuous
variables in an infinite population. In that case, if the random
variables are X and Y, then the Spearman rho(X,Y) is simply the Pearson
correlation of F_X(X) and F_Y(Y), where F_X(.) and F_Y(.) are the
population cumulative distribution functions of X and Y respectively.
And a Pearson correlation, as always, is a measure of linearity.

The two main problems with the Spearman rho are that (a) it is ONLY a
measure of linearity between 2 cumulative distribution functions (with
no interpretation as a difference between concordance and discordance
probabilities), and that (b) the Central Limit Theorem works a lot less
quickly for the sample Spearman rho than for the sample Kendall tau-a,
especially under the null hypothesis of zero correlation (see Kendall
and Gibbons, 1990).

References

Kendall, M. G., and J. D. Gibbons. 1990. Rank Correlation Methods. 5th
ed. Oxford, UK: Oxford University Press.

```
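The identity Spearman(x, y) = Pearson(rank(x), rank(y)) and the remark about eigenproperties can be checked numerically. What follows is only a minimal sketch in Python with NumPy, not the Stata route discussed in the thread: the helper names `average_ranks` and `spearman_matrix` are hypothetical, the data are simulated, and ties are assumed to receive the mean of their ranks.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks within each group of ties by their mean.
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman_matrix(data):
    """Spearman matrix, computed as Pearson correlations of column-wise ranks."""
    columns = np.asarray(data, dtype=float).T
    ranked = np.column_stack([average_ranks(col) for col in columns])
    return np.corrcoef(ranked, rowvar=False)


# Simulated data (an assumption for illustration, not from the thread).
rng = np.random.default_rng(2009)
x = rng.normal(size=200)
y = np.exp(x) + rng.normal(scale=0.5, size=200)  # monotone but nonlinear link
z = rng.normal(size=200)
data = np.column_stack([x, y, z])

R = spearman_matrix(data)

# Because R is an ordinary Pearson correlation matrix (of ranks), it is
# symmetric and positive semidefinite, so a PCA of it is well defined.
eigenvalues, eigenvectors = np.linalg.eigh(R)
print(R)
print(eigenvalues)
```

Because the matrix is built from ranks, any strictly increasing transformation of a column (for example replacing `x` with `np.exp(x)`) leaves it unchanged, which is the sense in which a PCA based on it is robustified.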