
From: "Newson, Roger B" <r.newson@imperial.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: difference between "Spearman" and "pwcorr / correlate"
Date: Wed, 7 Oct 2009 22:31:34 +0100

There IS an interpretation of the Spearman correlation for continuous variables in an infinite population. In that case, if the random variables are X and Y, then the Spearman rho(X,Y) is simply the Pearson correlation of F_X(X) and F_Y(Y), where F_X(.) and F_Y(.) are the population cumulative distribution functions of X and Y respectively. And a Pearson correlation, as always, is a measure of linearity.

The two main problems with the Spearman rho are that (a) it is ONLY a measure of linearity between 2 cumulative distribution functions (with no interpretation as a difference between concordance and discordance probabilities), and that (b) the Central Limit Theorem works a lot less quickly for the sample Spearman rho than for the sample Kendall tau-a, especially under the null hypothesis of zero correlation (see Kendall and Gibbons, 1990).

Best wishes

Roger

References

Kendall, M. G., and J. D. Gibbons. 1990. Rank Correlation Methods. 5th ed. Oxford, UK: Oxford University Press.

Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page: http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas Kolenikov
Sent: 07 October 2009 21:27
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: difference between "Spearman" and "pwcorr / correlate"

> >Inference for Pearson's moment correlation relies on normality of the
> >data. Spearman rank correlation is free of any assumptions, but there
> >is no population characteristic that it estimates, which makes
> >interpretation and asymptotic inference somewhat weird. If one is
> >significant and the other is not, you are making either type I or type
> >II error somewhere.
>
> In the angels on the head of a pin vein:
> Of possible interest in this regard is that the Spearman coefficient is the
> same as the Pearson calculated on the ranked values of the variables (ties
> getting the average rank). I would agree that this is not a terribly
> interesting population parameter, but isn't this nevertheless an
> estimable/testable population characteristic?

If you have a finite population, then of course you will have Spearman correlation for it. Although if you want to set up any asymptotic framework, you will be trying to hit a moving target. I don't think there is a meaningful definition of Spearman correlation for infinite populations/continuous variables, although I might be mistaken. On the other hand, Kendall's tau, as Nick Cox quoted from Roger Newson, has explicit population analogues in probabilities of concordant and discordant pairs of observations.

The question is: if the correlation estimate is 0.5, what does it say? For Pearson moment correlation, it means that the proportion of explained variance in a bivariate regression is 0.25. For Kendall's tau, it means that for every discordant pair of observations, there are three concordant pairs (i.e., Prob[ concordant ] = 3 Prob[ discordant ] = 3/4 ). For Spearman rank correlation, you can only say that the variables are positively associated, but not much more.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
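Two of the claims in this thread are easy to check numerically: that the sample Spearman coefficient is the Pearson correlation of the average ranks, and that a Kendall tau-a of 0.5 with no ties corresponds to three concordant pairs for every discordant one. The following is only a minimal sketch in plain Python (not Stata). The function names and the eight data points are made up for illustration and do not come from the posts above. With no ties, rank/n is the empirical CDF value, and Pearson correlation is unchanged by that rescaling, so the rank-based calculation is the sample analogue of correlating F_X(X) with F_Y(Y).

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Return (tau_a, concordant, discordant) over all pairs of observations."""
    pairs = list(combinations(range(len(x)), 2))
    conc = disc = 0
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    return (conc - disc) / len(pairs), conc, disc


# Illustrative data (hypothetical): no ties, 28 pairs in total.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [4, 3, 2, 1, 5, 6, 8, 7]

rho = spearman(x, y)
tau, conc, disc = kendall_tau_a(x, y)
print("Spearman rho:", round(rho, 4))
print("Kendall tau-a:", tau, "concordant:", conc, "discordant:", disc)
```

On these data tau-a is 0.5 with 21 concordant and 7 discordant pairs, which is the 3-to-1 ratio described above, while the Spearman coefficient for the same data is a different number with no comparably direct reading in terms of pair counts.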
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:
- **RE: st: difference between "Spearman" and "pwcorr / correlate"** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

**References**:
- **Re: st: difference between "Spearman" and "pwcorr / correlate"** *From:* Mike Lacy <Michael.Lacy@colostate.edu>
- **Re: st: difference between "Spearman" and "pwcorr / correlate"** *From:* Stas Kolenikov <skolenik@gmail.com>
