At 11:24 30/08/02 +0200, Jens Lauritsen wrote:
Does anyone have good references for INTERPRETATION in relation to size of
the Spearman correlation coefficient?
The population Spearman correlation coefficient between X and Y is, by
definition, the Pearson product-moment correlation between the cumulative
distribution functions (CDFs) F_X(X) and F_Y(Y), where F_X(z) is the
population CDF of X (ie Pr(X<=z)) and F_Y(z) is the population CDF of Y (ie
Pr(Y<=z)). It is estimated by the sample Spearman correlation coefficient,
ie the product-moment correlation between the sample ranks. Confidence
intervals (CIs) around the sample Spearman correlation for the population
Spearman correlation can be derived by the jackknife or bootstrap.
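
As a rough illustration of the estimation step described above, the sketch below computes the sample Spearman correlation as the product-moment correlation between the sample ranks, and then a percentile bootstrap CI by resampling (X, Y) pairs. It assumes the simplest percentile bootstrap (not the jackknife, and not any particular Stata command); the function names, the simulated data and the defaults (n_boot, level, seed) are all invented for the example.

```python
import random
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(a, b):
    """Product-moment correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def spearman(x, y):
    """Sample Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def bootstrap_ci(x, y, n_boot=1000, level=0.95, seed=1):
    """Percentile bootstrap CI, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # Skip degenerate resamples in which one variable is constant.
        if len(set(xs)) > 1 and len(set(ys)) > 1:
            stats.append(spearman(xs, ys))
    stats.sort()
    alpha = 1 - level
    lo = stats[int((alpha / 2) * (len(stats) - 1))]
    hi = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lo, hi

# Simulated (made-up) data, purely for illustration.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(50)]
y = [xi + rng.gauss(0, 1) for xi in x]
print(spearman(x, y), bootstrap_ci(x, y))
```

The percentile method is only one of several bootstrap CI constructions, and the jackknife mentioned above would give a different (usually similar) interval.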
It is not easy (I think) to define, in plain language, an interpretation of
the product-moment correlation between two CDFs. Most people, most of the
time, think of the Spearman correlation only as a measure of positive or
negative association on a scale from -1 to 1. This difficulty of
interpretation is the reason why people like me prefer Kendall's tau-a,
which is the difference between two probabilities, namely the probability
of concordance and the probability of discordance. (If I am double-marking
exam scripts with a colleague, and the Kendall's tau-a between our marks is
0.70, then this means that, given 2 exam scripts and asked which is the
better, the probability that we agree exceeds the probability that we
disagree by 0.70.) I wrote an article about Kendall's tau-a and its
interpretation in The Stata Journal (Newson, 2002).
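
To make the "difference between two probabilities" reading concrete, here is a minimal sketch that counts concordant and discordant pairs directly and returns both sample proportions together with their difference. The function name and the exam marks are made up for the illustration; tied pairs are assumed to count as neither concordant nor discordant while remaining in the denominator, which is what distinguishes tau-a from tau-b.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Return (P(concordant), P(discordant), tau-a) over all pairs of observations."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1      # both markers order the two scripts the same way
        elif s < 0:
            discordant += 1      # the markers order the two scripts oppositely
        # Tied pairs (s == 0) count as neither, but stay in the denominator.
    n_pairs = len(pairs)
    p_conc = concordant / n_pairs
    p_disc = discordant / n_pairs
    return p_conc, p_disc, p_conc - p_disc

# Hypothetical marks given to six scripts by two markers.
marker_a = [52, 61, 70, 45, 66, 58]
marker_b = [55, 66, 72, 50, 60, 57]
print(kendall_tau_a(marker_a, marker_b))
```

The double loop over pairs is quadratic in the sample size, which is fine for a small illustration but not how one would compute it for large datasets.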
However, Spearman's correlation (as well as Kendall's) can be interpreted
by assuming that X and Y can be mapped, by a pair of monotonically
increasing transformations, to two variables V=g(X) and W=h(Y) with a
joint bivariate Normal distribution. For bivariate Normal variables, the Pearson
correlation coefficient is related to the Spearman and Kendall coefficients
by the equations
rho=sin((pi/2)*tau)=2*sin((pi/6)*rho_s)
where rho is the Pearson correlation, rho_s is the Spearman correlation and
tau is the Kendall correlation. (See Kendall, 1949.) As the Spearman and
Kendall correlations are preserved by monotonically increasing
transformations such as g(.) and h(.), they are the same between X and Y
as between V and W.
Therefore, if you define a confidence interval for tau or rho_s between X
and Y, you can transform that confidence interval (using the above
equations) to get an outlier-resistant confidence interval for the Pearson
correlation between V and W, without even having to know the form of the
transformations g(.) and h(.).
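
The end-point transformation described above can be sketched as follows. Both equations are increasing in their argument on [-1, 1], so the lower and upper confidence limits map to the lower and upper limits for the Pearson correlation. The function names and the example confidence limits are hypothetical, and the result is only meaningful under the bivariate Normal assumption for V and W.

```python
from math import sin, pi

def pearson_from_tau(tau):
    """rho = sin((pi/2)*tau), valid under bivariate Normality."""
    return sin((pi / 2) * tau)

def pearson_from_spearman(rho_s):
    """rho = 2*sin((pi/6)*rho_s), valid under bivariate Normality."""
    return 2 * sin((pi / 6) * rho_s)

def transform_ci(ci, transform):
    """Map both confidence limits through an increasing transformation."""
    lower, upper = ci
    return transform(lower), transform(upper)

# Hypothetical confidence limits, for illustration only.
tau_ci = (0.40, 0.70)
rho_s_ci = (0.55, 0.85)
print(transform_ci(tau_ci, pearson_from_tau))
print(transform_ci(rho_s_ci, pearson_from_spearman))
```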
I hope this helps.
Roger
References
Kendall MG. Rank and product-moment correlation. Biometrika 1949; 36: 177-193.
Newson R. Parameters behind "nonparametric" statistics: Kendall's tau,
Somers' D and median differences. The Stata Journal 2002; 2(1): 45-64.
--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London