


Re: st: Spearman correlations with survey data

From   Stas Kolenikov
Subject   Re: st: Spearman correlations with survey data
Date   Thu, 8 Nov 2012 12:20:02 -0600

To add to Roger's response: as a survey statistician, I want to work
with the stuff that has clear finite population definitions, and clear
sample analogues that would estimate the population target. I do
understand what the population parameter for Kendall's tau is, and
hence how to get to that parameter with approximate survey inference,
as indicated by Roger's -somersd- line of code. However, I don't
understand what the population parameter is for Spearman correlation,
so it is not at all clear what it is going to be that -svy : spearman-
would do if it worked (thankfully, it does not). Conceptually, one can
say that there is the census Spearman correlation for the finite
population, but whether the sample Spearman correlation is any good at
estimating it, I don't know (and I doubt it); likewise, I don't know
what a weighted rank is, or should be. With a long stretch of finite
population inference imagination, you can argue that Spearman
correlation is a U-statistic of order 2, so estimating it properly
requires the second order selection probabilities, and estimating its
variance, the fourth order selection probabilities. That's a
long-winded way to say, "this is not doable".
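
For reference, the finite-population target for Kendall's tau-a mentioned above can be written out explicitly. This is only a sketch in standard notation, and the notation is an assumption of this note rather than something from the thread: N is the population size and (x_i, y_i) are the values for unit i.

```latex
\tau_a \;=\; \frac{1}{N(N-1)} \sum_{i \neq j}
          \operatorname{sign}(x_i - x_j)\,\operatorname{sign}(y_i - y_j)
       \;=\; \Pr(\text{concordant}) - \Pr(\text{discordant})
```

Here the probabilities refer to an ordered pair of distinct units (i, j) drawn at random from the population; the pair is concordant when (x_i - x_j)(y_i - y_j) > 0 and discordant when that product is negative, with tied pairs contributing zero. This matches the description quoted below of tau-a as a difference between two probabilities.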

-- Stas Kolenikov, PhD, PStat (SSC)
-- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer
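
The point made further down the thread, that Spearman correlation is just Pearson correlation applied to ranks, can be checked with a short unweighted sketch. It assumes Stata's shipped auto dataset and the hypothetical variable names rank_price and rank_mpg, none of which come from the thread, and it deliberately ignores survey weights, since what a weighted rank should be is exactly the open question above.

```stata
* Unweighted illustration only: no survey design is taken into account.
sysuse auto, clear

* Rank each variable; egen's rank() assigns average ranks to ties by default.
egen double rank_price = rank(price)
egen double rank_mpg   = rank(mpg)

* Pearson correlation of the ranks ...
correlate rank_price rank_mpg
scalar rho_ranks = r(rho)

* ... should agree with the built-in Spearman correlation.
spearman price mpg
scalar rho_spearman = r(rho)

display "Pearson on ranks = " rho_ranks "   spearman = " rho_spearman
```

The two numbers should coincide up to rounding, because neither price nor mpg has missing values in this dataset, so both commands use the same observations.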

On Thu, Nov 8, 2012 at 12:00 PM, Roger B. Newson wrote:
> I would second that "Amen". And I would recommend the use of Kendall's tau-a
> as an alternative, because (a) the Central Limit Theorem works a lot faster
> for Kendall's tau-a than for Spearman's rho and (b) Kendall's tau-a is
> interpretable (in words) as a difference between 2 probabilities, namely a
> probability of concordance and a probability of discordance. I find this
> difference between probabilities easier to understand than a measure of rank
> linearity (which is what Spearman's rho is), and also more useful to know,
> as I would prefer a good non-linear monotonic predictor to a second-rate
> linear predictor. Spearman's rho, by contrast, is MUCH easier to estimate
> without a computer, which was an important issue before we had computers.
> To estimate Kendall's tau-a with confidence limits in Stata with clustering
> and/or sampling-probability weighting, use the -somersd- package,
> downloadable from SSC. As in:
> somersd x y [pwei=mysampwt], taua tdist transf(z) cluster(mypsu)
> I have not yet introduced confidence intervals allowing for sampling strata.
> However, sampling-probability weights and clustering are available. And the
> -somersd- package comes with .pdf manuals, and with hypertext references in
> the on-line help to further documentation of the methods and formulas.
> I hope this helps.
> Best wishes
> Roger
> Roger B Newson BSc MSc DPhil
> Lecturer in Medical Statistics
> Respiratory Epidemiology and Public Health Group
> National Heart and Lung Institute
> Imperial College London
> Royal Brompton Campus
> Room 33, Emmanuel Kaye Building
> 1B Manresa Road
> London SW3 6LR
> Tel: +44 (0)20 7352 8121 ext 3381
> Fax: +44 (0)20 7351 8322
> Opinions expressed are those of the author, not of the institution.
> On 08/11/2012 17:26, Lachenbruch, Peter wrote:
>> Amen!  In fact, tests on Spearman coefficients are notoriously sensitive
>> to normality. An article by Egon Pearson in Biometrika in the 1970s showed
>> this clearly.  Sorry I don't have the reference at hand.
>> Peter A. Lachenbruch,
>> Professor (retired)
>> ________________________________________
>> From: Nick Cox
>> Sent: Thursday, November 08, 2012 1:58 AM
>> Subject: Re: st: Spearman correlations with survey data
>> Spearman correlation is just Pearson correlation applied to ranks, so
>> ranking first (use -egen-) gets you from one to the other. Otherwise
>> P-values for correlations are over-rated in my view, whether in -svy-
>> contexts or otherwise.
>> Others should have comments on the -svy- aspects.
>> Nick
>> On Thu, Nov 8, 2012 at 5:42 AM, Lee Grenon wrote:
>>> I am interested in calculating Spearman correlations for complex survey
>>> data. As I understand, I can calculate Pearson correlations using corr with
>>> aweight for the coefficients and then calculate the p-values using svy:
>>> regress y x and svy: regress x y then selecting the larger p-value. Is there
>>> a way of calculating Spearman correlations using a survey weight and
>>> bootstrap weights?
