# Re: st: Mann-whitney U test

From: Roger Newson <[email protected]>
To: [email protected]
Subject: Re: st: Mann-whitney U test
Date: Sun, 23 Jan 2005 14:33:10 +0000

```At 01:25 23/01/2005, Ricardo wrote:
```
```Thanks, Roger. I am familiar with this program and I
have used it before. So the test really tests both
hypotheses: that the difference between the medians is
zero, and that the degree of non-overlap of the two
populations is zero. i.e. whether the degree of
overlap between the two populations is significantly
different than would be expected by chance alone. Is
this correct?
```
No and yes: no to the first hypothesis and yes to the second. The Wilcoxon ranksum test does indeed test the hypothesis that Somers' D is zero, where Somers' D is the difference between 2 probabilities: the probability that a randomly-chosen member of Subpopulation A has a higher outcome value than a randomly-chosen member of Subpopulation B, minus the probability that a randomly-chosen member of Subpopulation B has a higher outcome value than a randomly-chosen member of Subpopulation A. If these 2 probabilities are equal, then you can argue that (in Ricardo's words) "the degree of non-overlap of the two populations is zero".

However, the Hodges-Lehmann median difference is not always the difference between the 2 subpopulation medians. The Hodges-Lehmann median difference is the median difference between 2 outcome values, assuming that the first is sampled at random from Subpopulation A and the second is sampled at random from Subpopulation B.
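The two quantities defined above can be computed by brute force over all A-B pairs. The following is a minimal Python sketch using two small made-up samples; the function names `somers_d` and `hodges_lehmann` and the data are illustrative assumptions, and the code gives only sample point estimates, not the confidence intervals that -somersd- and -cendif- report.

```python
from itertools import product
from statistics import median


def somers_d(a, b):
    """Sample Somers' D: P(A > B) - P(B > A) over all A-B pairs."""
    pairs = list(product(a, b))
    greater = sum(x > y for x, y in pairs)  # pairs where the A value is higher
    less = sum(x < y for x, y in pairs)     # pairs where the B value is higher
    return (greater - less) / len(pairs)


def hodges_lehmann(a, b):
    """Sample Hodges-Lehmann median difference: median of all A - B differences."""
    return median(x - y for x, y in product(a, b))


# Hypothetical outcome values for the two subpopulation samples.
group_a = [3, 5, 8, 12]
group_b = [1, 4, 6]

d = somers_d(group_a, group_b)
hl = hodges_lehmann(group_a, group_b)
print("Somers' D:", d)
print("Hodges-Lehmann median difference:", hl)
```

In this made-up example 9 of the 12 pairs have the higher value in A and 3 have the higher value in B, so the sample Somers' D is 0.5; a value of zero would correspond to the null hypothesis described above.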

If the 2 subpopulation distributions differ only in location, then the Hodges-Lehmann median difference is indeed the difference between the 2 subpopulation medians. In that case, the difference between 2 outcome values sampled independently from the 2 subpopulations is distributed symmetrically around the location difference, so the median difference is the mean difference, which is the difference between means, which is the difference between medians. However, the 2 subpopulations may differ in ways other than location, and then the difference between the 2 medians may be different from the Hodges-Lehmann median difference. I often get queries from users of my program -cendif- (part of the -somersd- package) asking why, in their data, the Hodges-Lehmann median difference is not the difference between the 2 medians.
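A small numerical illustration of this point, as a Python sketch with made-up data (the helper names `hodges_lehmann` and `median_gap` are assumptions for illustration, not part of -cendif-): in the first pair of samples one group is an exact location shift of the other, and the two measures agree; in the second pair the groups differ in shape, and they do not.

```python
from itertools import product
from statistics import median


def hodges_lehmann(a, b):
    """Median of all pairwise differences (A value minus B value)."""
    return median(x - y for x, y in product(a, b))


def median_gap(a, b):
    """Difference between the two sample medians."""
    return median(a) - median(b)


# Case 1: group A is group B shifted by exactly 5 (location difference only).
b_shift = [1, 2, 4, 9]
a_shift = [v + 5 for v in b_shift]
hl_shift = hodges_lehmann(a_shift, b_shift)
gap_shift = median_gap(a_shift, b_shift)

# Case 2: the groups differ in shape, not just location (hypothetical data).
a_skew = [0, 0, 0, 10, 11]
b_skew = [1, 2, 3]
hl_skew = hodges_lehmann(a_skew, b_skew)
gap_skew = median_gap(a_skew, b_skew)

print("shift only:      HL =", hl_shift, " median gap =", gap_shift)
print("different shape: HL =", hl_skew, " median gap =", gap_skew)
```

In the shifted case both measures equal the shift of 5. In the second case the median of the 15 pairwise differences is -1, while the difference between the two sample medians is -2, which is the kind of discrepancy described above.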

I hope this helps.

Best wishes

Roger

--
Roger Newson
Lecturer in Medical Statistics
Department of Public Health Sciences
King's College London
5th Floor, Capital House
42 Weston Street
London SE1 3QD
United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648
Fax: 020 7848 6620 International +44 20 7848 6620
or 020 7848 6605 International +44 20 7848 6605
Email: [email protected]
Website: http://phs.kcl.ac.uk/rogernewson/

Opinions expressed are those of the author, not the institution.
