# RE: st: Mann-Whitney U test

 From "Nick Cox" <[email protected]> To <[email protected]> Subject RE: st: Mann-whitney U test Date Sun, 23 Jan 2005 15:46:09 -0000

```
In addition to Roger's routines
and remarks, note that the -ranksum-
command has a fairly recently

To underline the view that
measuring the magnitude of
something is often much more
interesting and useful than testing
a null hypothesis, I would argue
for making it part of the default
output.

Nick
[email protected]

Roger Newson

> At 01:25 23/01/2005, Ricardo wrote:
> >Thanks, Roger. I am familiar with this program and I
> >have used it before. So the test really tests both
> >hypotheses: that the difference between the medians is
> >zero, and that the degree of non-overlap of the two
> >populations is zero. i.e. whether the degree of
> >overlap between the two populations is significantly
> >different than would be expected by chance alone. Is
> >this correct?
>
> No and yes. The Wilcoxon ranksum test does indeed test the hypothesis
> that Somers' D is zero, where Somers' D is the difference between 2
> probabilities, namely the probability that a randomly-chosen member of
> Subpopulation A has a higher outcome value than a randomly-chosen
> member of Subpopulation B and the probability that a randomly-chosen
> member of Subpopulation B has a higher outcome value than a
> randomly-chosen member of Subpopulation A. If these 2 probabilities
> are equal, then you can argue that (in Ricardo's words) "the degree of
> non-overlap of the two populations is zero". However, the
> Hodges-Lehmann median difference is not always the difference between
> the 2 subpopulation medians. The Hodges-Lehmann median difference is
> the median difference between 2 outcome values, assuming that the
> first is sampled at random from Subpopulation A and the second is
> sampled at random from Subpopulation B.
>
> If the 2 subpopulation distributions are different only in location,
> then the Hodges-Lehmann median difference is indeed the difference
> between the 2 subpopulation medians, because then the difference
> between 2 outcome values sampled independently from the 2
> subpopulations is distributed symmetrically around the location
> difference, and the median difference is the mean difference, which is
> the difference between means, which is the difference between medians.
> However, the 2 subpopulations may differ in ways other than location,
> and then the difference between the 2 medians may be different from
> the Hodges-Lehmann median difference. I often get queries
> from users of my program -cendif- (part of the -somersd- package) asking
> why, in their data, the Hodges-Lehmann median difference is not the
> difference between the 2 medians.
>

```
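Roger's distinction between Somers' D, the Hodges-Lehmann median difference, and the plain difference between medians can be checked by brute force on small samples. The following is only a minimal sketch in Python rather than Stata: the function names `somers_d`, `hodges_lehmann`, and `median_gap` and the toy data are assumptions made for illustration. It computes sample point estimates over all cross-sample pairs and does not attempt to reproduce the output of -ranksum-, -somersd-, or -cendif-.

```python
from itertools import product
from statistics import median


def somers_d(a, b):
    """Sample Somers' D: share of (x, y) pairs with x > y minus the share
    with x < y, where x comes from sample a and y from sample b."""
    pairs = list(product(a, b))
    greater = sum(1 for x, y in pairs if x > y)
    less = sum(1 for x, y in pairs if x < y)
    return (greater - less) / len(pairs)


def hodges_lehmann(a, b):
    """Sample Hodges-Lehmann median difference: the median of x - y over
    all pairs with x from sample a and y from sample b."""
    return median(x - y for x, y in product(a, b))


def median_gap(a, b):
    """Difference between the two sample medians."""
    return median(a) - median(b)


if __name__ == "__main__":
    # Hypothetical samples that differ in more than location.
    a, b = [1, 2, 3], [0, 0, 10]
    print("different shapes:", somers_d(a, b), hodges_lehmann(a, b), median_gap(a, b))

    # Hypothetical pure location shift: every value in `base` is 5 lower.
    shifted = [1, 2, 3, 10]
    base = [x - 5 for x in shifted]
    print("location shift:  ", somers_d(shifted, base),
          hodges_lehmann(shifted, base), median_gap(shifted, base))
```

In the first pair of samples the median of the pairwise differences and the difference between the two medians disagree, which is the situation Roger describes when the subpopulations differ in ways other than location. In the second pair, where one sample is a constant shift of the other, the two quantities coincide.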