# st: RE: Comparing medians with pweight

From: "Newson, Roger B" | Subject: st: RE: Comparing medians with pweight | Date: Fri, 23 Mar 2007 12:12:06 -0000

```
The -somersd- package provides support for pweighted and cluster-sampled
data, using the infinitesimal jackknife method applied to Somers' D and
Kendall's tau-a. The non-smooth behaviour of the median is not likely to
be much of a problem, as the method of Newson (2006) uses the Normality
of Somers' D, instead of using the Normality of the median difference.
Somers' D and Kendall's tau-a are amongst the best behaved statistics
known to science, at least under the null hypothesis where the
population Somers' D or Kendall's tau-a are zero, as they are in the
equation solved in the definition of the median difference. This is
because they have "democratic" influence functions, based on the
principle of "one ordinal comparison, one vote".

I have been testing -cendif- to destruction in a simulation study, whose
results I am writing up for publication. I have found that the
infinitesimal jackknife works most of the time, at least if you use the
-tdist- option. The main limitation of the infinitesimal jackknife is
that the standard error is estimated, rather than being known a priori.
Usually, the -tdist- option is sufficient to adjust for this, as it uses
a t-distribution with n-1 degrees of freedom, where n is the number of
clusters, or the number of observations in unclustered data. The
exceptions arise when you compare a tiny sample with a huge sample, e.g.
when you compare a sample of 5 with a sample of 40. Under those
conditions, the advertized 95% confidence interval may really be a 90%
confidence interval. This is probably because, under these conditions,
the influence function is not so "democratic", as the 5 units in the
smaller sample appear in as many ordinal comparisons as the 40 units in
the larger sample. Under those extreme conditions, it might possibly be
a good idea to use the percentile bootstrap, with strata defined by the
2 samples.

I hope this helps.

Best wishes

Roger

References

Newson R. Confidence intervals for rank statistics: Percentile slopes,
differences, and ratios. The Stata Journal 2006; 6(4): 497-520.
Pre-publication draft from
http://www.imperial.ac.uk/nhli/r.newson/

Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
www.imperial.ac.uk/nhli/r.newson/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas
Kolenikov
Sent: 23 March 2007 03:30
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Comparing medians with pweight

If you are dealing with survey data, and you want to deal with it
properly -- quite likely so, as things might get weird in survey
world. I have not checked the description Roger provided in SJ though,
so I really cannot tell. I suspect he would intervene and let us
know :))

> Do you suggest that even following Roger's suggestion by using
> -somersd- package to compare confidence intervals for rank statistics,
> such as the median, I have to run a bootstrap to get a correct comparison?

--
Stas Kolenikov

```
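The message defines the median difference through the equation "Somers' D equals zero" and suggests a percentile bootstrap, stratified by the 2 samples, when a tiny sample is compared with a huge one. The sketch below illustrates both ideas in plain Python for the simplest case only: two unweighted, unclustered samples. It is not the -somersd- or -cendif- implementation. The function names, the default number of replications, the seed and the example data are all assumptions made for illustration. With clustered data one would presumably resample whole clusters within each stratum, and pweights would need a weighted version of both statistics.

```python
import random
import statistics


def somers_d_two_sample(a, b):
    """Two-sample Somers' D: P(a_i > b_j) - P(a_i < b_j) over all pairs.

    Every cross-sample comparison contributes +1, 0 or -1, which is the
    "one ordinal comparison, one vote" idea. (Hypothetical helper name.)
    """
    votes = sum((x > y) - (x < y) for x in a for y in b)
    return votes / (len(a) * len(b))


def median_difference(a, b):
    """Median of all pairwise differences a_i - b_j.

    Shifting sample a down by this amount makes the two-sample Somers' D
    zero, as long as no other pairwise difference ties with the median.
    """
    return statistics.median(x - y for x in a for y in b)


def stratified_percentile_bootstrap(a, b, reps=2000, level=0.95, seed=12345):
    """Percentile bootstrap interval for the median difference.

    Each sample is its own stratum: it is resampled with replacement at its
    original size, so a sample of 5 stays a sample of 5 in every replicate.
    reps, level and seed are illustrative defaults, not recommendations.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        estimates.append(median_difference(resampled_a, resampled_b))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * reps)]
    upper = estimates[min(reps - 1, int((1 - tail) * reps))]
    return lower, upper


if __name__ == "__main__":
    # Made-up data mimicking the unbalanced case in the message: 5 versus 40.
    rng = random.Random(2007)
    small = [rng.gauss(1.0, 1.0) for _ in range(5)]
    large = [rng.gauss(0.0, 1.0) for _ in range(40)]
    print("median difference:", median_difference(small, large))
    print("95% percentile interval:",
          stratified_percentile_bootstrap(small, large))
```

Resampling within each sample separately is what "strata defined by the 2 samples" amounts to here: the 5-versus-40 split is preserved in every replicate, so the interval reflects the variability of the small sample instead of averaging it away.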