Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: difference in medians . Raw vs calculated


From   "Roger B. Newson" <r.newson@imperial.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: difference in medians . Raw vs calculated
Date   Sun, 20 Jan 2013 16:25:48 +0000

In general, the Hodges-Lehmann median difference is NOT the difference between medians. However, they ARE the same if EITHER the 2 subpopulation distributions are both symmetric OR they are different only in location. I have written some (as yet unrefereed) articles on my website on this subject. See Newson (2008) and Newson (2009).
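To make the distinction concrete, here is a minimal Python sketch (not Stata, and not how cendif is implemented). It assumes the sample Hodges-Lehmann median difference can be taken as the median of all pairwise differences between the two groups. The function names and the tiny samples are made up for illustration.

```python
from itertools import product
from statistics import median


def difference_in_medians(x, y):
    """Median of x minus median of y."""
    return median(x) - median(y)


def hodges_lehmann_difference(x, y):
    """Median of all pairwise differences x_i - y_j."""
    return median([xi - yj for xi, yj in product(x, y)])


# Made-up samples skewed in opposite directions: the two measures disagree.
left_skewed = [1, 8, 9]
right_skewed = [0, 1, 8]
print(difference_in_medians(left_skewed, right_skewed))      # 7
print(hodges_lehmann_difference(left_skewed, right_skewed))  # 1

# Pure location shift of the same sample: the two measures agree.
shifted = [v + 5 for v in right_skewed]
print(difference_in_medians(shifted, right_skewed))      # 5
print(hodges_lehmann_difference(shifted, right_skewed))  # 5
```

In the first pair the groups differ in shape, so the two quantities diverge. In the second pair the groups differ only in location, so both equal the shift.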

Best wishes

Roger

References

Newson RB. 2009. Asymptotic distributions of two-sample rank statistics for continuous outcomes. Download from
http://www.imperial.ac.uk/nhli/r.newson/papers.htm#miscellaneous_documents

Newson RB. 2008. Hodges–Lehmann median differences between exponential
subpopulations. Download from
http://www.imperial.ac.uk/nhli/r.newson/papers.htm#miscellaneous_documents


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

On 20/01/2013 01:00, Richard Hiscock wrote:
I wish to derive a 95% CI for the difference in medians and noticed that the difference in raw median values between groups didn't equal that calculated using the packages cendif (R. Newson) and cid (P. Royston). Clearly I'm missing something and would be grateful for an explanation.

I suspect it relates to a transformation performed prior to calculation of the difference & subsequent back transformation to original units.

However, it is hard to present raw unit median values and the difference in medians (& CI) which are not the same. In my data set (plasma protein assay) the raw difference in medians is 0.5 whereas the difference calculated by cid or cendif is 0.33, making it hard to explain to readers.

Thanks for any advice



Illustrated using the auto data set:



sysuse auto

tabstat weight, by(foreign) stats(p50)



Summary for variables: weight by categories of: foreign (Car type)

 foreign |       p50
---------+----------
Domestic |      3360
 Foreign |      2180
---------+----------
   Total |      3190
--------------------



*difference = 1180





. cendif weight, by(foreign)
Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00
Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:
   Percent    Pctl_Dif     Minimum     Maximum
        50        1095         750        1330



. cid weight,by(foreign) unpaired

Normal-based confidence interval for difference in  means by foreign

Variable |     Obs     Estimate    Std. Err.       [95% Conf. Interval]
---------+-------------------------------------------------------------
  weight |      74     1001.206    160.2876        681.6788    1320.734



. qreg weight foreign
Iteration  1:  WLS sum of weighted deviations =  34840.693

Iteration  1: sum of abs. weighted deviations =      34860
note:  alternate solutions exist
Iteration  2: sum of abs. weighted deviations =      34620
note:  alternate solutions exist
Iteration  3: sum of abs. weighted deviations =      34580

Median regression                                    Number of obs =        74
  Raw sum of deviations    48860 (about 3180)
  Min sum of deviations    34580                     Pseudo R2     =    0.2923

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     foreign |      -1150   223.2969    -5.15   0.000    -1595.134   -704.8659
       _cons |       3350   121.7526    27.51   0.000     3107.291    3592.709
------------------------------------------------------------------------------



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


