st: RE: RE: RE: Why does -centile- provide different results compared to -sum- or -pctile-?


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: Why does -centile- provide different results compared to -sum- or -pctile-?
Date   Fri, 16 Jul 2010 21:13:24 +0200

This example highlights the differences between the commands:


***********
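clear*
* 100 observations with X = 1, 2, ..., 100
set obs 100
gen byte X = _n
* 10th percentile three ways: -summarize, detail-, -centile-, and -_pctile-
su X, d
centile X, c(10)
_pctile X, p(10)
di in r r(r1)
* constant-only quantile regression at the 10th quantile
qreg X, q(10)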
***********


HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lachenbruch,
Peter
Sent: Freitag, 16. Juli 2010 20:46
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: Why does -centile- provide different results compared
to -sum- or -pctile-?

The reason for this is that summarize uses single observations (or means of
adjacent ones), while centile uses linear combinations of order statistics, I
think.  Certainly, qreg does this.
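
To make the distinction above concrete, here is a minimal sketch that computes
the 10th percentile by hand under both rules, using the same 1..100 data as
Martin's example. It assumes the default definitions as I understand them from
the manuals: -centile- interpolates linearly at position (n+1)*p/100, while
-summarize, detail- and -_pctile- use position n*p/100 and take either a single
order statistic or the mean of two adjacent ones. The scalar names c_hand and
s_hand are made up for illustration, and the edge cases (a position below 1 or
at or above n) are left out.

```stata
* Sketch only: assumes the default percentile definitions described above.
clear
set obs 100
generate byte X = _n
sort X

local p = 10
local n = _N

* -centile- style: position (n+1)*p/100, then linear interpolation
local P = (`n' + 1) * `p' / 100
local r = floor(`P')
local f = `P' - `r'
scalar c_hand = X[`r'] + `f' * (X[`r' + 1] - X[`r'])

* -summarize- / -_pctile- style: position n*p/100
local P = `n' * `p' / 100
local i = floor(`P')
if `P' == `i' {
    * integer position: mean of the two adjacent order statistics
    scalar s_hand = (X[`i'] + X[`i' + 1]) / 2
}
else {
    * non-integer position: next order statistic up
    scalar s_hand = X[`i' + 1]
}

display "centile rule, by hand:   " c_hand
display "summarize rule, by hand: " s_hand

* Compare with the built-in commands
centile X, centile(10)
display "centile:  " r(c_1)
_pctile X, p(10)
display "_pctile:  " r(r1)
```

By the arithmetic alone, the hand calculations come to 10 + 0.1*(11 - 10) = 10.1
under the first rule and (10 + 11)/2 = 10.5 under the second. Whether the
built-in commands agree is what the last four lines are meant to check.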

________________________________________
From: owner-statalist@hsphsun2.harvard.edu
[owner-statalist@hsphsun2.harvard.edu] On Behalf Of Martin Weiss
[martin.weiss1@gmx.de]
Sent: Friday, July 16, 2010 11:39 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Why does -centile- provide different results compared to
-sum- or      -pctile-?

Example 1 in the manual entry for -centile- acknowledges the difference between
-summarize- and -centile- quite openly: "summarize produces somewhat different
results from centile".


HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tiago V. Pereira
Sent: Freitag, 16. Juli 2010 20:33
To: statalist@hsphsun2.harvard.edu
Subject: st: Why does -centile- provide different results compared to -sum-
or -pctile-?

Dear statalister,

Am I misinterpreting the output, or is there something wrong with
-centile-? For example, the value for the 10th percentile differs between
-centile- (89.60001) and -sum- and -pctile- (90.66666).

Cheers!

Tiago



(Stata 9 and 10.1 provide identical results).

*/ -------- start do ---------
clear
input X
223
104.3333
198.3333
126
95
101.3333
133
90.66666
38
91
101
36.5
101
203
92.5
149
142.3333
89.33334
105.6667
123.3333
136
240.6667
162.3333
98.66666
181.6667
261
206
251
339.3333
219.3333
114
end

sum X, detail

centile X , centile(1 5 10 25 50 75 90 95 99)

_pctile X, nq(10)

dis "P10 = " r(r1)
dis "P90 = " r(r9)

*/ ---------end do -----------------

My outputs:


. sum X, detail

                              X
-------------------------------------------------------------
      Percentiles      Smallest
 1%         36.5           36.5
 5%           38             38
10%     90.66666       89.33334       Obs                  31
25%     98.66666       90.66666       Sum of Wgt.          31

50%          126                      Mean            146.914
                        Largest       Std. Dev.       69.5917
75%          203       240.6667
90%     240.6667            251       Variance       4843.005
95%          261            261       Skewness       .7895009
99%     339.3333       339.3333       Kurtosis       3.214935

.
. centile X , centile(1 5 10 25 50 75 90 95 99)

                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
           X |      31          1          36.5            36.5    57.78294*
             |                  5          37.4            36.5    90.95187*
             |                 10      89.60001            36.5    95.97249*
             |                 25      98.66666        90.32367     107.523
             |                 50           126        101.1658     172.048
             |                 75           203        140.9225    243.3249
             |                 90      248.9333        205.2043    339.3333*
             |                 95      292.3333        225.5506    339.3333*
             |                 99      339.3333        257.1462    339.3333*

* Lower (upper) confidence limit held at minimum (maximum) of sample

.
. _pctile X, nq(10)

.
. dis "P10 = " r(r1)
P10 = 90.666656

. dis "P90 = " r(r9)
P90 = 240.6667
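
The gap at the 10th percentile in the output above can be checked with a few
lines of arithmetic. This is only a sketch, assuming -centile- interpolates at
position (n+1)*p/100 and -summarize- / -_pctile- use position n*p/100 (the
defaults as I read the manuals). The scalar names are made up for illustration.
The values 89.33334 and 90.66666 are the 3rd and 4th smallest observations in
the data listed above.

```stata
* Arithmetic check for n = 31, p = 10 (sketch under the stated assumptions)
scalar pos_centile = (31 + 1) * 10 / 100
* position 3.2 lies 0.2 of the way from the 3rd to the 4th smallest value
scalar p10_centile = 89.33334 + (pos_centile - 3) * (90.66666 - 89.33334)
scalar pos_summ = 31 * 10 / 100
* position 3.1 is not an integer, so the 4th smallest value (90.66666) is used

display "centile position:   " pos_centile
display "centile P10:        " p10_centile
display "summarize position: " pos_summ
```

The interpolated value works out to about 89.600004, in line with the 89.60001
that -centile- reports, while the 4th smallest value, 90.66666, is what -sum-
and -_pctile- report.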


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

