# Re: st: pctile, xtile methods

 From khigbee@stata.com To statalist@hsphsun2.harvard.edu Subject Re: st: pctile, xtile methods Date Mon, 09 Jun 2003 17:50:59 -0500

```Patrick Joly <Joly.Patrick@ic.gc.ca> points out what I think is a
problem in -pctile-.  He talks about both -pctile- and -xtile-
and indicates the problem is with -xtile-.  However, -xtile-
calls through to -pctile- and so I believe addressing the problem
in -pctile- will resolve the problems he mentions.

I am looking into a solution for the problem.  Those of you not
interested in the details can stop reading.

The following code and listing as prompted by Patrick's email,
illustrates the problem.

. set obs 100
. gen nn = _n
. pctile x_p = nn, nq(100) genp(p)
. list

+-----------------+
|  nn    x_p    p |
|-----------------|
1. |   1    1.5    1 |
2. |   2    2.5    2 |
3. |   3    3.5    3 |
4. |   4    4.5    4 |
5. |   5    5.5    5 |
|-----------------|
6. |   6    6.5    6 |
7. |   7      8    7 |
8. |   8    8.5    8 |
9. |   9    9.5    9 |
10. |  10   10.5   10 |

<cut>

52. |  52   52.5   52 |
53. |  53   53.5   53 |
54. |  54   54.5   54 |
55. |  55     56   55 |
|-----------------|
56. |  56     57   56 |
57. |  57     57   57 |
58. |  58     58   58 |
59. |  59   59.5   59 |
60. |  60   60.5   60 |

<cut>

I looked at all 100 lines of output, and it appears that -pctile-
is not doing what it should for observations 7, 55, 56, 57, and
58.  The others were okay.

-pctile-, which is implemented in ado code, calls to -_pctile-,
which is internal code, to do the final work.  -_pctile- has a
limit of 20 percentiles allowed in any one call.  -pctile- makes
multiple calls into -_pctile- as needed to get all the
percentiles of interest.

Here is what -_pctile- produces for these problem observations
(and some of those around it).

. _pctile nn , p(5 6 7 54 55 56 57 58 59)
. ret list

scalars:
r(r1) =  5.5
r(r2) =  6.5
r(r3) =  7.5
r(r4) =  54.5
r(r5) =  55.5
r(r6) =  56.5
r(r7) =  57.5
r(r8) =  58.5
r(r9) =  59.5

Notice that -_pctile- gets what we expect.  The problem is not
with -_pctile-.

I looked at pctile.ado and I know what the problem is.  The lines

local pct = 100*`i'/`nquanti'
if "`plist'"=="" { local plist "`pct'" }
else               local plist "`plist',`pct'"

are building up a list of up to 20 percentiles that will then be
passed into the -percentiles()- option of -_pctile-.

The local macro that is holding the list does not contain double
precision numbers.  Instead of calling something like

_pctile nn , percentiles(7)

pctile ends up calling something like

_pctile nn , percentiles(7.000000000000001)

My first couple of attempts at a patch for this problem have not
been successful.  I tried

tempname prct
...

scalar `prct' = 100*`i'/`nquanti'
if "`plist'"=="" { local plist "`prct'" }
else               local plist "`plist',`prct'"

and also

tempname prct
...

scalar `prct' = 100*`i'/`nquanti'
local pct : di %22.20f `prct'
if "`plist'"=="" { local plist "`pct'" }
else               local plist "`plist',`pct'"

etc., but it did not solve the problem.  I will investigate other
solutions.  When I find one, I will apply it in an update to
Stata.

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```