# st: Quantile question

From: "Dedman, Dan"; Subject: st: Quantile question; Date: Fri, 29 Feb 2008 15:40:35 -0000

```
We want to agree on a method for producing quantiles so we are all
working to the same algorithm. I was intrigued by the way Stata does it
and wondered where this came from and what the justification is.

If we have 150 observations to be grouped into quintiles - this is easy.
But what if we had 151, or 152, 153 or 154 observations?

This is how Stata 9 does it using -xtile- (in the tables below, each
column is one sample size and the "All" row gives the total N):

xtile newvar=rank, nquantiles(5)

----------------------------
q1	31	31	31	31
q2	30	30	31	31
q3	30	31	30	31
q4	30	30	31	31
q5	30	30	30	30
----------------------------
All	151	152	153	154

and using the -cut- function from -egen- :

egen q2=cut(rank), group(5)

----------------------------
q0	30	30	30	30
q1	30	30	31	31
q2	30	31	30	31
q3	30	30	31	31
q4	31	31	31	31
----------------------------
All	151	152	153	154

So the two methods work in opposite directions, but are otherwise
consistent in where they place the 'extra' 1 to 4 observations.

I am quite happy to adopt the Stata approach, but some of my colleagues do not
use Stata, so I would like to describe how the Stata algorithm works,
and why Stata does it this way as opposed to any other way. Is this
a general convention, or easier to justify statistically or
otherwise, or just a case of finding a way that works and sticking with it?

Many thanks

Daniel Dedman
Public Health Information Analyst/Project Manager
North West Public Health Observatory

```
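
For colleagues who do not use Stata, here is a minimal Python sketch of one rule that reproduces both tables in the message. It is not taken from Stata's source. It assumes the cutpoints are the percentiles 100*k/5 computed as follows: with P = N*k/5, use the average of the P-th and (P+1)-th ordered values when P is a whole number, and the ceil(P)-th ordered value otherwise. It further assumes that the -xtile- counts correspond to groups closed at the upper cutpoint, and the -egen cut()- counts to groups closed at the lower cutpoint. The function names `cutpoints` and `group_sizes` are hypothetical, and the data are untied ranks 1..N as in the message.

```python
def cutpoints(sorted_values, nq):
    """Return the nq-1 cutpoints under the assumed percentile rule."""
    n = len(sorted_values)
    cuts = []
    for k in range(1, nq):
        if (n * k) % nq == 0:
            # P = n*k/nq is a whole number: average the P-th and (P+1)-th values.
            p = n * k // nq
            cuts.append((sorted_values[p - 1] + sorted_values[p]) / 2)
        else:
            # Otherwise take the ceil(P)-th value (integer ceiling division).
            p = -(-(n * k) // nq)
            cuts.append(sorted_values[p - 1])
    return cuts


def group_sizes(n, nq, upper_closed):
    """Count observations per group for ranks 1..n.

    upper_closed=True  : group g holds cut[g-1] <  x <= cut[g]  (matches the -xtile- table)
    upper_closed=False : group g holds cut[g-1] <= x <  cut[g]  (matches the -egen cut()- table)
    """
    values = list(range(1, n + 1))
    cuts = cutpoints(values, nq)
    sizes = [0] * nq
    for x in values:
        if upper_closed:
            g = sum(1 for c in cuts if c < x)
        else:
            g = sum(1 for c in cuts if c <= x)
        sizes[g] += 1
    return sizes


if __name__ == "__main__":
    for n in range(150, 155):
        print(n, group_sizes(n, 5, True), group_sizes(n, 5, False))
```

Under these assumptions the two sets of counts differ only in which side of each cutpoint is included, which would explain why the two commands appear to work in opposite directions. With tied values the group sizes can differ from the ones shown here.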