
# Re: st: how to force cutpoint in xtile

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to force cutpoint in xtile
Date: Tue, 26 Jun 2012 00:36:21 +0100

```
The Stata version you are using is immaterial here.

The over-arching problem (for you) is that -xtile- will not split
observed values and that it declares a boundary when the appropriate
cumulative percents (here 20(20)80 %) have been passed. With these
data that bites as very unequal class frequencies.

What you can do, given that algorithm, is negate the variable and apply
-xtile- going the other way:

. input value freq

value       freq
1.              11          11
2.              12           4
3.              13          17
4.              14          37
5.              15           7
6.              16          27
7.              17          13
8.              18           5
9.              19          14
10.              20          11
11.              21          23
12.              27          16
13. end

. expand freq
(173 observations created)

. gen negvalue = -value

. xtile nQ5 = negvalue, nq(5)

. tab nQ5

5 quantiles |
of negvalue |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         39       21.08       21.08
          2 |         43       23.24       44.32
          3 |         34       18.38       62.70
          4 |         37       20.00       82.70
          5 |         32       17.30      100.00
------------+-----------------------------------
      Total |        185      100.00

. xtile Q5 = value, nq(5)

. tab Q5

5 quantiles |
   of value |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         69       37.30       37.30
          2 |          7        3.78       41.08
          3 |         40       21.62       62.70
          4 |         53       28.65       91.35
          5 |         16        8.65      100.00
------------+-----------------------------------
      Total |        185      100.00

But why are you doing this? The data already take only a small number
of discrete values. Quintiles force 21 and 27 together, which
underlines that you are throwing away important detail.

Nick

On Mon, Jun 25, 2012 at 10:21 PM, Skiles, Martha Priedeman
<skiles@live.unc.edu> wrote:

> I've used -xtile- in Stata 11 successfully, but am having difficulty with it in Stata 12.  I have the following variable "S0D0_links" which I'd like to quintile (5 groups), but the -xtile- function is not creating groups where I would expect.  Per below, I expected the first quintile to break at 17.3 cumulative percent rather than 37.3.  Can I force the cutpoint to be either closest to my 20/40/60/80/100 quintiles or always <20/<40/<60/etc?
> I am able to force it by using "cumul" to generate a cumulative percent, and then write code using "ceil(5*cumpercent)" but I hope there's a better option.  My preference is to have the cutpoint create quintiles as close to 20/40/60/etc as possible.
>
> Thank you,
> Martha Skiles
>
> LOG:
>
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          11 |         11        5.95        5.95
>          12 |          4        2.16        8.11
>          13 |         17        9.19       17.30
>          14 |         37       20.00       37.30
>          15 |          7        3.78       41.08
>          16 |         27       14.59       55.68
>          17 |         13        7.03       62.70
>          18 |          5        2.70       65.41
>          19 |         14        7.57       72.97
>          20 |         11        5.95       78.92
>          21 |         23       12.43       91.35
>          27 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>
>
> . tab Q5
>
> 5 quantiles |
>          of |
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         69       37.30       37.30
>           2 |          7        3.78       41.08
>           3 |         40       21.62       62.70
>           4 |         53       28.65       91.35
>           5 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>

```
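
As a cross-check on the tables in the post, here is a minimal sketch in Python (not Stata, and not -xtile-'s actual implementation) of the rule as Nick describes it: tied values are never split, and a class is closed at the first distinct value whose cumulative percent reaches the next target. Walking the same table in reverse order corresponds to applying the rule to the negated variable. The function names `pass_rule_counts` and `nearest_rule_counts` are made up for this illustration, and the second function is only one possible reading of the "as close to 20/40/60/80 as possible" preference in the question.

```python
from itertools import accumulate

# (value, freq) pairs from the thread, in ascending order of value
DATA = [(11, 11), (12, 4), (13, 17), (14, 37), (15, 7), (16, 27),
        (17, 13), (18, 5), (19, 14), (20, 11), (21, 23), (27, 16)]


def pass_rule_counts(table, nq):
    """Class frequencies when a class closes at the first distinct value
    whose cumulative share reaches the next target k/nq.

    `table` is walked in the order given; tied values are never split.
    """
    total = sum(freq for _, freq in table)
    counts = [0] * nq
    k = 0    # current class, 0-based
    cum = 0  # cumulative frequency so far
    for _, freq in table:
        counts[k] += freq
        cum += freq
        # integer form of cum/total >= (k+1)/nq, avoiding float comparisons
        while k < nq - 1 and cum * nq >= (k + 1) * total:
            k += 1
    return counts


def nearest_rule_counts(table, nq):
    """Class frequencies when each boundary is placed after the distinct
    value whose cumulative share is closest to the target k/nq.

    This is an assumed alternative rule, not something -xtile- does.
    """
    total = sum(freq for _, freq in table)
    cums = list(accumulate(freq for _, freq in table))
    # index of the last distinct value in each of the first nq-1 classes
    cuts = [min(range(len(cums)), key=lambda i: abs(cums[i] * nq - k * total))
            for k in range(1, nq)]
    counts = [0] * nq
    for i, (_, freq) in enumerate(table):
        cls = sum(1 for c in cuts if c < i)  # boundaries strictly below i
        counts[cls] += freq
    return counts


if __name__ == "__main__":
    print(pass_rule_counts(DATA, 5))        # ascending: matches tab Q5
    print(pass_rule_counts(DATA[::-1], 5))  # descending: matches tab nQ5
    print(nearest_rule_counts(DATA, 5))     # boundaries nearest 20/40/60/80
```

On the data in this thread, the first call gives 69, 7, 40, 53, 16 and the second gives 39, 43, 34, 37, 32, agreeing with the two -tab- outputs above. The nearest-boundary rule gives 32, 44, 40, 30, 39, with cumulative percents of 17.30, 41.08, 62.70 and 78.92 at the four boundaries. Nick's caveat still applies to all three: 21 and 27 end up in the same class whichever rule is used.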