Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: how to force cutpoint in xtile

 From "Skiles, Martha Priedeman" <[email protected]> To "[email protected]" <[email protected]> Subject RE: st: how to force cutpoint in xtile Date Tue, 26 Jun 2012 03:10:04 +0000

```Thank you Nick.
I was hoping that the xtile command would set the cutpoint at the closest break to the 20% rather than pass the 20% and then choose the closest. It sounds like that is not an option; rather, I need to choose between quintiling as-is or reversing the order.

Per your question about why I'd want to quintile, this is just a very small part of my output that I need in quintiles in order to compare relative (rather than absolute) positions. The "value" itself has no readily interpretable meaning; rather, it is more helpful to think about relative groups and how that classification of quintile changes from one data run to another.

I appreciate your taking the time to respond.

Regards,
Martha

________________________________________
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: Monday, June 25, 2012 7:36 PM
To: [email protected]
Subject: Re: st: how to force cutpoint in xtile

The Stata version you are using is immaterial here.

The over-arching problem (for you) is that -xtile- will not split
observed values and that it declares a boundary when the appropriate
cumulative percents (here 20(20)80 %) have been passed. With these
data that bites as very unequal class frequencies.

What you can do, given that algorithm, is negate the variable and apply
-xtile- going the other way:

. input value freq

value       freq
1.              11          11
2.              12           4
3.              13          17
4.              14          37
5.              15           7
6.              16          27
7.              17          13
8.              18           5
9.              19          14
10.              20          11
11.              21          23
12.              27          16
13. end

. expand freq
(173 observations created)

. gen negvalue = -value

. xtile nQ5 = negvalue, nq(5)

. tab nQ5

5 quantiles |
of negvalue |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         39       21.08       21.08
          2 |         43       23.24       44.32
          3 |         34       18.38       62.70
          4 |         37       20.00       82.70
          5 |         32       17.30      100.00
------------+-----------------------------------
      Total |        185      100.00

. xtile Q5 = value, nq(5)

. tab Q5

5 quantiles |
   of value |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         69       37.30       37.30
          2 |          7        3.78       41.08
          3 |         40       21.62       62.70
          4 |         53       28.65       91.35
          5 |         16        8.65      100.00
------------+-----------------------------------
      Total |        185      100.00

But why are you doing this? The data are already in a small number
of discrete values. Quintiles force 21 and 27 together, which
underlines that you are throwing away important detail.

Nick

On Mon, Jun 25, 2012 at 10:21 PM, Skiles, Martha Priedeman
<[email protected]> wrote:

> I've used -xtile- in Stata 11 successfully, but am having difficulty with it in Stata 12.  I have the following variable "S0D0_links" which I'd like to quintile (5 groups), but the -xtile- function is not creating groups where I would expect.  Per below, I expected the first quintile to break at 17.3 cumulative percent rather than 37.3.  Can I force the cutpoint to be either closest to my 20/40/60/80/100 quintiles or always <20/<40/<60/etc?
> I am able to force it by using "cumul" to generate a cumulative percent, and then write code using "ceil(5*cumpercent)" but I hope there's a better option.  My preference is to have the cutpoint create quintiles as close to 20/40/60/etc as possible.
>
> Thank you,
> Martha Skiles
>
> LOG:
>
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          11 |         11        5.95        5.95
>          12 |          4        2.16        8.11
>          13 |         17        9.19       17.30
>          14 |         37       20.00       37.30
>          15 |          7        3.78       41.08
>          16 |         27       14.59       55.68
>          17 |         13        7.03       62.70
>          18 |          5        2.70       65.41
>          19 |         14        7.57       72.97
>          20 |         11        5.95       78.92
>          21 |         23       12.43       91.35
>          27 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>
>
> . tab Q5
>
> 5 quantiles |
>          of |
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         69       37.30       37.30
>           2 |          7        3.78       41.08
>           3 |         40       21.62       62.70
>           4 |         53       28.65       91.35
>           5 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
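The thread asks for cutpoints "as close to 20/40/60/etc as possible", which -xtile- does not offer. One possible way to get them by hand is sketched below. It is not from either poster and is only an illustration: the variable names `cp`, `Q5below`, `Q5near`, `d` and the scalars `dmin` and `cut` are made up here. It re-enters the data from the thread, builds cumulative proportions with -cumul-, and then applies two rules: the `ceil()` rule mentioned in the original question (never pass a target), and a nearest-cutpoint rule.

```stata
* Sketch only: cp, Q5below, Q5near, d, dmin and cut are illustrative names.
clear
input value freq
11 11
12  4
13 17
14 37
15  7
16 27
17 13
18  5
19 14
20 11
21 23
27 16
end
expand freq

* Cumulative proportion; -equal- gives tied values the same (highest) figure
cumul value, generate(cp) equal

* Rule A: never pass 20/40/60/80 (the ceil() idea from the original post)
generate byte Q5below = ceil(5 * cp)

* Rule B: break at the observed cumulative proportion nearest each target
generate byte Q5near = 1
foreach p in 0.2 0.4 0.6 0.8 {
    generate double d = abs(cp - `p')
    summarize d, meanonly
    scalar dmin = r(min)
    * if two candidate breaks are equally near, this keeps the higher one
    summarize cp if d == scalar(dmin), meanonly
    scalar cut = r(max)
    replace Q5near = Q5near + 1 if cp > scalar(cut)
    drop d
}

tabulate Q5below
tabulate Q5near
```

With these data, Rule A should give class frequencies of 32, 37, 34, 43 and 39 (the mirror image of the negated -xtile- result above), and Rule B should give 32, 44, 40, 30 and 39, with breaks after 13, 15, 17 and 20. Note that Rule B can leave a class empty if two targets happen to share the same nearest break, so it is worth checking the tabulation before relying on the grouping.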