Re: st: how to force cutpoint in xtile


From   Nick Cox <njcoxstata@GMAIL.COM>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how to force cutpoint in xtile
Date   Tue, 26 Jun 2012 00:36:21 +0100

The Stata version you are using is immaterial here.

The over-arching problem (for you) is that -xtile- will not split
observations with the same value across classes, and that it declares
a boundary only once the appropriate cumulative percent (here
20(20)80 %) has been reached or passed. With these data that bites:
the class frequencies come out very unequal.

What you can do, given that algorithm, is negate the variable and apply
-xtile- going the other way:

. input value freq

         value       freq
  1.              11          11
  2.              12           4
  3.              13          17
  4.              14          37
  5.              15           7
  6.              16          27
  7.              17          13
  8.              18           5
  9.              19          14
 10.              20          11
 11.              21          23
 12.              27          16
 13. end

. expand freq
(173 observations created)

. gen negvalue = -value

. xtile nQ5 = negvalue, nq(5)

. tab nQ5

5 quantiles |
of negvalue |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         39       21.08       21.08
          2 |         43       23.24       44.32
          3 |         34       18.38       62.70
          4 |         37       20.00       82.70
          5 |         32       17.30      100.00
------------+-----------------------------------
      Total |        185      100.00
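
If the classes should keep the original orientation (1 = lowest
values), they can be flipped back afterwards. A minimal sketch, not
part of the session above; Q5flip is just a made-up name:

. * reverse the class labels so that 1 again holds the lowest values
. gen Q5flip = 6 - nQ5

. tab Q5flip

Going by the table above, the frequencies should simply come out in
reverse order: 32, 37, 34, 43, 39.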

. xtile Q5 = value, nq(5)

. tab Q5

5 quantiles |
   of value |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         69       37.30       37.30
          2 |          7        3.78       41.08
          3 |         40       21.62       62.70
          4 |         53       28.65       91.35
          5 |         16        8.65      100.00
------------+-----------------------------------
      Total |        185      100.00
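
To see where the boundaries fall, -_pctile- reports the quantiles
that, as far as I know, -xtile- builds its classes on. Again a sketch,
not part of the session above:

. * the four quintile boundaries are left in r(r1) ... r(r4)
. _pctile value, nq(5)

. return list

With these data the boundaries should be 14, 15, 17 and 21, which is
consistent with the frequencies in -tab Q5-: everything up to and
including 14 lands in class 1, which is why that class holds 69
observations.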

But why are you doing this? The data already take on only a small
number of distinct values. The more nearly equal classes in nQ5 force
21 and 27 together, which underlines that you are throwing away
important detail.
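
If you still want particular boundaries, it may be simplest to state
them outright rather than coax -xtile- into them. A sketch using
irecode(); the cutpoints 13, 14, 16 and 20 are my reading of the
frequency table, not anything -xtile- chose, and Q5cut is a made-up
name:

. * classes: 1 = up to 13, 2 = 14, 3 = 15-16, 4 = 17-20, 5 = above 20
. gen Q5cut = 1 + irecode(value, 13, 14, 16, 20)

. tab Q5cut

That should give the frequencies 32, 37, 34, 43, 39, i.e. the nQ5
classes in reverse order.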

Nick

On Mon, Jun 25, 2012 at 10:21 PM, Skiles, Martha Priedeman
<skiles@live.unc.edu> wrote:

> I've used -xtile- in Stata 11 successfully, but am having difficulty with it in Stata 12.  I have the following variable "S0D0_links" which I'd like to quintile (5 groups), but the -xtile- function is not creating groups where I would expect.  Per below, I expected the first quintile to break at 17.3 cumulative percent rather than 37.3.  Can I force the cutpoint to be either closest to my 20/40/60/80/100 quintiles or always <20/<40/<60/etc?
> I am able to force it by using "cumul" to generate a cumulative percent, and then write code using "ceil(5*cumpercent)" but I hope there's a better option.  My preference is to have the cutpoint create quintiles as close to 20/40/60/etc as possible.
>
> Thank you,
> Martha Skiles
>
> LOG:
> tab S0D0_links
>
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          11 |         11        5.95        5.95
>          12 |          4        2.16        8.11
>          13 |         17        9.19       17.30
>          14 |         37       20.00       37.30
>          15 |          7        3.78       41.08
>          16 |         27       14.59       55.68
>          17 |         13        7.03       62.70
>          18 |          5        2.70       65.41
>          19 |         14        7.57       72.97
>          20 |         11        5.95       78.92
>          21 |         23       12.43       91.35
>          27 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>
> . xtile Q5=S0D0_links, nq(5)
>
> . tab Q5
>
> 5 quantiles |
>          of |
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         69       37.30       37.30
>           2 |          7        3.78       41.08
>           3 |         40       21.62       62.70
>           4 |         53       28.65       91.35
>           5 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>
