
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: how to force cutpoint in xtile


From   "Skiles, Martha Priedeman" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: how to force cutpoint in xtile
Date   Tue, 26 Jun 2012 03:10:04 +0000

Thank you Nick.  
I was hoping that the -xtile- command would set each cutpoint at the break closest to 20%, rather than passing 20% and then choosing the closest break.  It sounds as though that is not an option, so I need to choose between quintiling as-is or reversing the order.
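
For reference, a "closest break" rule of the kind described here could be sketched roughly as follows. This is only an illustration, not something -xtile- offers: it assumes the expanded data from Nick's example below (one observation per case in the variable value), and the names cumprop, dist, Q5near, dmin and cut are hypothetical.

```stata
* Sketch only: for each target 20/40/60/80%, break after the observed value
* whose cumulative proportion is nearest that target.
* Assumes -expand freq- has already been run, so value has one row per case.
* cumprop, dist, Q5near, dmin and cut are hypothetical names.
cumul value, gen(cumprop) equal
gen byte Q5near = 1
gen double dist = .
forvalues k = 1/4 {
    * distance of each observation's cumulative proportion from target k/5
    replace dist = abs(cumprop - `k'/5)
    summarize dist, meanonly
    scalar dmin = r(min)
    * smallest value attaining the minimum distance becomes the cutpoint
    summarize value if dist == scalar(dmin), meanonly
    scalar cut = r(min)
    * everything above the cutpoint moves up one group
    replace Q5near = Q5near + 1 if value > scalar(cut)
}
drop dist
tab Q5near
```

With the frequencies in this thread, that rule should put the breaks after 13, 15, 17 and 20, giving groups of 32, 44, 40, 30 and 39 observations. If two targets pick the same value, one group will be empty, so the result is worth checking with -tab- each time.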

Per your question about why I'd want to quintile, this is just a very small part of my output that I need in quintiles in order to compare relative (rather than absolute) positions.  The "value" itself has no readily interpretable meaning; it is more helpful to think about relative groups and how the quintile classification changes from one data run to another.

I appreciate your taking the time to respond.

Regards,
Martha

________________________________________
From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
Sent: Monday, June 25, 2012 7:36 PM
To: [email protected]
Subject: Re: st: how to force cutpoint in xtile

The Stata version you are using is immaterial here.

The over-arching problem (for you) is that -xtile- will not split
observed values and that it declares a boundary when the appropriate
cumulative percents (here 20(20)80 %) have been passed. With these
data that bites as very unequal class frequencies.

What you can do, given that algorithm, is negate the variable and apply
-xtile- going the other way:

. input value freq

         value       freq
  1.              11          11
  2.              12           4
  3.              13          17
  4.              14          37
  5.              15           7
  6.              16          27
  7.              17          13
  8.              18           5
  9.              19          14
 10.              20          11
 11.              21          23
 12.              27          16
 13. end

. expand freq
(173 observations created)

. gen negvalue = -value

. xtile nQ5 = negvalue, nq(5)

. tab nQ5

5 quantiles |
of negvalue |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         39       21.08       21.08
          2 |         43       23.24       44.32
          3 |         34       18.38       62.70
          4 |         37       20.00       82.70
          5 |         32       17.30      100.00
------------+-----------------------------------
      Total |        185      100.00

. xtile Q5 = value, nq(5)

. tab Q5

5 quantiles |
   of value |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         69       37.30       37.30
          2 |          7        3.78       41.08
          3 |         40       21.62       62.70
          4 |         53       28.65       91.35
          5 |         16        8.65      100.00
------------+-----------------------------------
      Total |        185      100.00
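
To see where those boundaries come from, one option is to look at the quantiles themselves. The sketch below assumes the expanded data above are still in memory and uses -_pctile-, which leaves the cutpoints in r(); it is offered as a check, not as part of the original session.

```stata
* Sketch: display the quintile cutpoints behind the -xtile- grouping.
* Assumes value holds one observation per case (after -expand freq-).
_pctile value, nq(5)
return list
* With these data the cutpoints should be r(r1) = 14, r(r2) = 15,
* r(r3) = 17 and r(r4) = 21: the 37th and 38th ordered observations
* are both 14, so every observation with value <= 14 lands in group 1.
```

That is why group 1 holds 69 observations: the 20% point falls inside the block of 37 observations with value 14, and -xtile- keeps that block together.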

But why are you doing this? The data are already in a small number
of discrete values. Quintiles force 21 and 27 together, which
underlines that you are throwing away important detail.

Nick

On Mon, Jun 25, 2012 at 10:21 PM, Skiles, Martha Priedeman
<[email protected]> wrote:

> I've used -xtile- in Stata 11 successfully, but am having difficulty with it in Stata 12.  I have the following variable "S0D0_links" which I'd like to quintile (5 groups), but the -xtile- function is not creating groups where I would expect.  Per below, I expected the first quintile to break at 17.3 cumulative percent rather than 37.3.  Can I force the cutpoint to be either closest to my 20/40/60/80/100 quintiles or always <20/<40/<60/etc?
> I am able to force it by using "cumul" to generate a cumulative percent, and then write code using "ceil(5*cumpercent)" but I hope there's a better option.  My preference is to have the cutpoint create quintiles as close to 20/40/60/etc as possible.
>
> Thank you,
> Martha Skiles
>
> LOG:
> tab S0D0_links
>
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          11 |         11        5.95        5.95
>          12 |          4        2.16        8.11
>          13 |         17        9.19       17.30
>          14 |         37       20.00       37.30
>          15 |          7        3.78       41.08
>          16 |         27       14.59       55.68
>          17 |         13        7.03       62.70
>          18 |          5        2.70       65.41
>          19 |         14        7.57       72.97
>          20 |         11        5.95       78.92
>          21 |         23       12.43       91.35
>          27 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>
> . xtile Q5=S0D0_links, nq(5)
>
> . tab Q5
>
> 5 quantiles |
>          of |
> S0D0_links |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         69       37.30       37.30
>           2 |          7        3.78       41.08
>           3 |         40       21.62       62.70
>           4 |         53       28.65       91.35
>           5 |         16        8.65      100.00
> ------------+-----------------------------------
>       Total |        185      100.00
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




