
Re: st: Sums and means for each decile


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Sums and means for each decile
Date   Thu, 11 Oct 2012 18:41:47 +0100

Depends what you call programming. But you could do it with -pctile-
and -xtile-. N.B. this example revises how ties at quantile values are
handled (the comparison is now <= rather than <), so that my results
tally with what -pctile- and -xtile- do.

. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

.  _pctile age, p(10(10)90)

.  ret li

scalars:
                 r(r1) =  21
                 r(r2) =  23
                 r(r3) =  24
                 r(r4) =  26
                 r(r5) =  28
                 r(r6) =  31
                 r(r7) =  33
                 r(r8) =  36
                 r(r9) =  38

.  gen decile = 10

.  qui forval i = 9(-1)1 {
           replace decile = `i' if age <= r(r`i')
   }

.  tabstat age, by(decile) s(n sum mean)

Summary for variables: age
     by categories of: decile

  decile |         N       sum      mean
---------+------------------------------
       1 |      4122     81228  19.70597
       2 |      3062     68968  22.52384
       3 |      1636     39264        24
       4 |      2980     75914   25.4745
       5 |      2567     70559  27.48695
       6 |      3614    108371  29.98644
       7 |      2357     76669  32.52821
       8 |      3543    123948  34.98391
       9 |      1824     68356  37.47588
      10 |      2805    114799  40.92656
---------+------------------------------
   Total |     28510    828076  29.04511
----------------------------------------

.
.  pctile page=age, nq(10)

.  l page if page < .

       +------+
       | page |
       |------|
    1. |   21 |
    2. |   23 |
    3. |   24 |
    4. |   26 |
    5. |   28 |
       |------|
    6. |   31 |
    7. |   33 |
    8. |   36 |
    9. |   38 |
       +------+

.  xtile decile2=age, cutp(page)

.  tabstat age, by(decile2) s(n sum mean)

Summary for variables: age
     by categories of: decile2 (age categorized by page)

 decile2 |         N       sum      mean
---------+------------------------------
       1 |      4122     81228  19.70597
       2 |      3062     68968  22.52384
       3 |      1636     39264        24
       4 |      2980     75914   25.4745
       5 |      2567     70559  27.48695
       6 |      3614    108371  29.98644
       7 |      2357     76669  32.52821
       8 |      3543    123948  34.98391
       9 |      1824     68356  37.47588
      10 |      2805    114799  40.92656
---------+------------------------------
   Total |     28510    828076  29.04511
----------------------------------------

Here is the code in one piece:

webuse nlswork, clear

* approach 1
 _pctile age, p(10(10)90)
 ret li
 gen decile = 10 if age < .
 qui forval i = 9(-1)1 {
         replace decile = `i' if age <= r(r`i')
 }
 tabstat age, by(decile) s(n sum mean)

* approach 2
 pctile page=age, nq(10)
 l page if page < .
 xtile decile2=age, cutp(page)
 tabstat age, by(decile2) s(n sum mean)
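
A quick way to confirm that the two approaches classify every observation identically is sketched below. It assumes that -decile- and -decile2- have both been created by the code above; nothing else is needed.

```stata
* cross-tabulate the two classifications;
* if they agree, all counts lie on the diagonal
tab decile decile2

* stop with an error if any observation with nonmissing age
* is classified differently by the two approaches
assert decile == decile2 if age < .
```

If -assert- returns silently, the two variables agree wherever age is not missing.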

I like the first approach because I get to choose my quantiles. If I
didn't want equally spaced quantiles, I could specify that.
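
As a sketch of that flexibility, the same loop works with unequally spaced percentiles. The cutpoints (25, 50, 90) and the variable name -agegroup- are illustrative choices only, not anything from the thread; -ln_wage- is another variable in the same dataset, used here to show sums and means of a second variable within the same classes.

```stata
webuse nlswork, clear

* unequally spaced cutpoints: the 25th, 50th and 90th percentiles of age
_pctile age, p(25 50 90)

* agegroup is a hypothetical name; start with the top class for
* all observations with nonmissing age, then work downwards
gen agegroup = 4 if age < .
qui forval i = 3(-1)1 {
        replace agegroup = `i' if age <= r(r`i')
}

* summarize age itself, and a different variable, within those classes
tabstat age ln_wage, by(agegroup) s(n sum mean)
```

With k percentiles the starting class is k + 1 and the loop runs from k down to 1.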

Nick

On Thu, Oct 11, 2012 at 4:54 PM, Charles Vellutini
<[email protected]> wrote:
> So it does take a little bit of programming - but I agree that the use of -tabstat- with the -by()- option is very convenient.
> Many thanks, it works perfectly!
> Charles
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On behalf of Nick Cox
> Sent: Thursday, 11 October 2012 17:42
> To: [email protected]
> Subject: Re: st: Sums and means for each decile
>
> This would be safer
>
> gen decile = 10 if !missing(age)
>
> On Thu, Oct 11, 2012 at 4:39 PM, Nick Cox <[email protected]> wrote:
>> Here's one way to do it.  I find this more direct -- and more flexible
>> -- than what the manual seems to imply you should do, but I could
>> easily be missing something.
>>
>> webuse nlswork, clear
>> _pctile age, p(10(10)90)
>> ret li
>> gen decile = 10
>> qui forval i = 9(-1)1 {
>>         replace decile = `i' if age < r(r`i')
>> }
>> tabstat age, by(decile) s(n sum mean)
>>
>>
>> On Thu, Oct 11, 2012 at 4:24 PM, Charles Vellutini
>> <[email protected]> wrote:
>>> Thanks Nick and sorry for the lack of clarity.
>>>
>>> -decile- does not exist, I meant -centile-, my mistake.
>>>
>>> What I want are sums and means of the same variable that was used to determine the percentiles -- but now that you mention it, it would be nice to have that on other variables too!
>>>
>>> Thanks,
>>> Charles
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On behalf of Nick Cox
>>> Sent: Thursday, 11 October 2012 17:15
>>> To: [email protected]
>>> Subject: Re: st: Sums and means for each decile
>>>
>>> What's -decile-?
>>>
>>> You're right about -pctile-. But what do you want?
>>>
>>> sums, means of some variable y in classes determined by selected percentiles of another variable x?
>>>
>>> sums, means of a variable x in classes determined by selected percentiles of the same variable x?
>>>
>>> On Thu, Oct 11, 2012 at 4:09 PM, Charles Vellutini <[email protected]> wrote:
>>>
>>>> I have looked at -decile- and -pctile- but neither provides the sums and means of each decile/percentile (only the cutoffs), if I am not mistaken.
>>>>
>>>> One could of course recover the cutoffs from -decile- and program this by hand. There is also the excellent -glcurve- available from SSC but this produces graphs; it creates variables that, yes, could also be used to compute the sums I suppose.
>>>>
>>>> But I thought that someone surely programmed this already?
