
# Re: st: Sums and means for each decile

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Sums and means for each decile
Date: Thu, 11 Oct 2012 18:41:47 +0100

```Depends what you call programming. But you could do it with -pctile-
and -xtile-. N.B. this example revises what is done with ties at
quantile values, so that what I did tallies with what -pctile- and
-xtile- will do.

. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

.  _pctile age, p(10(10)90)

.  ret li

scalars:
r(r1) =  21
r(r2) =  23
r(r3) =  24
r(r4) =  26
r(r5) =  28
r(r6) =  31
r(r7) =  33
r(r8) =  36
r(r9) =  38

.  gen decile = 10

.  qui forval i = 9(-1)1 {
  2. replace decile = `i' if age <= r(r`i')
  3. }

.  tabstat age, by(decile) s(n sum mean)

Summary for variables: age
by categories of: decile

  decile |         N       sum      mean
---------+------------------------------
       1 |      4122     81228  19.70597
       2 |      3062     68968  22.52384
       3 |      1636     39264        24
       4 |      2980     75914   25.4745
       5 |      2567     70559  27.48695
       6 |      3614    108371  29.98644
       7 |      2357     76669  32.52821
       8 |      3543    123948  34.98391
       9 |      1824     68356  37.47588
      10 |      2805    114799  40.92656
---------+------------------------------
   Total |     28510    828076  29.04511
----------------------------------------

.
.  pctile page=age, nq(10)

.  l page if page < .

     +------+
     | page |
     |------|
  1. |   21 |
  2. |   23 |
  3. |   24 |
  4. |   26 |
  5. |   28 |
     |------|
  6. |   31 |
  7. |   33 |
  8. |   36 |
  9. |   38 |
     +------+

.  xtile decile2=age, cutp(page)

.  tabstat age, by(decile2) s(n sum mean)

Summary for variables: age
by categories of: decile2 (age categorized by page)

 decile2 |         N       sum      mean
---------+------------------------------
       1 |      4122     81228  19.70597
       2 |      3062     68968  22.52384
       3 |      1636     39264        24
       4 |      2980     75914   25.4745
       5 |      2567     70559  27.48695
       6 |      3614    108371  29.98644
       7 |      2357     76669  32.52821
       8 |      3543    123948  34.98391
       9 |      1824     68356  37.47588
      10 |      2805    114799  40.92656
---------+------------------------------
   Total |     28510    828076  29.04511
----------------------------------------

Here is the code in one place:

webuse nlswork, clear

* approach 1
_pctile age, p(10(10)90)
ret li
gen decile = 10 if age < .
qui forval i = 9(-1)1 {
replace decile = `i' if age <= r(r`i')
}
tabstat age, by(decile) s(n sum mean)

* approach 2
pctile page=age, nq(10)
l page if page < .
xtile decile2=age, cutp(page)
tabstat age, by(decile2) s(n sum mean)

I like the first approach because I get to choose my quantiles. If I
didn't want equally spaced quantiles, I could specify that.

Nick
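
* A minimal sketch of approach 1 with unequally spaced quantiles.
* The percentile list (5 25 50 75 95) and the variable name -qclass-
* are illustrative assumptions, not part of the example above.
webuse nlswork, clear
_pctile age, p(5 25 50 75 95)
* five cut points give six classes; start everyone in the top class
gen qclass = 6 if age < .
qui forval i = 5(-1)1 {
replace qclass = `i' if age <= r(r`i')
}
tabstat age, by(qclass) s(n sum mean)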

On Thu, Oct 11, 2012 at 4:54 PM, Charles Vellutini
<charles.vellutini@ecopa.com> wrote:
> So it does take a little bit of programming - but I agree that the use of -tabstat- with the -by()- option is very convenient.
> Many thanks, it works perfectly!
> Charles
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Thursday, October 11, 2012 17:42
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Sums and means for each decile
>
> This would be safer
>
> gen decile = 10 if !missing(age)
>
> On Thu, Oct 11, 2012 at 4:39 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Here's one way to do it.  I find this more direct -- and more flexible
>> -- than what the manual seems to imply you should do, but I could
>> easily be missing something.
>>
>> webuse nlswork, clear
>> _pctile age, p(10(10)90)
>> ret li
>> gen decile = 10
>> qui forval i = 9(-1)1 {
>>         replace decile = `i' if age < r(r`i')
>> }
>> tabstat age, by(decile) s(n sum mean)
>>
>>
>> On Thu, Oct 11, 2012 at 4:24 PM, Charles Vellutini
>> <charles.vellutini@ecopa.com> wrote:
>>> Thanks Nick and sorry for the lack of clarity.
>>>
>>> -decile- does not exist, I meant -centile-, my mistake.
>>>
>>> What I want are sums and means of the same variable that was used to determine the percentiles -- but now that you mention it, it would be nice to have that on other variables too!
>>>
>>> Thanks,
>>> Charles
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>>> Sent: Thursday, October 11, 2012 17:15
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: Sums and means for each decile
>>>
>>> What's -decile-?
>>>
>>> You're right about -pctile-. But what do you want?
>>>
>>> sums, means of some variable y in classes determined by selected percentiles of another variable x?
>>>
>>> sums, means of a variable x in classes determined by selected percentiles of the same variable x?
>>>
>>> On Thu, Oct 11, 2012 at 4:09 PM, Charles Vellutini <charles.vellutini@ecopa.com> wrote:
>>>
>>>> I have looked at -decile- and -pctile- but neither provides the sums and means of each decile/percentile (only the cutoffs), if I am not mistaken.
>>>>
>>>> One could of course recover the cutoffs from -decile- and program this by hand. There is also the excellent -glcurve- available from SSC but this produces graphs; it creates variables that, yes, could also be used to compute the sums I suppose.
>>>>
>>>> But I thought that someone surely programmed this already?

```
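
The thread also raises the case of sums and means of one variable within classes defined by the deciles of another. A minimal sketch of that case follows. It assumes the same `nlswork` dataset and uses its `ln_wage` variable as the summarized variable; the names `agedecile`, `wage_sum`, and `wage_mean` are illustrative and do not appear in the original posts. It also uses `xtile` with `nq(10)` directly, which should give the same classes as the `pctile` plus `xtile, cutp()` route shown in the thread.

```stata
webuse nlswork, clear

* decile classes of age (missing age gives a missing class)
xtile agedecile = age, nq(10)

* count, sum, and mean of a different variable within those classes
tabstat ln_wage, by(agedecile) s(n sum mean)

* the same sums and means stored as variables, one value per class
egen wage_sum  = total(ln_wage), by(agedecile)
egen wage_mean = mean(ln_wage), by(agedecile)
```

Storing the results with `egen` is optional; it is only useful if the class sums and means are needed in later calculations rather than just displayed.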