# AW: st: RE: Decile sorts

 From "Thomas Erdmann" <[email protected]> To <[email protected]> Subject AW: st: RE: Decile sorts Date Fri, 10 Nov 2006 11:02:54 +0100

```
A further note on Jeph's suggestion:

It looks very convenient, but I need to adjust for the fact that I do not
need the mean of the same item but of a different attribute:

foreach X of varlist c1* {
xtile deciles_`X'=`X', n(10)
bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
}

But a problem still remains:
the deciles are calculated over all observations, but what I need is
to calculate the decile means by yrm (my time variable representing
YearMonth) and afterwards the mean of each decile group (1-10) over all
yrm's. I was not able to integrate this into this short solution, as -by- is
not allowed for -xtile-.

-Tom

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Jeph Herrin
Sent: Friday, 10 November 2006 01:26
To: [email protected]
Subject: Re: st: RE: Decile sorts

Oops, don't forget to drop -deciles-

foreach X of varlist c1* {
xtile deciles=`X', n(10)
bys deciles: egen R`X'=mean(`X')
drop deciles
}

Jeph Herrin wrote:
> Maybe I'm missing something, but why not:
>
> foreach X of varlist c1* {
>    xtile deciles=`X', n(10)
>    bys deciles: egen R`X'=mean(`X')
> }
>
> ?
>
> hth,
> Jeph
>
>
> Nick Cox wrote:
>> Various comments sprinkled here and there. You may have
>> strong reasons to use these decile bins, but binning strikes me as,
>> usually, at best a means towards an end (or perhaps ends towards some
>> means). Some nonparametric regression might do more justice to the data.
>>
>> Also, you are mixing two naming conventions 1...10 and 10...90. Just
>> use one.
>>
>> Nick
>> [email protected]
>>
>> Thomas Erdmann
>>
>>> I am trying to sort my observations into deciles according to one
>>> attribute
>>> and afterwards calculating the average of another attribute of those
>>> ten groups.
>>
>>> Please find the code I came up with below [lines with ... are
>>> omitted], yrm is the time variable (YearMonth)
>>>
>>> (1) As far as I can tell it works out, but a) it's a lot of code,
>>> b) it produces a lot of variables and c) generating the output is rather
>>> awkward.
>>>
>>> Could you give me hints on how to implement a smarter solution or if
>>> there
>>> are any errors in the way the calculation is carried out currently?
>>
>>> *** Generate Percentiles
>>> sort yrm
>>>     foreach X of varlist c1* {
>>>     by yrm: egen p10_`X'= pctile(`X'), p(10.0)
>>>     by yrm: egen p20_`X'= pctile(`X'), p(20.0)
>>>     by yrm: egen p30_`X'= pctile(`X'), p(30.0)
>>>     ...
>>>     by yrm: egen p90_`X'= pctile(`X'), p(90.0)
>>>     }
>>
>> This is two loops rolled out into one.
>>     sort yrm
>>     foreach X of varlist c1* {
>>         forval i = 10(10)90 {
>>             by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
>>         }
>>     }
>>
>>> *** Sort into Percentile groups
>>>     foreach X of varlist c1* {
>>>     gen G_`X'=1 if `X'<p10_`X' & `X'~=.
>>>     replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
>>>     ...
>>>     replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
>>>     replace G_`X'=10 if `X'>p90_`X' & `X'~=.
>>>     }
>>
>> Similar story with boundary conditions.
>>     foreach X of varlist c1* {
>>         gen byte G_`X' = `X' < p10_`X'
>>         forval i = 2/9 {
>>             local j = 10 * `i'
>>             replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
>>         }
>>         replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
>>     }
>>
>>
>>> *** Calculate return mean for each group
>>> sort yrm
>>>     foreach X of varlist G* {
>>>     by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
>>>     by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
>>>     ...
>>>     by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
>>>     by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
>>>     }
>>
>> Why do you need all these variables? The results for the bins are disjoint,
>> so they can be put in a single variable.
>>     foreach X of varlist G* {
>>         bysort yrm `X' : egen R`X' = mean(c1ds_ri)
>>     }
>> Having said that, it can probably be done more directly with a series of
>> -collapse-s.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/support/faqs/res/findit.html
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>>
```
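
The open question in the thread is how to get deciles within each yrm when -xtile- does not allow -by-, and then average each decile group over all months. One possible workaround, offered only as a sketch, is to run -xtile- once per month with an -if- qualifier and then use two -collapse-s, along the lines Nick hints at. The toy data, the sorting variable `c1size`, and the names `dec_c1size` and `tmp_dec` are assumptions made for illustration. Only `yrm` and `c1ds_ri` come from the thread.

```stata
* Toy panel standing in for the thread's data; every value is made up.
clear
set seed 2006
set obs 600
gen long yrm = 200601 + mod(_n - 1, 6)   // six hypothetical months
gen double c1size = runiform()           // hypothetical sorting attribute
gen double c1ds_ri = rnormal()           // return variable named in the thread

* -xtile- does not allow -by-, so assign deciles one month at a time.
levelsof yrm, local(months)
gen byte dec_c1size = .
foreach m of local months {
    xtile tmp_dec = c1size if yrm == `m', nq(10)
    replace dec_c1size = tmp_dec if yrm == `m'
    drop tmp_dec
}

* Observations with a missing sorting value get no decile; leave them out.
drop if missing(dec_c1size)

* First the mean return per month and decile, then the mean of those
* monthly means per decile.
collapse (mean) c1ds_ri, by(yrm dec_c1size)
collapse (mean) c1ds_ri, by(dec_c1size)
list, clean
```

Because -collapse- replaces the data in memory, wrapping the last steps in -preserve- and -restore- inside a `foreach X of varlist c1*` loop would be one way to repeat this for several sorting variables.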