I guess you're onto something good, except
that the second time around the loop -deciles-
already exists, so -xtile- will fail. So this needs
a tweak, depending on whether -deciles- is dispensable.
Nick
n.j.cox@durham.ac.uk
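
A minimal sketch of the two possible tweaks to the loop quoted below, depending on whether the decile bins are wanted afterwards. The toy data and the names c1a, c1b, D_ and RD_ are assumptions made only so the example runs on its own; they are not from the thread. Note that -clear- discards whatever data are in memory.

```stata
* Toy data (hypothetical), only so that the sketch is self-contained
clear
set obs 200
set seed 2006
gen c1a = runiform()
gen c1b = rnormal()

* Tweak 1: -deciles- is dispensable, so hold the bins in a temporary variable
foreach X of varlist c1* {
    tempvar deciles
    xtile `deciles' = `X', nquantiles(10)
    bysort `deciles': egen R`X' = mean(`X')
    drop `deciles'
}

* Tweak 2: the bins are worth keeping, so give each attribute its own bin variable
foreach X of varlist c1* {
    xtile D_`X' = `X', nquantiles(10)
    bysort D_`X': egen RD_`X' = mean(`X')
}
```

Either loop alone should be enough; both are shown here under different variable names only so that they can be run one after the other.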
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Jeph Herrin
> Sent: 09 November 2006 23:46
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: Decile sorts
>
>
> Maybe I'm missing something, but why not:
>
> foreach X of varlist c1* {
> xtile deciles=`X', n(10)
> bys deciles: egen R`X'=mean(`X')
> }
>
> ?
>
> hth,
> Jeph
>
>
> Nick Cox wrote:
> > Various comments sprinkled here and there. You may have
> > strong reasons to use these decile bins, but binning
> > strikes me as, usually, at best a means towards an end
> > (or perhaps ends towards some means). Some nonparametric
> > regression might do more justice to the data.
> >
> > Also, you are mixing two naming conventions 1...10
> > and 10...90. Just use one.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Thomas Erdmann
> >
> >> I am trying to sort my observations into deciles according to
> >> one attribute
> >> and afterwards calculating the average of another attribute
> >> of those ten groups.
> >
> >> Please find the code I came up with below [lines with ... are
> >> omitted], yrm is the time variable (YearMonth)
> >>
> >> (1) As far as I can tell it works out, but a) it's a lot of code and
> >> b) produces a lot of variables and c) generating the output is
> >> rather awkward.
> >>
> >> Could you give me hints on how to implement a smarter
> >> solution or if there
> >> are any errors in the way the calculation is carried out currently?
> >
> >> *** Generate Percentiles
> >> sort yrm
> >> foreach X of varlist c1* {
> >> by yrm: egen p10_`X'= pctile(`X'), p(10.0)
> >> by yrm: egen p20_`X'= pctile(`X'), p(20.0)
> >> by yrm: egen p30_`X'= pctile(`X'), p(30.0)
> >> ...
> >> by yrm: egen p90_`X'= pctile(`X'), p(90.0)
> >> }
> >
> > This is two loops rolled out into one.
> >
> > sort yrm
> > foreach X of varlist c1* {
> > forval i = 10(10)90 {
> > by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
> > }
> > }
> >
> >
> >> *** Sort into Percentile groups
> >> foreach X of varlist c1* {
> >> gen G_`X'=1 if `X'<p10_`X' & `X'~=.
> >> replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
> >> ...
> >> replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
> >> replace G_`X'=10 if `X'>p90_`X' & `X'~=.
> >> }
> >
> > Similar story with boundary conditions.
> >
> > foreach X of varlist c1* {
> > gen byte G_`X' = `X' < p10_`X'
> >
> > forval i = 2/9 {
> > local j = 10 * `i'
> > replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
> > }
> >
> > replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
> > }
> >
> >
> >> *** Calculate return mean for each group
> >> sort yrm
> >> foreach X of varlist G* {
> >> by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
> >> by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
> >> ...
> >> by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
> >> by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
> >> }
> >
> > Why do you need all these variables? The results
> > for the bins are disjoint, so they can be put in a
> > single variable.
> >
> > foreach X of varlist G* {
> > bysort yrm `X' : egen R`X' = mean(c1ds_ri)
> > }
> >
> > Having said that, it can probably be done more
> > directly with a series of -collapse-s.
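
One way the -collapse- route might look, as a sketch only: for each bin variable, reduce the data to one mean of c1ds_ri per month and bin, list it, and then restore the full dataset. The toy data and the bin variables G_c1a and G_c1b are assumptions made so that the example runs on its own; in the thread the G_* variables come from the earlier binning step. Note that -clear- discards whatever data are in memory.

```stata
* Toy data (hypothetical), only so that the sketch is self-contained
clear
set obs 200
set seed 2006
gen yrm = 1 + mod(_n, 2)
gen c1ds_ri = rnormal()
gen byte G_c1a = ceil(10 * runiform())
gen byte G_c1b = ceil(10 * runiform())

* One -collapse- per bin variable: mean return by month and bin
foreach X of varlist G_* {
    preserve
    collapse (mean) R`X' = c1ds_ri, by(yrm `X')
    drop if missing(`X')
    list, sepby(yrm)
    restore
}
```

This leaves the original data untouched and produces one small table per attribute, rather than extra variables in the main dataset.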
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >
> >