-reshape- should be useful.
If I understand your dataset correctly, it is in long form, along these lines:
clear
input str4 var decile mean
var1 1 2
var1 2 5
var1 3 7
var2 1 4
var2 2 8
var2 3 9
end
reshape wide mean, i(var) j(decile)
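To get from per-observation data to that long layout, one option might be to loop over the sort attributes, -collapse- inside -preserve-/-restore-, and stack the pieces with -append-. What follows is only a minimal sketch on made-up data: a1, a2 and ret are hypothetical stand-ins for the c1* attributes and c1ds_ri, and the deciles are formed over all observations purely for brevity.

```stata
clear
set seed 12345
set obs 200
gen ret = rnormal()              // hypothetical return (stands in for c1ds_ri)
gen a1  = runiform()             // hypothetical sort attributes
gen a2  = runiform()

tempfile results
local first = 1
foreach X of varlist a1 a2 {
    xtile G_`X' = `X', n(10)     // decile number 1 to 10
    preserve
    collapse (mean) ret, by(G_`X')   // one row per decile
    rename G_`X' decile
    rename ret mean
    gen str8 var = "`X'"             // remember which attribute this is
    if `first' == 0 {
        append using `results'       // stack onto earlier attributes
    }
    save `results', replace
    local first = 0
    restore                          // bring the full data back
}

use `results', clear
reshape wide mean, i(var) j(decile)
list
```

Because each -collapse- runs between -preserve- and -restore-, the other variables are still there for the next pass of the loop.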
HTH,
Philipp
Thomas Erdmann wrote:
Thanks for the further suggestion using -levelsof- ; I will go through it
tonight.
Based on the output produced, I now have two types of variables:
(1) R* for each variable containing the mean return per decile
(2) G* for each variable containing the decile number 1 to 10
Basically I would like to produce a table like this (where the figures in
the table represent the mean returns of the deciles per variable):
        1     2     3    ...   10
Var1   1.2   1.5   1.6   ...   2.3
Var2   0.9   0.7   0.6   ...   0.3
Varx
...
Varn
But somehow I don't manage to summarize the data in a convenient way.
Obviously the loop below does not work, because after the first -collapse-
all other variables are gone.
foreach X of varlist c1* {
sort G_`X'
collapse (mean) RG_`X', by(G_`X')
}
Please excuse me if this is very basic stuff, but I would appreciate a
short hint. Thanks.
- Tom
-----Original Message-----
From: statalist-owner@hsphsun2.harvard.edu
[mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Jeph Herrin
Sent: Friday, 10 November 2006 14:28
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: RE: Decile sorts
So, using -levelsof- per Philipp's suggestion:
levelsof yrm, local(l)
foreach X of varlist c1* {
    gen dec_`X'=.
    foreach YRM in `l' {
        xtile deciles=`X' if yrm==`YRM', n(10)
        replace dec_`X'=deciles if yrm==`YRM'
        drop deciles
    }
    bys dec_`X': egen Rr`X'=mean(c1ds_ri)
}
maybe?
jeph
Thomas Erdmann wrote:
A further note on Jeph's suggestion:
It looks very convenient, but I need to adjust for the fact that I do not
need the mean of the same item but of a different attribute:
foreach X of varlist c1* {
xtile deciles_`X'=`X', n(10)
bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
}
But a problem still remains:
the deciles are calculated over all observations, whereas what I need is
to calculate the decile means by yrm (my time variable representing
YearMonth) and afterwards the mean of each decile group (1-10) over all
yrm's. I was not able to integrate this into this short solution, as -by-
is not allowed with -xtile-.
-Tom
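One way around the -by- restriction that avoids a loop over periods might be to build the groups from within-period ranks. This is only a sketch on made-up data (yrm, x and ret here are hypothetical toy variables). Note that it is a swapped-in technique, not -xtile-: rank-based groups can differ from -xtile-'s percentile cutpoints when there are ties or when the group size is not a multiple of 10.

```stata
clear
set seed 2006
set obs 60
gen yrm = ceil(_n / 20)          // three hypothetical periods, 20 obs each
gen x   = runiform()             // hypothetical sort attribute
gen ret = rnormal()              // hypothetical return

bysort yrm: egen r = rank(x)     // rank of x within each period
bysort yrm: egen n = count(x)    // nonmissing x within each period
gen byte dec_x = ceil(10 * r / n)    // rank-based decile group 1 to 10

tabulate yrm dec_x
```

Observations with missing x get a missing rank and therefore a missing group.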
-----Original Message-----
From: statalist-owner@hsphsun2.harvard.edu
[mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Jeph Herrin
Sent: Friday, 10 November 2006 01:26
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Decile sorts
Oops, don't forget to drop -deciles-
foreach X of varlist c1* {
xtile deciles=`X', n(10)
bys deciles: egen R`X'=mean(`X')
drop deciles
}
Jeph Herrin wrote:
Maybe I'm missing something, but why not:
foreach X of varlist c1* {
xtile deciles=`X', n(10)
bys deciles: egen R`X'=mean(`X')
}
?
hth,
Jeph
Nick Cox wrote:
Various comments sprinkled here and there. You may have strong reasons
to use these decile bins, but binning strikes me as, usually, at best a
means towards an end (or perhaps ends towards some means). Some
nonparametric regression might do more justice to the data.
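As one illustration of what such a smooth could look like, here is a small sketch using -lowess- on made-up data. The variables x and ret are hypothetical, and -lowess- is just one choice of smoother among several.

```stata
clear
set seed 1
set obs 500
gen x   = runiform()             // hypothetical sort attribute
gen ret = 2 * x + rnormal()      // hypothetical return, noisy in x

* smooth ret on x; keep the fitted values, skip the graph
lowess ret x, generate(ret_smooth) nograph
summarize ret ret_smooth
```

The smoothed values vary continuously with x, so no cutpoints need to be chosen.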
Also, you are mixing two naming conventions 1...10 and 10...90. Just
use one.
Nick n.j.cox@durham.ac.uk
Thomas Erdmann wrote:
I am trying to sort my observations into deciles according to one
attribute and afterwards calculate the average of another attribute for
those ten groups.
Please find the code I came up with below [lines with ... are
omitted]; yrm is the time variable (YearMonth).
(1) As far as I can tell it works, but a) it's a lot of code, b) it
produces a lot of variables, and c) generating the output is rather
awkward.
Could you give me hints on how to implement a smarter solution, or tell
me whether there are any errors in the way the calculation is currently
carried out?
*** Generate Percentiles
sort yrm
foreach X of varlist c1* {
by yrm: egen p10_`X'= pctile(`X'), p(10.0)
by yrm: egen p20_`X'= pctile(`X'), p(20.0)
by yrm: egen p30_`X'= pctile(`X'), p(30.0)
...
by yrm: egen p90_`X'= pctile(`X'), p(90.0)
}
This is two loops rolled out into one.
sort yrm
foreach X of varlist c1* {
    forval i = 10(10)90 {
        by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
    }
}
*** Sort into Percentile groups
foreach X of varlist c1* {
gen G_`X'=1 if `X'<p10_`X' & `X'~=.
replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
...
replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
replace G_`X'=10 if `X'>p90_`X' & `X'~=.
}
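Similar story, but watch the boundary conditions: with strict inequalities
on both sides, a value exactly equal to a percentile falls into no group.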
foreach X of varlist c1* {
    gen byte G_`X' = `X' < p10_`X'
    forval i = 2/9 {
        local j = 10 * `i'
        replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
    }
    replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
}
*** Calculate return mean for each group
sort yrm
foreach X of varlist G* {
by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
...
by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
}
Why do you need all these variables? The results for the bins are
disjoint, so they can be put in a single variable.
foreach X of varlist G* {
    bysort yrm `X' : egen R`X' = mean(c1ds_ri)
}
Having said that, it can probably be done more directly with a series of
-collapse-s.
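A rough sketch of what such a series of -collapse-s might look like for a single grouping variable, on made-up data: yrm, ret and G_x are hypothetical stand-ins for the time variable, c1ds_ri and one of the G* variables, and the decile numbers are simply assigned in rotation rather than computed.

```stata
clear
set seed 42
set obs 600
gen yrm = ceil(_n / 100)         // six hypothetical periods
gen ret = rnormal()              // stands in for c1ds_ri
gen byte G_x = 1 + mod(_n, 10)   // stands in for a decile variable G_*

collapse (mean) ret, by(yrm G_x) // mean return per period and decile
collapse (mean) ret, by(G_x)     // mean of the period means per decile
list
```

The first -collapse- leaves one row per period and decile; the second averages those rows over periods, which gives each period equal weight.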
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/