Statalist The Stata Listserver



AW: AW: st: RE: Decile sorts: output


From   "Thomas Erdmann" <tom.erdmann@web.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: AW: st: RE: Decile sorts: output
Date   Fri, 10 Nov 2006 15:42:27 +0100

Thanks for the further suggestion using -levelsof- ; I will go through it
tonight. 

Based on the output produced I now have two types of variables: 
(1) R* for each variable containing the mean return per decile
(2) G* for each variable containing the decile number 1 to 10

Basically I would like to produce a table like this (where the figures in
the table represent the mean returns of the deciles per variable):

		1	2	3 	...	10
Var1		1.2	1.5	1.6	...	2.3
Var2		0.9	0.7	0.6	...	0.3
Varx		
...		
Varn	

But somehow I cannot find a convenient way to summarize the data.
Obviously the loop below does not work, because after the first -collapse-
all other variables are gone.

 	foreach X of varlist c1* {
	sort G_`X'
	collapse (mean) RG_`X', by(G_`X')
	}
 
Please excuse me if this is very basic stuff, but I would appreciate a short
hint. Thanks.

- Tom
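
One possible way around the problem described above, that -collapse- replaces the data in memory, is to wrap each -collapse- in -preserve- / -restore- and stack the small result sets in a temporary file. The following is only a sketch under assumptions: G_<var> is taken to hold the decile (1 to 10) for each sorting variable, c1ds_ri is the return to be averaged, and the means are first taken per yrm and decile and then averaged over all yrm's. The names ret, decile, sortvar and results are made up for illustration and do not appear in the thread.

```stata
* Sketch only: ret, decile, sortvar and results are illustrative names.
* Assumes G_<var> (decile 1-10) exists for every variable matched by c1*.
tempfile results
local first = 1
foreach X of varlist c1* {
    preserve
    * mean return per month and decile ...
    collapse (mean) ret = c1ds_ri, by(yrm G_`X')
    * ... then the average of those monthly means per decile
    collapse (mean) ret, by(G_`X')
    drop if missing(G_`X')
    rename G_`X' decile
    gen str32 sortvar = "`X'"
    * stack this variable's ten rows onto the rows saved so far
    if !`first' {
        append using "`results'"
    }
    save "`results'", replace
    local first = 0
    restore
}
* one row per sorting variable, one column (ret1-ret10) per decile
use "`results'", clear
reshape wide ret, i(sortvar) j(decile)
list, noobs
```

Because each -collapse- runs between -preserve- and -restore-, the original data are back in memory at the start of every pass through the loop; only the final -use- replaces them with the summary table.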




-----Original Message-----
From: statalist-owner@hsphsun2.harvard.edu
[mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Jeph Herrin
Sent: Friday, 10 November 2006 14:28
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: st: RE: Decile sorts

So, using -levelsof- per Philipp's suggestion:


levelsof yrm, local(l)
foreach X of varlist c1* {
	gen dec_`X'=.
	foreach YRM in `l' {
		xtile deciles=`X' if yrm==`YRM', n(10)
		replace dec_`X'=deciles if yrm==`YRM'
		drop deciles
	}
	bys dec_`X': egen Rr`X'=mean(c1ds_ri)
}

maybe?
jeph
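
As an alternative to looping over the levels of yrm, the deciles can probably be computed within each month in one step with the xtile() function for -egen- from the user-written egenmore package, which accepts a by() option. This is a sketch that assumes egenmore has been installed from SSC; it reuses the dec_<var> and Rr<var> naming from the loop above, and it takes the mean by yrm and decile, which is an assumption about what is wanted.

```stata
* Sketch only. Requires the user-written egenmore package:
*     ssc install egenmore
foreach X of varlist c1* {
    * deciles of `X' computed separately within each month
    egen dec_`X' = xtile(`X'), by(yrm) nquantiles(10)
    * mean return per month and decile, held in a single variable
    bysort yrm dec_`X': egen Rr`X' = mean(c1ds_ri)
}
```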


Thomas Erdmann wrote:
> A further note on Jeph's suggestion:
> 
> It looks very convenient, but I need to adjust for the fact that I do not
> need the mean of the same item but of a different attribute:
> 
> foreach X of varlist c1* {
> 	xtile deciles_`X'=`X', n(10)
> 	bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
> 	}
> 
> But a problem still remains: 
> the deciles are calculated over all observations, but what I need is to
> calculate the mean of each decile by yrm (my time variable representing
> YearMonth) and afterwards the mean of each decile group (1-10) over all
> yrm's. I was not able to integrate this into this short solution as -by- is
> not allowed for -xtile-.
> 
> -Tom
> 
> 
> 
>  
> 
> -----Original Message-----
> From: statalist-owner@hsphsun2.harvard.edu
> [mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Jeph Herrin
> Sent: Friday, 10 November 2006 01:26
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: Decile sorts
> 
> Oops, don't forget to drop -deciles-
> 
>   foreach X of varlist c1* {
>      xtile deciles=`X', n(10)
>      bys deciles: egen R`X'=mean(`X')
>      drop deciles
>   }
> 
> 
> 
> 
> 
> 
> Jeph Herrin wrote:
>> Maybe I'm missing something, but why not:
>>
>> foreach X of varlist c1* {
>>    xtile deciles=`X', n(10)
>>    bys deciles: egen R`X'=mean(`X')
>> }
>>
>> ?
>>
>> hth,
>> Jeph
>>
>>
>> Nick Cox wrote:
>>> Various comments sprinkled here and there. You may have
>>> strong reasons to use these decile bins, but binning strikes me as, 
>>> usually, at best a means towards an end (or perhaps ends towards some 
>>> means). Some nonparametric
>>> regression might do more justice to the data.
>>> Also, you are mixing two naming conventions 1...10 and 10...90. Just 
>>> use one.
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Thomas Erdmann
>>>  
>>>> I am trying to sort my observations into deciles according to one 
>>>> attribute and afterwards calculate the average of another attribute
>>>> for each of those ten groups. 
>>>  
>>>> Please find the code I came up with below [lines with ... are 
>>>> omitted], yrm is the time variable (YearMonth)
>>>>
>>>> (1) As far as I can tell it works out, but a) it's a lot of code,
>>>> b) it produces a lot of variables and c) generating the output is
>>>> rather awkward.
>>>>
>>>> Could you give me hints on how to implement a smarter solution or if 
>>>> there
>>>> are any errors in the way the calculation is carried out currently?
>>>  
>>>> *** Generate Percentiles
>>>> sort yrm    
>>>>     foreach X of varlist c1* {
>>>>     by yrm: egen p10_`X'= pctile(`X'), p(10.0)
>>>>     by yrm: egen p20_`X'= pctile(`X'), p(20.0)
>>>>     by yrm: egen p30_`X'= pctile(`X'), p(30.0)
>>>>     ...
>>>>     by yrm: egen p90_`X'= pctile(`X'), p(90.0)
>>>>     }
>>> This is two loops rolled out into one.
>>>     sort yrm
>>>     foreach X of varlist c1* {
>>>         forval i = 10(10)90 {
>>>             by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
>>>         }
>>>     }
>>>  
>>>> *** Sort into Percentile groups
>>>>     foreach X of varlist c1* {
>>>>     gen G_`X'=1 if `X'<p10_`X' & `X'~=.
>>>>     replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
>>>>     ...
>>>>     replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
>>>>     replace G_`X'=10 if `X'>p90_`X' & `X'~=.
>>>>     }
>>> Similar story with boundary conditions.
>>>     foreach X of varlist c1* {
>>>         gen byte G_`X' = `X' < p10_`X'
>>>         forval i = 2/9 {
>>>             local j = 10 * `i'
>>>             replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
>>>         }
>>>         replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
>>>     }
>>>
>>>  
>>>> *** Calculate return mean for each group
>>>> sort yrm
>>>>     foreach X of varlist G* {
>>>>     by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
>>>>     by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
>>>>     ...
>>>>     by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
>>>>     by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
>>>>     }
>>> Why do you need all these variables? The results for bin are disjoint, 
>>> so can be put in a single variable.
>>>     foreach X of varlist G* {
>>>         bysort yrm `X' : egen R`X' = mean(c1ds_ri)
>>>     }
>>> Having said that, it can probably be done more directly with a series of 
>>> -collapse-s.
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/support/faqs/res/findit.html
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>>




