Statalist The Stata Listserver



AW: st: RE: Decile sorts


From   "Thomas Erdmann" <tom.erdmann@web.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: RE: Decile sorts
Date   Fri, 10 Nov 2006 11:02:54 +0100

A further note on Jeph's suggestion:

It looks very convenient, but I need to adapt it: I do not need the mean of
the variable being sorted, but the mean of a different attribute (c1ds_ri):

foreach X of varlist c1* {
	xtile deciles_`X'=`X', n(10)
	bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
	}

But one problem remains: the deciles are calculated over all observations.
What I need is the mean of each decile group within each yrm (my time
variable, representing YearMonth) and afterwards, for each decile group
(1-10), the mean of those values over all yrm's. I was not able to integrate
this into the short solution, as -by- is not allowed with -xtile-.
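
A minimal sketch of one possible workaround, untested on the real data: run
-xtile- once per value of yrm and collect the results, then average in two
steps. It assumes yrm is numeric and that each decile should count once per
month in the final average. The names dec_`X', Rm_`X', Ravg_`X', tmp and one
are made up for illustration.

  levelsof yrm, local(months)
  foreach X of varlist c1* {
     * deciles of `X' computed separately within each month
     gen byte dec_`X' = .
     foreach m of local months {
        * capture skips a month where xtile fails (e.g. no nonmissing values)
        capture xtile tmp = `X' if yrm == `m', nq(10)
        if _rc == 0 {
           replace dec_`X' = tmp if yrm == `m'
           drop tmp
        }
     }
     * mean of c1ds_ri within each month-decile cell
     bysort yrm dec_`X': egen Rm_`X' = mean(c1ds_ri) if dec_`X' < .
     * tag one observation per cell so that each month enters once
     egen byte one = tag(yrm dec_`X')
     * average of the monthly cell means, per decile
     bysort dec_`X': egen Ravg_`X' = mean(cond(one, Rm_`X', .))
     drop one
  }

If only the table of results is wanted, two -collapse-s inside
-preserve-/-restore- might be shorter: first
-collapse (mean) c1ds_ri, by(yrm dec_`X')-, then
-collapse (mean) c1ds_ri, by(dec_`X')-.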

-Tom



 

-----Original Message-----
From: statalist-owner@hsphsun2.harvard.edu
[mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Jeph Herrin
Sent: Friday, 10 November 2006 01:26
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Decile sorts

Oops, don't forget to drop -deciles-

  foreach X of varlist c1* {
     xtile deciles=`X', n(10)
     bys deciles: egen R`X'=mean(`X')
     drop deciles
  }






Jeph Herrin wrote:
> Maybe I'm missing something, but why not:
> 
> foreach X of varlist c1* {
>    xtile deciles=`X', n(10)
>    bys deciles: egen R`X'=mean(`X')
> }
> 
> ?
> 
> hth,
> Jeph
> 
> 
> Nick Cox wrote:
>> Various comments sprinkled here and there. You may have strong
>> reasons to use these decile bins, but binning strikes me as, usually,
>> at best a means towards an end (or perhaps ends towards some means).
>> Some nonparametric regression might do more justice to the data.
>>
>> Also, you are mixing two naming conventions 1...10 and 10...90. Just
>> use one.
>>
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Thomas Erdmann
>>  
>>> I am trying to sort my observations into deciles according to one
>>> attribute and afterwards calculate the average of another attribute
>>> for each of those ten groups.
>>  
>>> Please find the code I came up with below [lines with ... are
>>> omitted]; yrm is the time variable (YearMonth).
>>>
>>> (1) As far as I can tell it works, but a) it's a lot of code, b) it
>>> produces a lot of variables, and c) generating the output is rather
>>> awkward.
>>>
>>> Could you give me hints on how to implement a smarter solution, or
>>> tell me whether there are any errors in the way the calculation is
>>> currently carried out?
>>  
>>> *** Generate Percentiles
>>> sort yrm    
>>>     foreach X of varlist c1* {
>>>     by yrm: egen p10_`X'= pctile(`X'), p(10.0)
>>>     by yrm: egen p20_`X'= pctile(`X'), p(20.0)
>>>     by yrm: egen p30_`X'= pctile(`X'), p(30.0)
>>>     ...
>>>     by yrm: egen p90_`X'= pctile(`X'), p(90.0)
>>>     }
>>
>> This is two loops rolled out into one.
>>     sort yrm
>>     foreach X of varlist c1* {
>>         forval i = 10(10)90 {
>>             by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
>>         }
>>     }
>>  
>>> *** Sort into Percentile groups
>>>     foreach X of varlist c1* {
>>>     gen G_`X'=1 if `X'<p10_`X' & `X'~=.
>>>     replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
>>>     ...
>>>     replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
>>>     replace G_`X'=10 if `X'>p90_`X' & `X'~=.
>>>     }
>>
>> Similar story with boundary conditions.
>>     foreach X of varlist c1* {
>>         gen byte G_`X' = `X' < p10_`X'
>>         forval i = 2/9 {
>>             local j = 10 * `i'
>>             replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
>>         }
>>         replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
>>     }
>>
>>  
>>> *** Calculate return mean for each group
>>> sort yrm
>>>     foreach X of varlist G* {
>>>     by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
>>>     by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
>>>     ...
>>>     by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
>>>     by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
>>>     }
>>
>> Why do you need all these variables? The results for the bins are
>> disjoint, so can be put in a single variable.
>>     foreach X of varlist G* {
>>         bysort yrm `X' : egen R`X' = mean(c1ds_ri)
>>     }
>> Having said that, it can probably be done more directly with a series
>> of -collapse-s.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/support/faqs/res/findit.html
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>>







