Statalist The Stata Listserver



Re: AW: AW: st: RE: Decile sorts: output


From   "Le Wang" <[email protected]>
To   [email protected]
Subject   Re: AW: AW: st: RE: Decile sorts: output
Date   Fri, 10 Nov 2006 15:51:52 -0600

I can only come up with an inefficient way of accomplishing this.
Hopefully it works.

Le

Let's assume that we have n variables

=====================================
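* Set n to the number of decile`i'/meanvar`i' pairs in your data
local n = 2

forvalues i = 1/`n' {
	preserve

	* keep one row per decile for variable i
	keep decile`i' meanvar`i'
	bysort decile`i': keep if _n == 1
	rename decile`i' decile
	sort decile
	tempfile f`i'
	save `f`i''

	restore
}

use `f1', clear

forvalues i = 2/`n' {
	* Stata 11 or later; older releases: merge decile using `f`i''
	merge 1:1 decile using `f`i''
	drop _merge
}

* rows become variables, columns become deciles
sort decile
xpose, clear
list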

========================================
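
A possible alternative, following Nick's remark further down the thread that
a series of -collapse-s may be more direct. This is only a sketch on simulated
data: the variable names c1a, c1b and ret, the six months, and the random
values are assumptions for illustration, not Tom's actual data. It assumes
Stata 10 or later (for rnormal()).

```stata
* Simulated data: 6 months (yrm), two sort variables, one return variable.
* All names and values here are hypothetical.
clear
set seed 12345
set obs 600
gen yrm = ceil(_n/100)
gen c1a = rnormal()
gen c1b = rnormal()
gen ret = rnormal()

tempfile results
local first = 1

foreach X of varlist c1a c1b {
	preserve

	* deciles of `X' computed separately within each month
	gen byte decile = .
	levelsof yrm, local(months)
	foreach m of local months {
		xtile tmp = `X' if yrm == `m', n(10)
		replace decile = tmp if yrm == `m'
		drop tmp
	}

	* mean return per month and decile, then averaged over months
	collapse (mean) ret, by(yrm decile)
	collapse (mean) ret, by(decile)

	* stack the results for all sort variables in one long file
	gen str32 var = "`X'"
	if `first' == 0 {
		append using `results'
	}
	save `results', replace
	local first = 0

	restore
}

* one row per sort variable, one column per decile (ret1 ... ret10)
use `results', clear
reshape wide ret, i(var) j(decile)
list
```

Because each pass works on a preserved copy, the original data are untouched
after the loop, which sidesteps the problem that -collapse- discards all other
variables.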


On 11/10/06, Thomas Erdmann <[email protected]> wrote:
To be more precise the data looks like:

               DecileVar1      MeanVar1        DecileVar2      MeanVar2
...
Obs1            1               0.2             1               0.5
Obs2            1               0.2             8               0.7
Obs3            4               0.6             8               0.7
...
Obsn


whereas it should look like the table indicated below.
- Tom


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Philipp Rehm
Sent: Friday, 10 November 2006 15:58
To: [email protected]
Subject: Re: AW: AW: st: RE: Decile sorts: output

-reshape- should be useful.

If I understand your data-set correctly, it is long, along these lines:

clear
input str4 var decile mean
var1 1 2
var1 2 5
var1 3 7
var2 1 4
var2 2 8
var2 3 9
end

reshape wide mean, i(var) j(decile)
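
A sketch of how data in the wide layout Tom describes (one decile/mean pair
per variable, repeated on every observation) might first be brought into this
long form. The variable names decile1, mean1, ... and the toy values are
assumptions for illustration.

```stata
* Toy data in the layout Tom describes; names and values are hypothetical.
clear
input decile1 mean1 decile2 mean2
1 0.2 1 0.5
1 0.2 8 0.7
4 0.6 8 0.7
end

* one row per observation and variable
gen long obs = _n
reshape long decile mean, i(obs) j(var)

* keep a single row per variable and decile
drop obs
duplicates drop

* forget the first reshape, then put deciles into columns
reshape clear
reshape wide mean, i(var) j(decile)
list
```

Only the deciles that actually occur in the data (here 1, 4 and 8) become
columns.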

HTH,
Philipp

Thomas Erdmann wrote:
> Thanks for the further suggestion using -levelsof- ; I will go through it
> tonight.
>
> Based on the output produced I have now two types of variables:
> (1) R* for each variable containing the mean return per decile
> (2) G* for each variable containing the decile number 1 to 10
>
> Basically I would like to produce a table like this (where the figures in
> the table represent the mean returns of the deciles per variable):
>
>               1       2       3       ...     10
> Var1          1.2     1.5     1.6     ...     2.3
> Var2          0.9     0.7     0.6     ...     0.3
> Varx
> ...
> Varn
>
> But somehow I don't manage to summarize the data in a convenient way;
> obviously this (below) does not work, as after -collapse- all other
> variables are gone.
>
>       foreach X of varlist c1* {
>       sort G_`X'
>       collapse (mean) RG_`X', by(G_`X')
>       }
>
> Please excuse if this is very basic stuff, but I would appreciate a short
> hint. Thanks.
>
> - Tom
>
>
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On behalf of Jeph Herrin
> Sent: Friday, 10 November 2006 14:28
> To: [email protected]
> Subject: Re: AW: st: RE: Decile sorts
>
> So, using -levelsof- per Philipp's suggestion:
>
>
> levelsof yrm, local(l)
> foreach X of varlist c1* {
>       gen dec_`X'=.
>       foreach YRM in `l' {
>               xtile deciles=`X' if yrm==`YRM', n(10)
>               replace dec_`X'=deciles if yrm==`YRM'
>               drop deciles
>       }
>       bys dec_`X': egen Rr`X'=mean(c1ds_ri)
> }
>
> maybe?
> jeph
>
>
> Thomas Erdmann wrote:
>> A further note on Jeph's suggestion:
>>
>> It looks very convenient, but I need to adjust for the fact that I do not
>> need the mean of the same item but of a different attribute:
>>
>> foreach X of varlist c1* {
>>      xtile deciles_`X'=`X', n(10)
>>      bysort deciles_`X': egen Rr`X'=mean(c1ds_ri)
>>      }
>>
>> But a problem still remains:
>> the deciles are calculated over all observations - but what I need is
>> calculating the mean of deciles by yrm (my time variable representing
>> YearMonth) and afterwards the mean of all deciles groups (1-10) over all
>> yrm's. I was not able to integrate this into this short solution as -by-
>> is not allowed for -xtile- .
>>
>> -Tom
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On behalf of Jeph Herrin
>> Sent: Friday, 10 November 2006 01:26
>> To: [email protected]
>> Subject: Re: st: RE: Decile sorts
>>
>> Oops, don't forget to drop -deciles-
>>
>>   foreach X of varlist c1* {
>>      xtile deciles=`X', n(10)
>>      bys deciles: egen R`X'=mean(`X')
>>      drop deciles
>>   }
>>
>>
>>
>>
>>
>>
>> Jeph Herrin wrote:
>>> Maybe I'm missing something, but why not:
>>>
>>> foreach X of varlist c1* {
>>>    xtile deciles=`X', n(10)
>>>    bys deciles: egen R`X'=mean(`X')
>>> }
>>>
>>> ?
>>>
>>> hth,
>>> Jeph
>>>
>>>
>>> Nick Cox wrote:
>>>> Various comments sprinkled here and there. You may have
>>>> strong reasons to use these decile bins, but binning strikes me as,
>>>> usually, at best a means towards an end (or perhaps ends towards some
>>>> means). Some nonparametric regression might do more justice to the data.
>>>>
>>>> Also, you are mixing two naming conventions 1...10 and 10...90. Just
>>>> use one.
>>>>
>>>> Nick
>>>> [email protected]
>>>>
>>>> Thomas Erdmann
>>>>
>>>>> I am trying to sort my observations into deciles according to one
>>>>> attribute
>>>>> and afterwards calculating the average of another attribute of those
>>>>> ten groups.
>>>>
>>>>> Please find the code I came up with below [lines with ... are
>>>>> omitted], yrm is the time variable (YearMonth)
>>>>>
>>>>> (1) As far as I can tell it works out, but a) it's a lot of code and
>>>>> b)produces a lot of variables and c)generating the output is rather
>>>>> awkward.
>>>>>
>>>>> Could you give me hints on how to implement a smarter solution or if
>>>>> there
>>>>> are any errors in the way the calculation is carried out currently?
>>>>
>>>>> *** Generate Percentiles
>>>>> sort yrm
>>>>>     foreach X of varlist c1* {
>>>>>     by yrm: egen p10_`X'= pctile(`X'), p(10.0)
>>>>>     by yrm: egen p20_`X'= pctile(`X'), p(20.0)
>>>>>     by yrm: egen p30_`X'= pctile(`X'), p(30.0)
>>>>>     ...
>>>>>     by yrm: egen p90_`X'= pctile(`X'), p(90.0)
>>>>>     }
>>>> This is two loops rolled out into one.
>>>>     sort yrm
>>>>     foreach X of varlist c1* {
>>>>         forval i = 10(10)90 {
>>>>             by yrm : egen p`i'_`X' = pctile(`X'), p(`i')
>>>>         }
>>>>     }
>>>>
>>>>> *** Sort into Percentile groups
>>>>>     foreach X of varlist c1* {
>>>>>     gen G_`X'=1 if `X'<p10_`X' & `X'~=.
>>>>>     replace G_`X'=2 if `X'>p10_`X' & `X'<p20_`X'
>>>>>     ...
>>>>>     replace G_`X'=9 if `X'>p80_`X' & `X'<p90_`X'
>>>>>     replace G_`X'=10 if `X'>p90_`X' & `X'~=.
>>>>>     }
>>>> Similar story with boundary conditions.
>>>>     foreach X of varlist c1* {
>>>>         gen byte G_`X' = `X' < p10_`X'
>>>>         forval i = 2/9 {
>>>>             local j = 10 * `i'
>>>>             replace G_`X' = `i' if `X' < p`j'_`X' & G_`X' == 0
>>>>         }
>>>>         replace G_`X' = cond(`X' == ., ., 10) if G_`X' == 0
>>>>     }
>>>>
>>>>
>>>>> *** Calculate return mean for each group
>>>>> sort yrm
>>>>>     foreach X of varlist G* {
>>>>>     by yrm: egen R1`X'= mean(c1ds_ri) if `X'==1
>>>>>     by yrm: egen R2`X'= mean(c1ds_ri) if `X'==2
>>>>>     ...
>>>>>     by yrm: egen R9`X'= mean(c1ds_ri) if `X'==9
>>>>>     by yrm: egen R10`X'= mean(c1ds_ri) if `X'==10
>>>>>     }
>>>> Why do you need all these variables? The results for bin are disjoint,
>>>> so can be put in a single variable.
>>>>     foreach X of varlist G* {
>>>>         bysort yrm `X' : egen R`X' = mean(c1ds_ri)
>>>>     }
>>>> Having said that, it can probably be done more directly with a series of
>>>> -collapse-s.
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/support/faqs/res/findit.html
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>>>
>>>>


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Le Wang, Ph.D.
Minnesota Population Center
University of Minnesota
(o) 612-624-5818
