Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.

<>

So, Lars, this is getting quite challenging, not least because we do not have your data. The code you posted seems to assume the presence of your data (what is "pricemsci", for instance?). The -reshape-ing we did a couple of hours ago is increasingly looking like a bit of a red herring, and it certainly does not help on the way to a solution for your problem.

Here is code that builds on our state of affairs before we -reshape-d. Note that I am creating a fake return in there, so the variances calculated earlier have no connection to it. Everything else would require substantial restructuring of the solution. The code generates three portfolios that contain "low", "middle" and "high" variance stocks:

***********
//create resultsfile
cap erase myfile.dta
di in red _rc
clear*
gen start=.
gen end=.
gen _stat_1=.
gen stock=.
gen str15 kindofreturn=""
save myfile, replace

//get "3105.dta"
clear*
//8 stocks
set obs 8
gen byte stock=_n
//5 time periods
expand 5
bys stock: gen byte time=_n
gen double exret=rnormal()
gen double msciret=rnormal()
gen double msftret=rnormal()
gen double appret=rnormal()
gen double geret=rnormal()
gen double pgret=rnormal()
gen double jnjret=rnormal()
gen double bpret=rnormal()
save 3105, replace

//-use- "3105"
u 3105, clear

//Return calculation
gen double grexret=ex[_n]/ex[_n-1]-1 if _n>1
gen double grmsciret=msci[_n]/msci[_n-1]-1 if _n>1
gen double grmsftret=msft[_n]/msft[_n-1]-1 if _n>1
gen double grappret=app[_n]/app[_n-1]-1 if _n>1
gen double grgeret=ge[_n]/ge[_n-1]-1 if _n>1
gen double grpgret=pg[_n]/pg[_n-1]-1 if _n>1
gen double grjnjret=jnj[_n]/jnj[_n-1]-1 if _n>1
gen double grbpret=bp[_n]/bp[_n-1]-1 if _n>1

//loop to get -rolling- results for each stock
//and each return
foreach ret in exret msciret msftret{
    //start inner loop
    su stock, mean
    qui forv i=1/`r(max)'{
        preserve
        keep if stock==`i'
        tsset time
        rolling r(Var), window(2) clear: su `ret'
        gen stock=`i'
        gen kindofreturn="`ret'"
        append using myfile
        save myfile, replace
        restore
    }
    //end inner loop
}

u myfile, clear
ren _stat_1 Variance
sort stock kindofreturn start

la def sto 1 "Firm 1" 2 "Firm 2" 3 "Firm 3" ///
    4 "Firm 4" 5 "Firm 5" 6 "Firm 6" 7 "Firm 7" ///
    8 "Firm 8"
la val stock sto

//get "low"/"middle"/"high" volatility portfolios
bys start kindofreturn (Variance): gen byte lowvar=_n<=3
bys start kindofreturn (Variance): gen byte middlevar=inlist(_n,4,5)
bys start kindofreturn (Variance): gen byte highvar=_n>5

l, h(30) noo sepby(start kindofreturn lowvar middlevar highvar)

//generate very fake return
gen myreturn=rnormal(.1,.05)

bys start kindofreturn (lowvar): ///
    egen averagereturnlow=mean(myreturn) if lowvar
bys start kindofreturn (middlevar): ///
    egen averagereturnmiddle=mean(myreturn) if middlevar
bys start kindofreturn (highvar): ///
    egen averagereturnhigh=mean(myreturn) if highvar

sort start kindofreturn lowvar middlevar highvar
l start Variance myreturn stock lowvar averagereturn*, ///
    noo sepby(start kindofreturn low middle high)
***********

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lars Knuth
Sent: Saturday, 5 June 2010 21:16
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Calculate variances of subsamples

Dear Statalisters,

I want to share the results; maybe someone will have the same problem to solve in the future. The input came almost completely from Martin. The program takes price data for a large number of stocks, calculates returns, and then calculates variances for every stock individually, many times, using a rolling window. The variances are in the right order to be compared in the cross-section. This is also the beginning of my next and last problem: I have all the variances.
I now need to compare, at every point in time, the variances of the different stocks (Exxon, Microsoft etc.), rank them with the lowest variance first, and then build 10 portfolios (there will be more than 1000 stocks): the first portfolio consists of the stocks with the 10% lowest variances, the second includes the next ten percent, etc. For those portfolios, the return has to be calculated (so the datasets have to be merged again). Then it can be tested (t-test) whether the low-variance portfolio has a statistically significantly lower return than the high-variance portfolio.

As I said, if someone (Martin?) has an idea for that, I would be more than thankful, since I could then finish my Stata work for the moment.

Thanks in advance!

clear*
set more off

*create resultsfile
cap erase resultsfile.dta
di in red _rc
clear*
gen start=.
gen end=.
gen _stat_1=.
gen str15 kindofreturn=""
save resultsfile, replace

use "\3105.dta", clear
gen int time=_n

*Return calculation
foreach price in pricemsci priceex pricemsft priceapp pricege pricepg pricejnj pricebp {
    gen double `price'ret=`price'[_n]/`price'[_n-1]-1 if _n>1
}
renpfix price

*loop to get -rolling- results for each stock
foreach ret in exret msciret msftret appret geret pgret jnjret bpret{
    preserve
    tsset time
    rolling r(Var), window(60) clear: su `ret'
    gen kindofreturn="`ret'"
    append using resultsfile
    save resultsfile, replace
    restore
}

u resultsfile, clear
ren _stat_1 Variance
sort kindofreturn start
l, sepby(kindofreturn) noo

bys kindofreturn: gen int time=_n
reshape wide Variance, i(time) j(kindofreturn) string
renpfix Variance
list, noo

2010/6/5 Martin Weiss <martin.weiss1@gmx.de>:
>
> <>
>
> ***********
> clear*
>
> input Var str16 stock
> 0.00234 exxon
> .05654 exxon
> 0.13444 exxon
> 0.99388 microsoft
> .4342 microsoft
> 0.42445 microsoft
> 0.42444 intel
> 0.32443 intel
> 0.23434 intel
> end
>
> bys stock: gen int time=_n
> reshape wide Var, i(time) j(stock) string
> renpfix Var
> list, noo
> ***********
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lars Knuth
> Sent: Saturday, 5 June 2010 17:49
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: Calculate variances of subsamples
>
> Oh, my explanation was probably confusing. It was just for
> illustration. I have two columns, one with the numbers, the other
> with the strings. What I need in a new file is just the
> numbers.
>
> 0.00234(exxon) means that the value 0.00234 belongs to
> exxon, etc. It would of course be nice if the new variable holding the
> exxon numbers were named exxon, the one holding the numbers for
> microsoft were named microsoft, etc. However, the important part
> concerns just the numbers.
>
> 2010/6/5 Martin Weiss <martin.weiss1@gmx.de>:
>>
>> <>
>>
>> Your problem may turn out to be easily solved with a -reshape-. It is not a
>> good idea, though, to have "0.00234(exxon)" in a cell of your data, as this
>> would have to be stored as a string, precluding any further processing of
>> the number. Did you write it as an illustration, or do you really want the
>> cell to contain the string?
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lars Knuth
>> Sent: Saturday, 5 June 2010 17:28
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: RE: Calculate variances of subsamples
>>
>> Ok, great, it took some time, but I finally understood Martin's
>> code... this is a great way of learning more about Stata.
>> My next problem is that I have a variable with the variances for 536
>> -rolling- steps for each of the stocks.
>> It looks like this:
>> Variance      stock name
>> 0.00234       exxon
>> ...........   exxon
>> 0.13444       exxon
>> 0.99388       microsoft
>> ...........   microsoft
>> 0.42445       microsoft
>> 0.42444       intel
>> ........      intel
>> 0.23434       intel
>>
>> What I would like to have is the following:
>>
>> 0.00234(exxon)   0.99388(microsoft)   0.42444(intel)
>> ...........      ............         ............
>> 0.13444          0.42445              0.23434(intel)
>>
>> I could do gen varexxon=Variance if stockname=="exxon"
>> and repeat that for all the stocks. But even if I do so, I get variables with
>> a lot of missings, and I cannot write the variances horizontally next
>> to each other.
>>
>> But they are from the same time (because of -rolling-) and I need them
>> to be ordered horizontally, without the missings.
>>
>> I hope my problem is clear. I guess what I am missing is just a small
>> command.
>> Thank you in advance for any hint!
>>
>> 2010/6/2 Martin Weiss <martin.weiss1@gmx.de>:
>>>
>>> <>
>>>
>>> You could of course issue the -rolling- call with -clear- present, -save-
>>> the result to a new file and reload your "3105.dta" to start anew for the
>>> next stock. The datasets thus -saved- could be -append-ed to form one big
>>> dataset afterwards. -postfile- is also an option, as always.
>>>
>>> BTW, you may be better off with the lag operator "L." for your return
>>> calculations.
>>>
>>>
>>> HTH
>>> Martin
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Lars Knuth
>>> Sent: Wednesday, 2 June 2010 20:22
>>> To: statalist
>>> Subject: st: Calculate variances of subsamples
>>>
>>> Dear listers,
>>>
>>> I have to say thanks to Martin, the recommendation of rolling was
>>> great. Unfortunately, I now have a few problems with the
>>> implementation.
>>> 1. -rolling- works with the "clear" option, but not without it
>>> ("rolling r(Var), window(60) clear: summarize exret" works).
>>> 2. I need the data to calculate and store the variances for more than
>>> 1000 stock price returns in the end, so can I somehow keep all the
>>> data and then perform -rolling- in a loop?
>>> 3. Is there also a way to perform the return calculation in a loop?
>>>
>>> I am attaching parts of the code I have so far.

clear*
use "C:\...\3105.dta", clear

gen int time=_n
* Return calculation
gen double exret=ex[_n]/ex[_n-1]-1 if _n>1
gen double msciret=msci[_n]/msci[_n-1]-1 if _n>1
gen double msftret=msft[_n]/msft[_n-1]-1 if _n>1
gen double appret=app[_n]/app[_n-1]-1 if _n>1
gen double geret=ge[_n]/ge[_n-1]-1 if _n>1
gen double pgret=pg[_n]/pg[_n-1]-1 if _n>1
gen double jnjret=jnj[_n]/jnj[_n-1]-1 if _n>1
gen double bpret=bp[_n]/bp[_n-1]-1 if _n>1

tsset time

* Rolling
rolling r(Var), window(60): summarize exret
rolling r(Var), window(60): summarize msciret
rolling r(Var), window(60): summarize msftret
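
For readers following the thread, the two hints given in reply to this first post (the lag operator "L." for the returns, and keeping -clear- on -rolling- while saving each result) could be combined roughly as below. This is only a sketch under assumptions: it takes the price variables to be named ex, msci, msft, app, ge, pg, jnj and bp, as in the code above; the tempfile name "results" and the identifier variable "series" are made up for illustration; and the path is left elided exactly as posted, so it has to be filled in before anything runs.

```stata
clear*
use "C:\...\3105.dta", clear      // path elided in the original post

gen int time = _n
tsset time

* Return calculation in a loop, using the lag operator L.
* (assumes these eight price variables exist under exactly these names)
local prices ex msci msft app ge pg jnj bp
foreach p of local prices {
    gen double `p'ret = `p' / L.`p' - 1
}

* -rolling- with -clear- replaces the data in memory, so each call is
* wrapped in -preserve-/-restore- and the results are stacked in one file
tempfile results                  // hypothetical name for the stacked results
local first = 1
foreach p of local prices {
    preserve
    rolling r(Var), window(60) clear: summarize `p'ret
    gen str15 series = "`p'ret"   // hypothetical identifier variable
    if !`first' append using `results'
    save `results', replace
    restore
    local first = 0
}

* One row per window and return series, long format
use `results', clear
rename _stat_1 Variance
sort series start
```

With L. the first observation of each return is missing automatically, so the "if _n>1" qualifier is no longer needed. The stacked file stays in long format, which is the shape the later -bys start ...- ranking steps in the thread work from.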

