Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Carlos Avellaneda Suárez <carlos.avellaneda8@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: RE: generating annualized standard deviation of returns from monthly data. |
Date | Thu, 27 Feb 2014 12:11:09 -0500 |
In case your "year" variable is in Stata's date format (it is not
specified by you), you should try this:

rename year date
generate year = year(date)
bysort firm year: egen SD=sd(return)

Hope this helps.

Carlos

2014-02-27 12:06 GMT-05:00 Carlos Avellaneda Suárez <carlos.avellaneda8@gmail.com>:
> The problem is that your "year" variable is not actually a year
> variable, but day variable, and each combination of firm - "year"
> always represents a unique observation in your dataset, to which it is
> impossible to calculate a standard deviation. You have to create a
> real year variable in order to obtain what you want.
>
> 2014-02-27 11:56 GMT-05:00 Ikechukwu M. <bigdoctor2004@gmail.com>:
>> Thank you.
>>
>> here is what I get when I perform either of the two commands.
>>
>> I agree that without the year grouping variable there should be one sd
>> returned per firm. It is including the year grouping variable that
>> messes things up.
>>
>>
>>            year     tic         return   sd_return
>>  78.  31jan2000   0183B   -10.71428571           .
>>  79.  29feb2000   0183B             48           .
>>  80.  31mar2000   0183B   -29.72972973           .
>>      ------------------------------------------------
>>  81.  30apr2000   0183B    7.692307692           .
>>  82.  31may2000   0183B   -17.85714286           .
>>  83.  30jun2000   0183B    39.13043478           .
>>  84.  31jul2000   0183B         -18.75           .
>>  85.  31aug2000   0183B    61.53846154           .
>>      ------------------------------------------------
>>  86.  30sep2000   0183B   -33.33333333           .
>>  87.  31oct2000   0183B    14.28571429           .
>>  88.  30nov2000   0183B         -18.75           .
>>  89.  31dec2000   0183B   -7.692307692           .
>>  90.  31jan2001   0183B           37.5           .
>>      ------------------------------------------------
>>  91.  28feb2001   0183B   -27.27272727           .
>>  92.  31mar2001   0183B             50           .
>>  93.  30apr2001   0183B   -18.22222222           .
>>  94.  31may2001   0183B             25           .
>>  95.  30jun2001   0183B   -6.086956522           .
>>      ------------------------------------------------
>>  96.  31jul2001   0183B   -20.83333333           .
>>  97.  31aug2001   0183B    2.339181287           .
>>  98.  30sep2001   0183B   -22.85714286           .
>>  99.  31oct2001   0183B    39.25925926           .
>> 100.  30nov2001   0183B   -20.21276596           .
>>      ------------------------------------------------
>> 101.  31dec2001   0183B   -.6666666667           .
>> 102.  31jan2002   0183B    9.395973154           .
>> 103.  28feb2002   0183B              0           .
>> 104.  31jan2000   0223B              0           .
>> 105.  29feb2000   0223B    5.551515152           .
>>      ------------------------------------------------
>> 106.  31mar2000   0223B    1.447178003           .
>> 107.  30apr2000   0223B    .4279600571           .
>> 108.  31may2000   0223B              0           .
>> 109.  31jan2000   0226B              0           .
>> 110.  29feb2000   0226B              0           .
>>      ------------------------------------------------
>> 111.  31mar2000   0226B              0           .
>> 112.  30apr2000   0226B              0           .
>> 113.  31may2000   0226B            800           .
>> 114.  30jun2000   0226B   -33.33333333           .
>> 115.  31jul2000   0226B              0           .
>>      ------------------------------------------------
>> 116.  31aug2000   0226B              0           .
>>
>>
>> This result is obtained from bysort firm year: egen SD=sd(return)
>>
>> Thanks again.
>>
>> IK
>>
>> On Thu, Feb 27, 2014 at 10:47 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> If you don't specify the year as a grouping variable, then values for
>>> different years are lumped together; that is precisely as it should
>>> be.
>>>
>>> Otherwise, I can't make sense of the claim that you get missing for SD
>>> with (e.g.) 6 non-missing values. -collapse- produces a missing SD if
>>> all values (or all but one) values are missing in a group, but not
>>> otherwise. (The "all but one" follows from the use of (n - 1) rather
>>> than n in the formula for SD, n being sample size as usual.)
>>>
>>> If you were expecting that missing values would be omitted from the
>>> -collapse- results, that expectation was incorrect.
>>>
>>> To make clear your perceived problem, we need to see data and output,
>>> e.g. for examples like that below.
>>>
>>> . clear
>>>
>>> . input firm year return
>>>
>>>           firm       year     return
>>>   1. 1 2000 0.875
>>>   2. 1 2000 1.2
>>>   3. 1 2000 0.9
>>>   4. 1 2000 0.35
>>>   5. 1 2000 0.98
>>>   6. 1 2000 1.4
>>>   7. 1 2000 .
>>>   8. 1 2000 .
>>>   9. 1 2000 .
>>>  10. 1 2000 .
>>>  11. 1 2000 .
>>>  12. 1 2000 .
>>>  13. 1 2001 .
>>>  14. 1 2001 .
>>>  15. end
>>>
>>> . collapse (sd) return, by(firm year)
>>>
>>> . list
>>>
>>>      +------------------------+
>>>      | firm   year     return |
>>>      |------------------------|
>>>   1. |    1   2000   .3560957 |
>>>   2. |    1   2001          . |
>>>      +------------------------+
>>>
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 27 February 2014 15:28, Ikechukwu M. <bigdoctor2004@gmail.com> wrote:
>>>> Thanks. Apologies for incorrect attribution to Nick Cox. What I meant
>>>> to say is that occurrence of missing values collapses to a missing,
>>>> even though I expected the missings to be ignored.
>>>> Thanks for the input - I have implemented what you both suggest and
>>>> the good news is that it resolves to the same thing so it is working
>>>> but not producing the desired output. I am ending up with missing
>>>> values even for firms that have 6 monthly observations for the year.
>>>>
>>>> The collapse code I used is this:
>>>> collapse (sd) sd_return=return, by(firm year)
>>>>
>>>> using bysort firm year: egen SD=sd(return)
>>>>
>>>> but when I omit the year, sd is appropriately computed but for all 10
>>>> years of the data, not partitioned into years.
>>>>
>>>> When I include the year, I end up with lots of missing observations.
>>>>
>>>> Thanks
>>>>
>>>> On Thu, Feb 27, 2014 at 4:21 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> There are various "Nick"s around here. In my case, I wouldn't offer
>>>>> the explanation that the occurrence of missings will imply zero
>>>>> standard deviations with -collapse-, because it isn't true. More
>>>>> importantly, as you don't give the -collapse- code you used, we are
>>>>> reduced to speculation that somehow your -collapse- produced a
>>>>> collapse to constants, which have 0 SD.
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>>
>>>>> On 27 February 2014 05:53, Ikechukwu M. <bigdoctor2004@gmail.com> wrote:
>>>>>> Thanks Kieran for your response. I tried that and it gives me all
>>>>>> zeros. I think it has to do with how stata treats missing values in
>>>>>> the collapse command. I had seen an earlier post by Nick regarding
>>>>>> this.
>>>>>>
>>>>>> I used bys firm : egen sd=sd(return) and I get values but they are not
>>>>>> partitioned by year. It gives me one SD for all the datapoints for the
>>>>>> firm.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> On Wed, Feb 26, 2014 at 11:23 PM, Kieran McCaul
>>>>>> <kieran.mccaul@uwa.edu.au> wrote:
>>>>>>> ...
>>>>>>>
>>>>>>> Like this?
>>>>>>>
>>>>>>> clear *
>>>>>>>
>>>>>>> input firm str7 date return
>>>>>>> 1 "Jan2000" 0.875
>>>>>>> 1 "Feb2000" 1.2
>>>>>>> 1 "Mar2000" 0.9
>>>>>>> 1 "Jan2001" 0.35
>>>>>>> 1 "Feb2001" 0.98
>>>>>>> 2 "Jan2000" 1.4
>>>>>>> 2 "Feb2000" .76
>>>>>>> 2 "Mar2000" 1.34
>>>>>>> end
>>>>>>>
>>>>>>> gen year = substr(date, 4,.)
>>>>>>>
>>>>>>> preserve
>>>>>>>
>>>>>>> collapse (sd) sd_return=return, by(firm year)
>>>>>>> tempfile ttt
>>>>>>> save `ttt', replace
>>>>>>>
>>>>>>> restore
>>>>>>>
>>>>>>> merge m:1 firm year using `ttt'
>>>>>>> list
>>>>>>> bysort firm year: summ return
>>>>>>>
>>>>>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ikechukwu M.
>>>>>>> Sent: Thursday, 27 February 2014 9:33 AM
>>>>>>> To: statalist@hsphsun2.harvard.edu
>>>>>>> Subject: st: generating annualized standard deviation of returns from monthly data.
>>>>>>>
>>>>>>> I am trying to compute standard deviation of returns for a panel data set and I am having a little difficulty.
>>>>>>>
>>>>>>> My data looks like this
>>>>>>>
>>>>>>> Firm date return
>>>>>>> 1 Jan2000 0.875
>>>>>>> 1 Feb2000 1.2
>>>>>>> 1 Mar2000 0.9
>>>>>>> 1 Jan2001 0.35
>>>>>>> 1 Feb2001 0.98
>>>>>>> 2 Jan2000 1.4
>>>>>>> 2 Feb2000 .76
>>>>>>> 2 Mar2000 1.34
>>>>>>>
>>>>>>>
>>>>>>> I would like to compute the annualized standard deviation of returns for each firm and return one number for each firm in each year.
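
For readers following the thread, the diagnosis given in the replies (a daily date used as the "year" grouping variable leaves one observation per group, so every SD is missing) can be reproduced with a small sketch. The toy data, the string variable sdate, and the names date, sd_return and first are assumptions made for illustration; only tic, return, and the year()/egen sd() approach come from the thread.

```stata
* Sketch under assumed toy data; not the poster's actual dataset
clear
input str5 tic str9 sdate return
"0183B" "31jan2000" -10.71
"0183B" "29feb2000"  48
"0183B" "31mar2000" -29.73
"0183B" "31jan2001"  37.5
"0183B" "28feb2001" -27.27
"0223B" "31jan2000"  0
"0223B" "29feb2000"  5.55
end

* Turn the string into a Stata daily date, then extract the calendar year
generate date = date(sdate, "DMY")
format date %td
generate year = year(date)

* Grouping on the daily date gives one observation per group, hence missing SDs;
* grouping on the extracted year gives one SD per firm-year
bysort tic year: egen sd_return = sd(return)

* Show a single row per firm-year
egen first = tag(tic year)
list tic year sd_return if first, noobs
```

The same grouping by tic and year should work with -collapse (sd)- if a reduced dataset with one row per firm-year is preferred over a new variable on every monthly row.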
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/