Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: RE: generating annualized standard deviation of returns from monthly data.
Date: Thu, 27 Feb 2014 17:13:21 +0000

You are using the name -year- but that name is wildly misleading here.
The values of year are, it seems, individual daily dates.

The key point of -collapse, by(firm year)- is how many observations
there are for each _distinct_ combination of -firm- and -year-. For
your sample data shown here all the groups are represented by _single_
observations, with the result explained earlier: the SD is returned as
missing (because sample size - 1 is 0).

You have to produce a true "year" variable for what you want to work,
e.g. by using -yofd()-.

Nick
njcoxstata@gmail.com

On 27 February 2014 16:56, Ikechukwu M. <bigdoctor2004@gmail.com> wrote:
> Thank you.
>
> Here is what I get when I perform either of the two commands.
>
> I agree that without the year grouping variable there should be one sd
> returned per firm. It is including the year grouping variable that
> messes things up.
>
>
>           year     tic         return   sd_return
>  78. 31jan2000   0183B   -10.71428571           .
>  79. 29feb2000   0183B             48           .
>  80. 31mar2000   0183B   -29.72972973           .
> -------------------------------------------------
>  81. 30apr2000   0183B    7.692307692           .
>  82. 31may2000   0183B   -17.85714286           .
>  83. 30jun2000   0183B    39.13043478           .
>  84. 31jul2000   0183B         -18.75           .
>  85. 31aug2000   0183B    61.53846154           .
> -------------------------------------------------
>  86. 30sep2000   0183B   -33.33333333           .
>  87. 31oct2000   0183B    14.28571429           .
>  88. 30nov2000   0183B         -18.75           .
>  89. 31dec2000   0183B   -7.692307692           .
>  90. 31jan2001   0183B           37.5           .
> -------------------------------------------------
>  91. 28feb2001   0183B   -27.27272727           .
>  92. 31mar2001   0183B             50           .
>  93. 30apr2001   0183B   -18.22222222           .
>  94. 31may2001   0183B             25           .
>  95. 30jun2001   0183B   -6.086956522           .
> -------------------------------------------------
>  96. 31jul2001   0183B   -20.83333333           .
>  97. 31aug2001   0183B    2.339181287           .
>  98. 30sep2001   0183B   -22.85714286           .
>  99. 31oct2001   0183B    39.25925926           .
> 100. 30nov2001   0183B   -20.21276596           .
> -------------------------------------------------
> 101. 31dec2001   0183B   -.6666666667           .
> 102. 31jan2002   0183B    9.395973154           .
> 103. 28feb2002   0183B              0           .
> 104. 31jan2000   0223B              0           .
> 105. 29feb2000   0223B    5.551515152           .
> -------------------------------------------------
> 106. 31mar2000   0223B    1.447178003           .
> 107. 30apr2000   0223B    .4279600571           .
> 108. 31may2000   0223B              0           .
> 109. 31jan2000   0226B              0           .
> 110. 29feb2000   0226B              0           .
> -------------------------------------------------
> 111. 31mar2000   0226B              0           .
> 112. 30apr2000   0226B              0           .
> 113. 31may2000   0226B            800           .
> 114. 30jun2000   0226B   -33.33333333           .
> 115. 31jul2000   0226B              0           .
> -------------------------------------------------
> 116. 31aug2000   0226B              0           .
>
>
> This result is obtained from bysort firm year: egen SD=sd(return)
>
> Thanks again.
>
> IK
>
> On Thu, Feb 27, 2014 at 10:47 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>> If you don't specify the year as a grouping variable, then values for
>> different years are lumped together; that is precisely as it should
>> be.
>>
>> Otherwise, I can't make sense of the claim that you get missing for SD
>> with (e.g.) 6 non-missing values. -collapse- produces a missing SD if
>> all values (or all but one) are missing in a group, but not
>> otherwise. (The "all but one" follows from the use of (n - 1) rather
>> than n in the formula for SD, n being sample size as usual.)
>>
>> If you were expecting that missing values would be omitted from the
>> -collapse- results, that expectation was incorrect.
>>
>> To make clear your perceived problem, we need to see data and output,
>> e.g. for examples like that below.
>>
>> . clear
>>
>> . input firm year return
>>
>>           firm       year     return
>>   1. 1 2000 0.875
>>   2. 1 2000 1.2
>>   3. 1 2000 0.9
>>   4. 1 2000 0.35
>>   5. 1 2000 0.98
>>   6. 1 2000 1.4
>>   7. 1 2000 .
>>   8. 1 2000 .
>>   9. 1 2000 .
>>  10. 1 2000 .
>>  11. 1 2000 .
>>  12. 1 2000 .
>>  13. 1 2001 .
>>  14. 1 2001 .
>>  15. end
>>
>> . collapse (sd) return, by(firm year)
>>
>> . list
>>
>>      +------------------------+
>>      | firm   year     return |
>>      |------------------------|
>>   1. |    1   2000   .3560957 |
>>   2. |    1   2001          . |
>>      +------------------------+
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 27 February 2014 15:28, Ikechukwu M. <bigdoctor2004@gmail.com> wrote:
>>> Thanks. Apologies for incorrect attribution to Nick Cox. What I meant
>>> to say is that occurrence of missing values collapses to a missing,
>>> even though I expected the missings to be ignored.
>>> Thanks for the input - I have implemented what you both suggest and
>>> the good news is that it resolves to the same thing so it is working
>>> but not producing the desired output. I am ending up with missing
>>> values even for firms that have 6 monthly observations for the year.
>>>
>>> The collapse code I used is this:
>>> collapse (sd) sd_return=return, by(firm year)
>>>
>>> using bysort firm year: egen SD=sd(return)
>>>
>>> but when I omit the year, sd is appropriately computed but for all 10
>>> years of the data, not partitioned into years.
>>>
>>> When I include the year, I end up with lots of missing observations.
>>>
>>> Thanks
>>>
>>> On Thu, Feb 27, 2014 at 4:21 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>> There are various "Nick"s around here. In my case, I wouldn't offer
>>>> the explanation that the occurrence of missings will imply zero
>>>> standard deviations with -collapse-, because it isn't true. More
>>>> importantly, as you don't give the -collapse- code you used, we are
>>>> reduced to speculation that somehow your -collapse- produced a
>>>> collapse to constants, which have 0 SD.
>>>> Nick
>>>> njcoxstata@gmail.com
>>>>
>>>>
>>>> On 27 February 2014 05:53, Ikechukwu M. <bigdoctor2004@gmail.com> wrote:
>>>>> Thanks Kieran for your response. I tried that and it gives me all
>>>>> zeros. I think it has to do with how Stata treats missing values in
>>>>> the collapse command. I had seen an earlier post by Nick regarding
>>>>> this.
>>>>>
>>>>> I used bys firm : egen sd=sd(return) and I get values but they are not
>>>>> partitioned by year. It gives me one SD for all the datapoints for the
>>>>> firm.
>>>>>
>>>>> thanks
>>>>>
>>>>> On Wed, Feb 26, 2014 at 11:23 PM, Kieran McCaul
>>>>> <kieran.mccaul@uwa.edu.au> wrote:
>>>>>> ...
>>>>>>
>>>>>> Like this?
>>>>>>
>>>>>> clear *
>>>>>>
>>>>>> input firm str7 date return
>>>>>> 1 "Jan2000" 0.875
>>>>>> 1 "Feb2000" 1.2
>>>>>> 1 "Mar2000" 0.9
>>>>>> 1 "Jan2001" 0.35
>>>>>> 1 "Feb2001" 0.98
>>>>>> 2 "Jan2000" 1.4
>>>>>> 2 "Feb2000" .76
>>>>>> 2 "Mar2000" 1.34
>>>>>> end
>>>>>>
>>>>>> gen year = substr(date, 4,.)
>>>>>>
>>>>>> preserve
>>>>>>
>>>>>> collapse (sd) sd_return=return, by(firm year)
>>>>>> tempfile ttt
>>>>>> save `ttt', replace
>>>>>>
>>>>>> restore
>>>>>>
>>>>>> merge m:1 firm year using `ttt'
>>>>>> list
>>>>>> bysort firm year: summ return
>>>>>>
>>>>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ikechukwu M.
>>>>>> Sent: Thursday, 27 February 2014 9:33 AM
>>>>>> To: statalist@hsphsun2.harvard.edu
>>>>>> Subject: st: generating annualized standard deviation of returns from monthly data.
>>>>>>
>>>>>> I am trying to compute standard deviation of returns for a panel data set and I am having a little difficulty.
>>>>>>
>>>>>> My data looks like this
>>>>>>
>>>>>> Firm date return
>>>>>> 1 Jan2000 0.875
>>>>>> 1 Feb2000 1.2
>>>>>> 1 Mar2000 0.9
>>>>>> 1 Jan2001 0.35
>>>>>> 1 Feb2001 0.98
>>>>>> 2 Jan2000 1.4
>>>>>> 2 Feb2000 .76
>>>>>> 2 Mar2000 1.34
>>>>>>
>>>>>>
>>>>>> I would like to compute the annualized standard deviation of returns for each firm and return one number for each firm in each year.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
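
A minimal sketch of the -yofd()- suggestion at the top of this message, using a few rows copied from the listing in the thread. It assumes that the poster's -year- variable holds Stata daily dates and that -tic- is the firm identifier; the names -sdate-, -yr-, -sd_egen- and -n- are made up here for illustration and do not come from the poster's dataset.

```stata
clear
input str9 sdate str5 tic return
"31jan2000" "0183B" -10.71428571
"29feb2000" "0183B" 48
"31mar2000" "0183B" -29.72972973
"31jan2001" "0183B" 37.5
"28feb2001" "0183B" -27.27272727
"31jan2000" "0223B" 0
"29feb2000" "0223B" 5.551515152
end

* Assumption: rebuild a daily date variable like the poster's misnamed -year-
gen year = date(sdate, "DMY")
format year %td

* A true calendar-year variable extracted from the daily date
gen yr = yofd(year)

* One SD per firm-year, repeated on every observation of that group
bysort tic yr: egen sd_egen = sd(return)

* Or reduce to one observation per firm-year, keeping the number of
* non-missing returns behind each SD
collapse (sd) sd_return=return (count) n=return, by(tic yr)
list, sepby(tic)
```

Keeping the count next to the SD makes it easy to spot any firm-year with fewer than two non-missing returns, which is exactly the case in which the SD comes back missing.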
