
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Re: mvsumm calculation time
Date: Tue, 5 Aug 2008 17:29:48 +0100

Kit is one of the authors of -mvsumm-, which can be downloaded from SSC. I am the other.

My guess is that Austin's code will be faster than anything -mvsumm- or -rolling- can do. My guess is also that -- in this instance -- bringing in Mata would not help at all.

There's no contradiction. -mvsumm- and (even more) -rolling- are moderately general wrapper commands that set up the machinery for a variety of calculations. It so happens that the sd of windows of 5 is a simple enough problem that you can attack it from first principles.

That said, using -double-s might do no harm.

Nick
n.j.cox@durham.ac.uk

Austin Nichols

Kit and unnamed correspondent:
It will be even faster to use the -by: gen- construct, since that is written in very fast C code. If you want a SD over a five-period window within firm, just do something like:

    tsset i t
    sort i t
    by i: g m=(y+l.y+l2.y+l3.y+l4.y)/5
    by i: g v=(y-m)^2+(l.y-m)^2+(l2.y-m)^2+(l3.y-m)^2+(l4.y-m)^2
    g sd=sqrt(v/4)

for some existing variable y (the latter 3 commands can easily be condensed into one to further increase speed at some small cost in readability). Or am I misunderstanding the nature of the problem?

On 8/5/08, Kit Baum <baum@bc.edu> wrote:
> mvsumm is written in ado-file code. It probably should be rewritten to take
> advantage of Mata. Since -mvsumm- was implemented, Stata added the rolling:
> prefix. It might be faster to use -rolling- (which creates a separate
> dataset of summary statistics when combined with -summarize-) in this case.
>
> On Aug 5, 2008, at 02:33 , statalist-digest wrote:
>
> > I have calculated the standard deviation of firm-level revenue using the
> > recommended mvsumm command such as:
> >
> >     mvsumm Revenue, stat(sd) win(5) gen(rev5ysd) end
> >
> > I have the 64-bit version of Stata 10 SE (and a 64-bit computer). My
> > sample size is over 1.1 million observations covering over 200,000 firms
> > over 6 years. It took my computer about 24-hours to compute this statistic
> > (although it worked just as advertised and gave me exactly the result I
> > needed).
> >
> > Does anyone have any recommendations to speed up computing time since I
> > need to compute about 8 more similar commands and don't want to tie up my
> > Stata for over a week? Or do I just need to accept the calculation time
> > since my data is so large?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
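
For readers following the thread, Austin's first-principles calculation can be combined with Nick's remark about -double-s. What follows is only a sketch under stated assumptions, not code posted by either author: i, t and y are the same placeholders Austin uses (panel identifier, time variable, series of interest), and m5 and sd5 are hypothetical names chosen here for the new variables. After -tsset i t-, the lag operators already work within panels, so the -by i:- prefix is left out of this version.

```stata
* Sketch only: i, t, y are placeholders; m5 and sd5 are hypothetical names
tsset i t

* Mean over the current and four previous periods, held as a double
generate double m5 = (y + L.y + L2.y + L3.y + L4.y) / 5

* Sum of squared deviations from that window mean, divided by n - 1 = 4
generate double sd5 = sqrt(((y - m5)^2 + (L.y - m5)^2 + (L2.y - m5)^2 ///
    + (L3.y - m5)^2 + (L4.y - m5)^2) / 4)

* Optional check, assuming rev5ysd from the original -mvsumm- run
* is still in memory and y is Revenue
compare sd5 rev5ysd
```

Note that sd5 is missing wherever any of the five values in the window is missing, for example in the first four periods of each firm or across gaps in t. That may or may not match how -mvsumm- treats incomplete windows, so it is worth comparing the two on a subsample before relying on the faster version.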


