Statalist



Re: st: speed question: -collapse- vs -egen-


From   "Stas Kolenikov" <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: speed question: -collapse- vs -egen-
Date   Fri, 25 Apr 2008 16:46:11 -0500

NJC can offer a precise answer, but my take would be

gen byte one = 1
* running sum over running count: the last obs in each group holds the group mean
bys group: gen varmean = sum(myvar)/sum(one)
by group: keep if _n==_N
keep group varmean
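
One caveat on the snippet above, offered as a hedged sketch rather than a definitive fix: the running sum() treats missing values as zero, while sum(one) counts every observation, so if the variable being averaged can be missing, the result will differ from -egen, mean()- and -collapse (mean)-, both of which ignore missings. Assuming the variable names from the original question (group, myvar) and a tiny made-up dataset for illustration, a variant that counts only nonmissing values might look like this:

```stata
* toy data, made up for illustration only
clear
input group myvar
1 2
1 4
1 .
2 10
2 .
end

* running sum of myvar over running count of nonmissing myvar;
* double storage limits rounding error in the running sum
bysort group: gen double varmean = sum(myvar) / sum(!missing(myvar))
by group: keep if _n == _N
keep group varmean
list
* group 1 -> varmean 3, group 2 -> varmean 10
```

With sum(one) in the denominator the same data would give 2 and 5 instead, because the missing rows would be counted.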

Topics like this should have been covered somewhere in Nick's column in
the Stata Journal, or in the Stata tips. -egen- is slow because it does a
lot of checking and parsing; for big processing jobs, one-liners like
the above are usually noticeably faster. -collapse- should be at least a
tad faster than -egen-, but again I would expect it to lose to the
above code.
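
Since these are expectations rather than measurements, it may be worth timing the three approaches directly. The following is a minimal sketch using Stata's -timer- command on simulated data; the dataset size, the number of groups, the seed, and the tempfile name sim are all arbitrary assumptions, and actual timings will depend on the machine, the Stata version, and the shape of the data.

```stata
* simulated data: 1,000,000 observations in roughly 1,000 groups (arbitrary choices)
clear
set seed 12345
set obs 1000000
gen long group = ceil(1000 * runiform())
gen double myvar = rnormal()
tempfile sim
save `sim'

timer clear

* timer 1: -egen- followed by -keep-, as in the original question
use `sim', clear
timer on 1
bysort group: egen varmean = mean(myvar)
by group: keep if _n == 1
keep group varmean
timer off 1

* timer 2: -collapse-
use `sim', clear
timer on 2
collapse (mean) varmean = myvar, by(group)
timer off 2

* timer 3: running sums by hand (myvar has no missing values here)
use `sim', clear
timer on 3
gen byte one = 1
bysort group: gen double varmean = sum(myvar) / sum(one)
by group: keep if _n == _N
keep group varmean
timer off 3

* report elapsed seconds for timers 1, 2 and 3
timer list
```

Each approach reloads the same saved data so that none of them benefits from the data already being sorted by an earlier run. For a simulation, repeating each timed block several times and comparing totals would give a steadier picture than a single pass.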

On Fri, Apr 25, 2008 at 2:37 PM, Jeph Herrin <junk@spandrel.net> wrote:
>
>  I'm optimizing some code that needs to run often
>  for a simulation, and am wondering if I should
>  expect any difference in processing time between
>
>   bys group: egen varmean=mean(myvar)
>   bys group: keep if _n==1
>   keep group varmean
>
>  and
>
>   collapse (mean) varmean=myvar, by(group)
>
>  and if so, which would be faster?
>
>  I know I could run some tests myself, but figured
>  that others had either already done so or at least
>  would have some insight.
>




-- 
Stas Kolenikov, also found at http://stas.kolenikov.name

Small print: Please do not reply to my Gmail address as I don't check
it regularly.