
Re: st: speed question: -collapse- vs -egen-

From   "Stas Kolenikov" <>
Subject   Re: st: speed question: -collapse- vs -egen-
Date   Fri, 25 Apr 2008 16:46:11 -0500

NJC can offer a precise answer, but my take would be

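* constant used to count observations within each group
gen byte one = 1
* running sum over running count; the last obs in each group holds the mean
* (sum() treats missing as 0: divide by sum(myvar < .) if myvar has missings)
bys group: gen varmean = sum(myvar)/sum(one)
by group: keep if _n==_N
keep group varmean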

Topics like these should have been covered somewhere in Nick's column in
Stata Journal, or in Stata tips. -egen- is slow because it does a lot of
checking and parsing -- for big processing jobs, short by-group code
like the above is usually notably faster. -collapse- should be at least a
tad faster than -egen-, but again I would expect it to lose to the
above code.
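
To check the expected ordering on your own machine, something like the
sketch below should do. It times the three approaches with -timer- on
made-up data; the dataset size, the number of groups, and the seed are
assumptions chosen only for illustration, and the variable names follow
the original question. No timings are claimed here, since they will
depend on the data and the Stata version.

```stata
* Hypothetical test data: 1,000,000 observations in about 1,000 groups
clear
set seed 12345
set obs 1000000
gen long group = ceil(1000 * runiform())
gen double myvar = rnormal()

timer clear

* (1) -egen- approach from the original question
preserve
timer on 1
bysort group: egen varmean = mean(myvar)
by group: keep if _n == 1
keep group varmean
timer off 1
restore

* (2) -collapse-
preserve
timer on 2
collapse (mean) varmean = myvar, by(group)
timer off 2
restore

* (3) running sums, keeping the last observation in each group
preserve
timer on 3
gen byte one = 1
bysort group: gen double varmean = sum(myvar) / sum(one)
by group: keep if _n == _N
keep group varmean
timer off 3
restore

* elapsed seconds for timers 1, 2 and 3
timer list
```

Each approach starts from the same unsorted data because of
-preserve-/-restore-, so each one pays for its own sort, and the cost
of -preserve- itself stays outside the timed region. For a simulation
it is probably worth repeating each block several times and comparing
the averages.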

On Fri, Apr 25, 2008 at 2:37 PM, Jeph Herrin <> wrote:
>  I'm optimizing some code that needs to run often
>  for a simulation, and am wondering if I should
>  expect any difference in processing time between
>   bys group: egen varmean=mean(myvar)
>   bys group: keep if _n==1
>   keep group varmean
>  and
>   collapse (mean) varmean=myvar, by(group)
>  and if so, which would be faster?
>  I know I could run some tests myself, but figured
>  that others had either already done so or at least
>  would have some insight.

Stas Kolenikov, also found at

Small print: Please do not reply to my Gmail address as I don't check
it regularly.