# Re: st: median of consecutive groups - avoiding loops

From: daniel klein
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: median of consecutive groups - avoiding loops
Date: Thu, 12 May 2011 02:23:29 +0200

```This question is indeed interesting. An ad hoc simulation suggests that
the answer depends on the number of groups. The loop performs well when
the number of groups is small (10), but it slows down considerably as
the number of groups increases (100). The speed of the "egen" solution
does not seem to depend on the number of groups (all runs with
N=10,000). I guess Stata did a good job writing the -by- prefix. The
simulations use equal group sizes. Overall, the "egen" solution seems
to outperform the loop.

It would be interesting to see whether Mata could speed things up (as I
would expect). Then again, I guess in "real life" the differences
will not matter much.

Here's the simulation (syntax is -ahsim obs number_of_groups-).

cap prog drop ahsim

prog ahsim
args obs ngroups
if "`obs'" == "" loc obs 10000
if "`ngroups'" == "" loc ngroups 10
clear all
qui {
set obs `ngroups'
g group = _n
expand `obs'/`ngroups'
sort group
g value = rnormal()
}
di _n "{txt}Groups: `ngroups'"
di "{txt}Obs.: " _N

timer clear

timer on 1
su group, meanonly
local last = r(max) - 1

qui gen mymedian = .

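* Loop solution: median over groups i and i+1.
* The last group has no successor, so its mymedian stays missing.
qui forval i = 1/`last' {
local j = `i' + 1
su value if inlist(group, `i', `j') , detail
replace mymedian = r(p50) if group == `i'
}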
timer off 1

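timer on 2
* Pairing (1,2), (3,4), ...: gives odd groups the median of g and g+1
g int newgroup1 = cond(mod(group, 2), group, group-1)
* Pairing (2,3), (4,5), ...: gives even groups the median of g and g+1
g int newgroup2 = cond(mod(group, 2), group-1, group)
bys newgroup1 : egen med1 = median(value)
bys newgroup2 : egen med2 = median(value)
* Odd groups take med1, even groups take med2
g median = cond(mod(group, 2), med1, med2)
drop newgroup1 newgroup2 med1 med2
timer off 2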

timer list

di _n "{txt}1: loop"
di "{txt}2: egen"
end

Best
Daniel
```
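The simulation above only times the two approaches; it does not check that they return the same medians. A minimal sketch of such a check follows, reusing the variable names from the post. The seed, the 6 groups and the 50 observations per group are arbitrary choices made for illustration, not values from the original post, and the tolerance in `reldif()` is an assumption to allow for float rounding.

```stata
* Small check: do the loop and the egen approach agree?
clear
set seed 12345            // arbitrary seed, only for reproducibility
set obs 6                 // 6 groups (illustrative choice)
gen group = _n
expand 50                 // 50 observations per group (illustrative choice)
sort group
gen value = rnormal()

* Loop approach: median over group i and group i+1
summarize group, meanonly
local last = r(max) - 1
gen mymedian = .
forvalues i = 1/`last' {
    local j = `i' + 1
    quietly summarize value if inlist(group, `i', `j'), detail
    quietly replace mymedian = r(p50) if group == `i'
}

* egen approach: two overlapping pairings of consecutive groups
gen int newgroup1 = cond(mod(group, 2), group, group - 1)
gen int newgroup2 = cond(mod(group, 2), group - 1, group)
bysort newgroup1: egen med1 = median(value)
bysort newgroup2: egen med2 = median(value)
gen median = cond(mod(group, 2), med1, med2)

* Both approaches should agree for every group that has a successor
assert reldif(mymedian, median) < 1e-6 if group <= `last'

* The last group differs: the loop leaves it missing, while egen
* fills in the median of that group alone
assert missing(mymedian) & !missing(median) if group > `last'
```

If the asserts pass silently, the two approaches match everywhere except in the last group, where the loop leaves a missing value and the egen version reports that group's own median. Whether that difference matters depends on how the last group should be treated in the original problem.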