The Stata listserver

Re: st: collapse (sum) versus egen (sum)

From   Dan Blanchette <[email protected]>
To   [email protected]
Subject   Re: st: collapse (sum) versus egen (sum)
Date   Thu, 19 Aug 2004 10:12:30 -0400 (EDT)

Perhaps more information about the discrepancy you noticed would be helpful?

-egen- and -collapse- both use -generate-'s sum function.

The only difference I see is that -collapse- generates the new variable as
datatype double, whereas -egen- lets you choose, though the default is float.
If you choose a datatype that cannot hold the resulting sum, you would end
up with missing values and thus a different result than you would get with
-collapse-.  If your sums are larger than float's maximum, 1.70141173319*10^38,
you would get differing results.  Try your program again choosing datatype
double with -egen- to see if this fixes the problem.
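
A minimal sketch of that check, using made-up data (the three observations
and the value 16777217 are assumptions for illustration only; the variable
names come from the original question).  Besides the maximum, a float holds
only about 7 digits of accuracy, so a float total can differ from a double
total well below the overflow limit.  Note that in newer Stata releases the
-egen- function is called total(); sum() is what the thread uses.

```stata
* Hypothetical data, not the poster's dataset
clear
set obs 3
generate opl114 = 1
generate long aantal = 16777217

* -egen- with the default storage type (float unless -set type- was changed)
egen oplantal_f = sum(aantal), by(opl114)

* -egen- with the storage type requested explicitly as double
egen double oplantal_d = sum(aantal), by(opl114)

* Show all digits so any rounding in the float version is visible
format oplantal_f oplantal_d %15.0f
list opl114 oplantal_f oplantal_d in 1

* Compare with -collapse-, which stores the sum as double
preserve
collapse (sum) aantal, by(opl114)
format aantal %15.0f
list
restore
```

If the two -egen- columns disagree, the storage type is the likely cause of
the difference.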

To view the ado files for these commands you can download -adoedit- from
the SSC or:

. findfile collapse.ado

. view "`r(fn)'"

. findfile egen.ado

. view "`r(fn)'"

or more specifically:

. findfile _gsum.ado

. view "`r(fn)'"

Dan Blanchette
Applications Analyst Programmer
Carolina Population Center UNC-CH

> While aggregating a dataset using collapse some strange results were
> obtained:
> collapse (sum) aantal, by(opl114)
> Did not give the same results as the same dataset gave in other programs.
> Checking with
> egen oplantal=sum(aantal),by(opl114)
> though gave exactly the same (correct) number that other programs gave me.
> Can somebody explain to me how the summation (could) differ between collapse
> and egen?
> Thanks.
> --------------------------------------------
> Ben Kriechel
> Research Centre for Education
> and the Labour Market
> <[email protected]>
