RE: st: collapse (sum) versus egen (sum)


From   Ben.Kriechel@roa.unimaas.nl
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: collapse (sum) versus egen (sum)
Date   Thu, 19 Aug 2004 16:22:08 +0200

Thanks Dan, 

I will look into it. I was also VERY surprised to see that there is a
difference between the two commands. I considered them -- and I still
believe they are intended -- to be equivalent. 

--- Ben

--------------------------------------------

Ben Kriechel

Research Centre for Education
and the Labour Market
<Ben.Kriechel@roa.unimaas.nl> 

-----Original Message-----
From: Dan Blanchette [mailto:dan_blanchette@unc.edu] 
Sent: Thursday, 19 August 2004 16:13
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: collapse (sum) versus egen (sum)

Perhaps more information about the discrepancy you noticed would be helpful?

-egen- and -collapse- both use -generate-'s sum function.

The only difference I see is that -collapse- generates the new variable as
datatype double, whereas -egen- lets you choose, though the default is float.
If you choose a datatype that cannot handle the resulting sum, then you
would end up with missing values and thus a different result than you would
with -collapse-.  If your sums are larger than float's maximum,
1.70141173319*10^36, then you would get differing results.  Try your program
again choosing datatype double with -egen- to see if this fixes the problem.
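
As a minimal sketch of that suggestion, reusing the variable names from the
original post (oplantal_d is a hypothetical name chosen here for
illustration), the storage type goes directly after -egen-.  In Stata 9 and
later the same -egen- function is documented as total() rather than sum().

```stata
* Sketch only: oplantal_d is a hypothetical variable name.
* Store the group sums as double instead of the default float.
egen double oplantal_d = sum(aantal), by(opl114)

* For comparison, list the sums that -collapse- produces, then
* restore the original data so oplantal_d can be inspected too.
preserve
collapse (sum) aantal, by(opl114)
list opl114 aantal
restore
```

If the double version of the -egen- result matches the -collapse- result,
the storage type is the likely source of the difference.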

To view the ado files for these commands you can download -adoedit- from
the SSC or:

. findfile collapse.ado

. view "`r(fn)'"

. findfile egen.ado

. view "`r(fn)'"

or more specifically:

. findfile _gsum.ado

. view "`r(fn)'"


Dan Blanchette
Applications Analyst Programmer
Carolina Population Center UNC-CH




> While aggregating a dataset using collapse some strange results were
> obtained:
>
> collapse (sum) aantal, by(opl114)
>
> Did not give the same results as the same dataset gave in other programs.
>
> Checking with
>
> egen oplantal=sum(aantal),by(opl114)
>
> though gave exactly the same (correct) number that other programs gave me.
> Can somebody explain to me how the summation (could) differ between
> collapse and egen?
>
> Thanks.
> --------------------------------------------
>
> Ben Kriechel
>
> Research Centre for Education
> and the Labour Market
> <Ben.Kriechel@roa.unimaas.nl>
>
>
>


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*


