From: Daniel Egan <dp.egan@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: egen versus gen / generate with by
Date: Wed, 6 Oct 2004 17:07:03 -0400

This is a summary of a thread initiated under the heading "cumulative average moving through time". I have changed the heading in the hope that future generations may learn faster than I did.

The discussion moved to my ignorance of how -egen-, -generate- and -by- work, with multiple voices explaining exactly why

    bysort pid (ob): gen cave = sum(calc)/ob

is not the same as

    egen cave = sum(calc)/ob, by(pid ob)

Thanks to Nick Cox, Michael Blasnick, Scott Merryman, and David Kantor for their explanations.

***************************************************************

Nick Cox (as usual) wrote the bible on it:

<quote>
Be very careful here. You're confusing some quite different beasts.

-egen-
======

-egen, sum()- fires up an -egen- function which produces totals. Under -by:- or with a -by()- option it produces group totals. You can find the code in -_gsum.ado- (-which _gsum- will find where on your machine).

In essence, -egen- only takes -egen- functions, either as documented under -[R] egen-, or as user-defined -egen- functions _always_ flagged as such. Also, -egen- functions are _never, ever_ allowed anywhere else. They require -egen- absolutely.

-egen- is really rather limited. There are perhaps of the order of 100 -egen- functions written, and that's a fixed menu, except insofar as if you don't like them, you can indeed write your own.

-sum()- and other functions
===========================

-sum()- anywhere else it is legal fires up the -sum()- function which produces cumulative sums. This is part of the executable and has been so for a very long time, perhaps even since Stata 1.0.

-generate- (and -replace-) can in effect take very complicated expressions as arguments, making use of constants, variables, operators and functions such as -sum()-. The scope of -generate- is in no way indicated by the few token examples in the help. By combining constants, variables, operators and functions, you have _much_ more flexibility than with -egen-.

Why then bother with -egen-? Just for convenience, that some often repeated sets of operations have been rolled into -egen- functions.

by:
===

See "Speaking Stata: How to move step by: step", Stata Journal 2(1): 86-102 (2002), which gathers the main ideas in one place. The obvious alternative is to look up -by- in the Manual index and read the several sections thus indicated. The article just mentioned was written because the coverage of -by:- in the manuals is a bit fragmented.
<end quote>

***************************************************************

On this note, Scott Merryman said:

<quote>
-bysort pid (ob)- sorts pid and then ob within pid, but it performs the -gen cave = sum(calc)/ob- only on pid. -bysort pid ob- would not work because it would perform the calculation on each pid and ob pair.

I don't believe the -by- option in -egen- is flexible enough to interpret -egen cave=sum(calc)/ob, by(pid ob)- correctly. Also, -egen, sum()- does not allow expressions such as sum(calc)/ob.

You might find Nick Cox's article "Speaking Stata: How to move step by: step" SJ 2(1) helpful.
<end quote>

***************************************************************

Dave Kantor noted:

<quote>
See -help mathfun- for details.
<end quote>

Dan
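As a minimal sketch of the distinction summarized in this thread, the lines below build a small made-up dataset and apply both approaches. The variable names pid, ob and calc come from the thread; the data values and the variable name tot are hypothetical, chosen only for illustration. Note also an assumption about versions: in Stata 9 and later the -egen- function that produces totals is named -total()-, whereas at the time of this thread it was called -sum()-.

```stata
* Hypothetical toy data: 2 persons (pid), 3 observations each (ob = 1, 2, 3)
clear
input pid ob calc
1 1 10
1 2 20
1 3 30
2 1  5
2 2  7
2 3  9
end

* -generate- under -by:- uses the built-in sum() function, which is a
* running (cumulative) sum. Within each pid, in ob order, this gives the
* cumulative average: 10, 15, 20 for pid 1 and 5, 6, 7 for pid 2.
bysort pid (ob): generate cave = sum(calc)/ob

* -egen- uses its own total() function (sum() in older releases), which
* returns one group total per pid, repeated on every row: 60 and 21.
bysort pid: egen tot = total(calc)

list pid ob calc cave tot, sepby(pid)
```

The running sum is what the cumulative average needs, which is why the -generate- form gives the intended result, while the -egen- total is constant within each pid. Any further arithmetic on an -egen- result has to be done in a separate -generate- step.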

References:

- Re: st: cumulative average moving through time (From: smerryman@kc.rr.com)
- Re: st: cumulative average moving through time (From: Daniel Egan <dp.egan@gmail.com>)
- Re: st: cumulative average moving through time (From: Daniel Lawson <dlawson1@nd.edu>)


© Copyright 1996–2014 StataCorp LP