Statalist The Stata Listserver


RE: st: RE: Understanding the difference between gen and egen

From   "Nick Cox"
Subject   RE: st: RE: Understanding the difference between gen and egen
Date   Wed, 14 Jun 2006 09:14:28 +0100

In addition, consider this: -egen- is not 
in fact pushed hard by StataCorp and in particular
not pushed hard in documentation aimed largely 
at new users. For example, it is not indexed in 
[GSW] or [U] and if mentioned in either it is
presumably only in passing. 

Conversely, if -egen- is being pushed hard at
new users, it is largely in documentation written
by users themselves. For example, the book by
Sophia Rabe-Hesketh and Brian Everitt often
mentions -egen-.  

This is important because Dick's main concern
is how confusing matters are for new users. 

There are several associated issues. One is perhaps 
fairly described as mixed feelings within StataCorp
about the importance of -egen-. Part of the 
argument, I think, echoing Uli's point and 
indeed one made by Dick, is that if a function is 
coding something fundamental (e.g. logarithms, 
calculations for Gaussian distributions) it will
be needed in a variety of contexts. Also, it
should be part of the executable and not code to 
be interpreted. 

I am not knocking -egen-. I may be its greatest
fan in the Stata community. I have found it helpful
in my own work to be able to use, and to write, -egen-
functions. But it is not, in itself, an enormous deal. It is at root
a rag-bag of convenient short-cuts. For example,
if -egen, total()- did not exist, we could typically get
there in two lines instead of one, with

bysort bar: egen foototal = total(foo)

being done by

bysort bar : gen foototal = sum(foo) 
by bar: replace foototal = foototal[_N] 
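
The equivalence can be checked directly. What follows is only a sketch, assuming the auto dataset shipped with Stata; the variable names t_egen and t_gen are made up for illustration. The two-line version works because sum() returns a running sum within each by-group, so the last observation of each group (subscript _N) holds the group total.

```stata
sysuse auto, clear

* one line with -egen-
bysort foreign: egen double t_egen = total(price)

* two lines with -generate- and -replace-
bysort foreign: generate double t_gen = sum(price)
by foreign: replace t_gen = t_gen[_N]

* the two variables should agree in every observation
assert t_egen == t_gen
```

Both versions treat missing values of the summed variable as zero, so they should also agree when some values are missing.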


> -----Original Message-----
> On Behalf Of Ulrich Kohler
> Sent: 14 June 2006 08:58
> Subject: Re: st: RE: Understanding the difference between gen and egen
> > Dick Campbell
> > > This all may be an oversight or a failure on my part to see the
> > > obvious, but in general, I find the distinction between -gen- and
> > > -egen- to be confusing. It would seem logical that all of this stuff
> > > could be handled by -gen-. But -gen- is not accessible to users,
> > > being a built in command, while -egen- is and various users have
> > > added various things to it. Thus, I guess the reason for -egen- is
> > > that it is user accessible, not that it has some special status
> > > relative to -gen-. To a new user, however, and even to more
> > > experienced ones, this is all a bit confusing.
> Besides the accessibility to end users, there is another difference
> between the two. -egen- functions are, well, egen-functions. They are
> available only for -egen- and at no other place in the Stata world.
> -gen-, on the other hand, expects an expression after the equals sign
> (gen varname = <exp>). The expression can use any Stata function, and
> expressions have their place in quite a few other parts of the Stata
> world (if <exp>, local macname = <exp>, `=<exp>', twoway function
> y = <exp>, if <exp> { ; else if <exp>).
>
> As a consequence, if you learn to generate a specific variable with a
> specific -egen- function, you learn how to generate that specific
> variable with that specific -egen- function. If you learn to generate
> the same variable with -gen- and Stata functions, you learn something
> that can also be used outside the specific problem. This clearly does
> not help new users with their confusion. However, when teaching Stata
> one should not forget it.
>
> BTW there is a similar situation with macros. The standard definition
> of macros is -local macname = <exp>-, and there is a second list of
> functions that are available only for macros, namely the extended
> macro functions. The syntax here is -local macname: <extended fcn>-.
> In direct analogy, one might tie -gen- and -egen- together by using
> -gen varname = <exp>- and -gen varname : <egen-fcn>-. I don't know
> whether this would be less confusing for new users. In addition, this
> brings the problem of continuity and consistency back in.
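
Uli's point about expressions, and the macro analogy, can be sketched as follows. This is only an illustration, assuming the auto dataset shipped with Stata and run from a do-file; the names lnprice, lnp1 and lbl are made up for the example.

```stata
sysuse auto, clear

* the same function, log(), in several places that accept an expression
generate double lnprice = log(price)      // gen varname = <exp>
count if log(price) > 9                   // if <exp> qualifier
local lnp1 = log(price[1])                // local macname = <exp>
display "log of first price: `lnp1'"
display "inline: `=log(price[1])'"        // `=<exp>'
if log(price[1]) > 8 {                    // if <exp> { programming command
    display "first car has log price above 8"
}

* extended macro functions use a colon instead of an equals sign
local lbl : variable label price          // local macname : <extended fcn>
display "`lbl'"
```

By contrast, an -egen- function such as total() can be called only through -egen-; it cannot be dropped into any of the expression contexts above.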
