Re: st: question about egen
Shige Song wrote:
> I am trying to use "egen newvar = count()" to generate a set of variables
> indicating frequency of old variables. The syntax is (as stated in the
> Reference manual):
> egen nwear = count(exp)
> I was wondering what this "(exp)" means (there is no example for this
> particular type of egen).
> For example, I have variable GENDER (1: men, 2: women), CITY(a, b,
> c,d,e,f). I want to generate variables that show 1) number of men in each
> city, 2) number of women in each city, and 3) total number of people in
> each city. So I type:
> sort CITY
> by CITY: egen nm=count(GENDER==1)
> by CITY: egen nw=count(GENDER==2)
> by CITY: egen np=count(GENDER)
> Stata generates all three variables with complains, but surprisingly, all
> three new generated variables are exactly identical (all equal the total
> number of people)! Can anyone please give me a hand? Thank you very much!
The "(exp)" just indicates that Stata is looking for a valid Stata
expression here. The logical expressions you have used are valid.
The problem is that the egen count() function does not do what you
might logically expect: it does not count the number of observations
for which the expression is "true". Rather, it counts the number of
observations for which the expression evaluates to a non-missing result
(look closely at either the help or the manual entry for the egen
count() function). When the logical expression evaluates to "false"
(i.e. zero), the result is nevertheless non-missing, and is thus "counted".
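To see the difference, here is a minimal sketch on a toy dataset. The observations and the new variable names (n_expr, n_gender) are made up for illustration and are not from your data; I have added one missing GENDER value so that the two counts differ.

```stata
* Toy data: hypothetical values, not from the original post
clear
input str1 CITY byte GENDER
"a" 1
"a" 2
"a" 2
"b" 1
"b" .
end

* (GENDER == 1) evaluates to 0 or 1 and is never missing,
* so count() tallies every observation in the city: 3 for a, 2 for b
egen n_expr = count(GENDER == 1), by(CITY)

* count(GENDER) skips the missing GENDER in city b: 3 for a, 1 for b
egen n_gender = count(GENDER), by(CITY)

list, sepby(CITY)
```

If GENDER has no missing values, as appears to be the case in your data, both variables equal the number of people in each city, which matches what you observed.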
However, you should be able to achieve the result you want with the
egen sum() function, passing it a logical expression as its argument.
The expression evaluates to 1 (true) or 0 (false), so summing it within
each city counts the observations for which it is true:
egen nm = sum(GENDER == 1), by(CITY)
egen nw = sum(GENDER == 2), by(CITY)
egen np = sum(GENDER ~= .), by(CITY)
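A self-contained sketch of this approach on hypothetical toy data follows, in case you want to check it before running it on your own file. Two assumptions on my part: in more recent versions of Stata the egen sum() function is, as far as I know, documented as total(), so I use that name here (substitute sum() on older versions); and I use !missing(GENDER) in place of GENDER ~= ., which should behave the same for ordinary missing values. The variable nm_alt is only an illustration of how count() can be made to work by turning "false" into missing with cond().

```stata
* Toy data: hypothetical values, not from the original post
clear
input str1 CITY byte GENDER
"a" 1
"a" 2
"a" 2
"b" 1
"b" .
end

* Sum of a 0/1 indicator within each city = number of "true" observations
egen nm = total(GENDER == 1), by(CITY)
egen nw = total(GENDER == 2), by(CITY)
egen np = total(!missing(GENDER)), by(CITY)

* Alternative: map "false" to missing so that count() ignores it
egen nm_alt = count(cond(GENDER == 1, 1, .)), by(CITY)

* Consistency checks: both routes agree, and men + women = people
* with a non-missing GENDER (GENDER takes only the values 1 and 2 here)
assert nm_alt == nm
assert nm + nw == np

list, sepby(CITY)
```

Note that np here counts only people with a recorded GENDER; if you want everyone in the city regardless, something like count(1) or _N under by CITY: would be the thing to use instead.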