
Re: st: question about egen

From   Gary Longton
Subject   Re: st: question about egen
Date   Thu, 17 Apr 2003 23:14:47 -0700

Shige Song wrote:

> I am trying to use "egen newvar = count()" to generate a set of variables
> indicating frequency of old variables. The syntax is (as stated in the
> Reference manual):
>         egen newvar = count(exp)
> I was wondering what this "(exp)" means (there is no example for this
> particular type of egen).
> For example, I have the variables GENDER (1: men, 2: women) and CITY (a, b,
> c, d, e, f). I want to generate variables that show 1) number of men in each
> city, 2) number of women in each city, and 3) total number of people in
> each city. So I type:
>         sort CITY
>         by CITY: egen nm=count(GENDER==1)
>         by CITY: egen nw=count(GENDER==2)
>         by CITY: egen np=count(GENDER)
> Stata generates all three variables without complaint, but surprisingly, all
> three new generated variables are exactly identical (all equal the total
> number of people)! Can anyone please give me a hand? Thank you very much!

The "(exp)" just indicates that Stata is looking for a valid Stata
expression here.  The logical expressions you have used are valid.

The problem is that the egen count() function is not doing what you
might logically expect: it is not counting the number of observations
for which the expression is "true".  Rather it is counting the number of
observations for which the expression evaluates to a non-missing result
(look closely at either the help or the manual for the egen count()
function).  When the logical expression evaluates to "false" (i.e.,
zero) the result is nevertheless non-missing, and is thus "counted".
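
A small sketch of this behaviour, using a made-up six-observation dataset (the data values and the names isman, n_expr and n_var are assumptions for illustration only, not taken from the original data):

```stata
clear
input str1 CITY GENDER
"a" 1
"a" 2
"a" 2
"b" 1
"b" 1
"b" .
end

* A logical expression evaluates to 1 (true) or 0 (false), never to missing
generate byte isman = (GENDER == 1)

* count() counts non-missing results, so the 0s are counted along with the 1s
bysort CITY: egen n_expr = count(GENDER == 1)

* count(GENDER) skips only the observation where GENDER itself is missing
bysort CITY: egen n_var = count(GENDER)

list, sepby(CITY)
```

Here n_expr should be 3 in both cities, since every observation yields a 0 or a 1, while n_var should be 3 in city a and 2 in city b.  With no missing values in GENDER, the two would be identical, which matches what you saw.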

However, you should be able to get the result you want from the egen
-sum()- function, giving it an argument that is a logical expression
evaluating to either 1 (true) or 0 (false), as in your example.

	egen nm = sum(GENDER == 1), by(CITY)
	egen nw = sum(GENDER == 2), by(CITY)
	egen np = sum(GENDER ~= .), by(CITY)
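
For readers on later releases: from Stata 9 onward this egen function is documented under the name total() rather than sum().  The following is a minimal sketch of the same approach with the newer name, run on hypothetical data (the six observations are invented for illustration).  It also swaps in !missing(GENDER) for GENDER ~= ., on the assumption that extended missing values such as .a should be excluded from the total as well.

```stata
clear
input str1 CITY GENDER
"a" 1
"a" 2
"a" 2
"b" 1
"b" 1
"b" .
end

* Summing a 0/1 logical expression within each city counts the 1s
bysort CITY: egen nm = total(GENDER == 1)
bysort CITY: egen nw = total(GENDER == 2)

* !missing() is 1 for any non-missing value and 0 for ., .a, .b, ...
bysort CITY: egen np = total(!missing(GENDER))

list, sepby(CITY)
```

On these data, city a should show nm = 1, nw = 2, np = 3, and city b should show nm = 2, nw = 0, np = 2.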

- Gary