Stata The Stata listserver

st: RE: RE: Re: RE: why the different?

From   "Nick Cox" <>
To   <>
Subject   st: RE: RE: Re: RE: why the different?
Date   Thu, 27 Oct 2005 18:36:12 +0100

I think Michael meant what he said, but both 
devices can be useful, depending on what 
you want to calculate. 


Wanli Zhao

> Thanks a lot. BTW, I think you mean "egen 
> sumgt68=sum(age>68), by(gvkey)".
> Much quicker than my way. Learned something.

Michael Blasnik
> >I found something interesting & puzzling. Maybe I just miss 
> something. 
> >I  have a dataset like this:
> <snip>
> > Now, I want to have the number of people older than 68 by 
> each gvkey. 
> > So I do {egen old=count(age) if age>=69,by(gvkey)}. Then I 
> found that 
> > the number is correct but it only shows when the age 
> variable is 69 or 
> > bigger. I thought it would put the same number within gvkey 
> for each 
> > age, as I have seen many such functions do. Certainly, I 
> > did the following:
> > gsort gvkey -old
> > by gvkey: replace old=old[_n-1] if old==.
> >
> > That's OK. But for the outsider, I want the number of 1's 
> within each 
> > gvkey so I did {egen outside=sum(outsider), by(gvkey)}. This time, 
> > there is no missing value. Why does "count" behave differently? 
> > Certainly, I can generate another dummy for age bigger than 68 and 
> > then sum that up. Same result. But I just wonder why 
> "count" did not 
> > fill in all the values?
> I think this behavior can be frustrating at times, but it 
> certainly isn't
> puzzling and I'd like to know what your examples are of other 
> Stata commands
> that don't follow this convention.  Commands that use -if- 
> clauses usually
> only operate on observations meeting the qualifier: gen 
> x2=x^2 if x>5  will
> create missing values in x2 for any cases where x is not 
> greater than 5.
> -egen- follows this same behavior and your example with the 
> egen sum doesn't
> have an -if- clause.  I have long thought that there ought to 
> be an egen
> option for filling in these missing values when a function 
> yields a constant
> for each by group.  Sometimes you can use logical conditions 
> within the
> function to accomplish this, as in egen 
> sumgt68=sum(x*(age>68)), by(gvkey).
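
A minimal sketch of the two devices discussed above, for anyone who wants to try them. The data are made up for illustration, and x is a hypothetical numeric variable standing in for whatever is to be summed; neither comes from the original post. In Stata 9 and later the documented name of the egen function is total(), although sum() as used in this thread should continue to work.

```stata
clear
input gvkey age x
1 70 10
1 65 20
1 72 30
2 60 40
2 69 50
end

* -if- restricts which observations receive a result:
* old_if is filled in only where age > 68 and is missing elsewhere
egen old_if = count(age) if age > 68, by(gvkey)

* A logical expression inside the function is evaluated for every
* observation, so each row in the group receives the group's value.
* Number of people older than 68 in each gvkey:
egen ngt68 = sum(age > 68), by(gvkey)

* Sum of x over the people older than 68 in each gvkey:
egen sumgt68 = sum(x * (age > 68)), by(gvkey)

list, sepby(gvkey)
```

With these data, ngt68 should be 2 for gvkey 1 and 1 for gvkey 2 in every row, while old_if carries the same counts only in the rows where age exceeds 68. One caution: Stata treats missing values as larger than any number, so if age can be missing, a condition such as (age > 68 & age < .) is safer than (age > 68) alone.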
