
From: "Wanli Zhao" <zhaowl@temple.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Re: RE: why the different?
Date: Thu, 27 Oct 2005 11:55:52 -0400

Thanks a lot. BTW, I think you mean "egen sumgt68=sum(age>68), by(gvkey)". Much quicker than my way. Learned something.

Cheers,
Wanli Zhao

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Michael Blasnik
Sent: Thursday, October 27, 2005 8:13 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: RE: why the different?

"Wanli Zhao" <zhaowl@temple.edu> wrote:

> I found something interesting & puzzling. Maybe I just miss something.
> I have a dataset like this:

<snip>

> Now, I want to have the number of people older than 68 by each gvkey.
> So I do {egen old=count(age) if age>=69,by(gvkey)}. Then I found that
> the number is correct but it only shows when the age variable is 69 or
> bigger. I thought it would put the same number within gvkey for each
> age, just as I experienced a lot of such functions do. Certainly, I
> did the following:
> gsort gvkey -old
> by gvkey: replace old=old[_n-1] if old==.
>
> That's OK. But for the outsider, I want the number of 1's within each
> gvkey so I did {egen outside=sum(outsider), by(gvkey)}. This time,
> there is no missing value. Why the "count" behaves differently?
> Certainly, I can generate another dummy for age bigger than 68 and
> then sum that up. Same result. But I just wonder why "count" did not
> fill in all the values?
>
> Cheers,
> Wanli Zhao

I think this behavior can be frustrating at times, but it certainly isn't puzzling, and I'd like to know what your examples are of other Stata commands that don't follow this convention. Commands that use -if- clauses usually operate only on observations meeting the qualifier:

gen x2=x^2 if x>5

will create missing values in x2 for any cases where x is not greater than 5. -egen- follows this same behavior, and your example with the egen sum doesn't have an -if- clause.

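A minimal sketch of the difference, assuming a toy dataset (the gvkey and age values below are made up for illustration and are not from the original data). It contrasts the -if- qualifier with a logical condition placed inside the -egen- function. Note that sum() was renamed total() in Stata 9; the sketch uses total(), and the extra !missing(age) guard is an addition here, since Stata treats missing values as larger than any number.

```stata
* Hypothetical toy data: two firms (gvkey) with a few people each
clear
input gvkey age
1001 70
1001 65
1001 72
1002 60
1002 69
end

* -if- qualifier: the count is stored only where age >= 69;
* the remaining observations in each gvkey are left missing
egen old = count(age) if age >= 69, by(gvkey)

* Condition inside the function: every observation in the gvkey
* receives the group total (2 for gvkey 1001, 1 for gvkey 1002)
egen sumgt68 = total(age > 68 & !missing(age)), by(gvkey)

list, sepby(gvkey)
```

In the listing, old should be missing for the observations aged 65 and 60, while sumgt68 is filled in on every row, so no gsort/replace step is needed afterwards.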
I have long thought that there ought to be an -egen- option for filling in these missing values when a function yields a constant for each by-group. Sometimes you can use logical conditions within the function to accomplish this, as in

egen sumgt68=sum(x*(age>68)), by(gvkey)

Michael Blasnik
michael.blasnik@verizon.net

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References: st: Re: RE: why the different? (From: "Michael Blasnik" <michael.blasnik@verizon.net>)
