
From |
"Nick Cox" <n.j.cox@durham.ac.uk> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
st: RE: RE: Re: RE: why the different? |

Date |
Thu, 27 Oct 2005 18:36:12 +0100 |

I think Michael meant what he said, but both devices can be useful,
depending on what you want to calculate.

Nick
n.j.cox@durham.ac.uk

Wanli Zhao

> Thanks a lot. BTW, I think you mean "egen
> sumgt68=sum(age>68), by(gvkey)".
> Much quicker than my way. Learn something.

Michael Blasnik

> > I found something interesting & puzzling. Maybe I just miss
> > something.
> > I have a dataset like this:
> > <snip>
> > Now, I want to have the number of people older than 68 by
> > each gvkey.
> > So I do {egen old=count(age) if age>=69,by(gvkey)}. Then I
> > found that the number is correct but it only shows when the age
> > variable is 69 or bigger. I thought it would put the same number
> > within gvkey for each age, just as I experienced a lot of such
> > functions do. Certainly, I did the following:
> > gsort gvkey -old
> > by gvkey: replace old=old[_n-1] if old==.
> >
> > That's OK. But for the outsider, I want the number of 1's
> > within each gvkey so I did {egen outside=sum(outsider), by(gvkey)}.
> > This time, there is no missing value. Why does "count" behave
> > differently? Certainly, I can generate another dummy for age bigger
> > than 68 and then sum that up. Same result. But I just wonder why
> > "count" did not fill in all the values?
>
> I think this behavior can be frustrating at times, but it
> certainly isn't puzzling and I'd like to know what your examples
> are of other Stata commands that don't follow this convention.
> Commands that use -if- clauses usually only operate on observations
> meeting the qualifier: gen x2=x^2 if x>5 will create missing values
> in x2 for any cases where x is not greater than 5.
> -egen- follows this same behavior and your example with the
> egen sum doesn't have an -if- clause. I have long thought that
> there ought to be an egen option for filling in these missing
> values when a function yields a constant for each by group.
> Sometimes you can use logical conditions within the function to
> accomplish this, as in egen sumgt68=sum(x*(age>68)), by(gvkey).
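
A minimal sketch of the difference discussed in this thread, on made-up data: the values below are hypothetical (the original dataset was snipped), and the variable names n_old and old_filled are illustrative only. It uses -egen, total()-, the name under which -egen, sum()- is documented in later Stata releases.

```stata
* Toy data: hypothetical values, not taken from the original post
clear
input gvkey age outsider
1001 70 1
1001 65 0
1001 72 1
1002 60 0
1002 69 1
1002 50 0
end

* -if- limits which observations receive a result:
* old is missing wherever age < 69
egen old = count(age) if age >= 69, by(gvkey)

* A logical expression inside the function is evaluated for every
* observation, so the group count appears on every row.
* Missing age counts as larger than any number, hence the guard.
egen n_old = total(age >= 69 & !missing(age)), by(gvkey)

* Alternative: spread the nonmissing value of old across its group
egen old_filled = max(old), by(gvkey)

list, sepby(gvkey)
```

The two fixes differ for a group in which nobody is 69 or older: total() would give 0 there, while max(old) would stay missing because old has no nonmissing value in that group.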
