Stata The Stata listserver

st: RE: why the different?


From   "Wanli Zhao" <zhaowl@temple.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: why the different?
Date   Thu, 27 Oct 2005 03:08:22 -0400

I found something interesting and puzzling. Maybe I am just missing something. I
have a dataset like this:
Gvkey   age    outsider
3311     70          1
3311     69          0
3311     65          1
3311     68          1
5455     71          0
5455     60          1
5455     65          1
5455     80          0
...

Now, I want the number of people older than 68 within each gvkey. So I
ran {egen old=count(age) if age>=69,by(gvkey)}. The number is correct, but
it appears only in observations where age is 69 or greater. I expected the
same number to be filled in for every observation within a gvkey, as I have
seen many such functions do. To fill in the rest, I did the following:
gsort gvkey -old
by gvkey: replace old=old[_n-1] if old==.
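
For reference, here is a minimal sketch of the same fill on the eight sample rows shown above, assuming they are the whole dataset. It uses bysort with old in parentheses instead of gsort; that relies on missing values sorting after all nonmissing numbers, so old[1] holds the count whenever the group has at least one person aged 69 or more. A gvkey with nobody that old would stay missing rather than become 0.

```stata
clear
input gvkey age outsider
3311 70 1
3311 69 0
3311 65 1
3311 68 1
5455 71 0
5455 60 1
5455 65 1
5455 80 0
end

* The if qualifier limits which observations receive a result,
* so old is missing wherever age < 69
egen old = count(age) if age >= 69, by(gvkey)

* Within each gvkey, sort on old (missing goes last) and copy the first value down
bysort gvkey (old): replace old = old[1]

list, sepby(gvkey)
```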

That works. But for outsider, I want the number of 1's within each gvkey,
so I did {egen outside=sum(outsider), by(gvkey)}. This time, there are no
missing values. Why does "count" behave differently? Certainly, I can
generate another dummy for age greater than 68 and then sum that up, which
gives the same result. But I just wonder why "count" did not fill in all the values.
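
A sketch of the dummy-variable route mentioned above, again assuming the eight sample rows are the whole dataset. The condition goes inside the egen function instead of in an if qualifier, so every observation in the gvkey receives the result. The names old2 and outside are illustrative, total() is the newer name for the egen function sum(), and the !missing(age) guard is an added assumption to keep missing ages from counting as large values.

```stata
clear
input gvkey age outsider
3311 70 1
3311 69 0
3311 65 1
3311 68 1
5455 71 0
5455 60 1
5455 65 1
5455 80 0
end

* With an if qualifier, only observations satisfying it get a value
egen old = count(age) if age >= 69, by(gvkey)

* With the condition inside total(), the 0/1 indicator is summed over
* the whole group and stored in every observation of that group
egen old2 = total(age >= 69 & !missing(age)), by(gvkey)

* No if qualifier here either, so there are no missing values
egen outside = total(outsider), by(gvkey)

list, sepby(gvkey)
```

The difference between the two original commands then appears to come from the if qualifier rather than from count versus sum: the first command had one and the second did not.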

Cheers,
Wanli Zhao

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


