Nick Winter <nw53@cornell.edu>

Re: st: collapse if at least X or more obs. per group are non-missing

Wed, 25 Aug 2004 10:52:39 -0400

At 12:19 PM 8/24/2004 -0700, you wrote:

I'd like to convert a panel data set with annual country obs to one of decadal country averages, excluding those decadal averages for which 5 or more observations per country are missing within a decade. My first idea was to use a two step procedure along the lines of:

1. Run: collapse (mean) GDP (count) GDP , by(country decade)
   this should give me a) the decadal averages I want and b) the number of non-missing obs used to compute each of these decadal averages.

2. replace mean_GDP=. if count_GDP<5 (or whatever STATA will call these vars saving the means & counting the non-missing obs)
   this should set to missing those decadal averages for which 5 or more observations per country were missing within each decade.

It has already been pointed out that you must give different names to your new variables with this solution.

Another way to go would be:

. bysort country decade: egen N = count(GDP)

. collapse GDP if N>5, by(country decade)

This would be appropriate, obviously, if you only need the count in order to drop the ones you don't want; if you need it for other things then go ahead with the original two-step procedure.
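
For the original two-step procedure, a minimal sketch with the new variables named explicitly might look like the following. The names mean_GDP and count_GDP are only illustrative, and the cutoff of 5 is the one from the original message; adjust it to whatever minimum number of non-missing years you actually want:

. collapse (mean) mean_GDP=GDP (count) count_GDP=GDP, by(country decade)

. replace mean_GDP = . if count_GDP < 5

This keeps one row per country and decade, with count_GDP still available afterwards if you need it for other things.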

