Re: st: collapse if at least X or more obs. per group are non-missing
The Stata help describes the collapse syntax as follows:
collapse clist [...]
where clist is either
[(stat)] varlist [ [(stat)] ... ]
[(stat)] target_var=varname [target_var=varname ...] [ [(stat)] ...]
If you choose the first option (as your command does), the data will
be collapsed using the names in <varlist> for both the old and the new variables.
Thus, collapse (count) GDP (mean) GDP, by(sftgcode decade) tries to write
both the number of nonmissing observations of GDP and their mean into the
same variable, GDP. This cannot work, hence the "name conflict" error.
So you have to choose the second <clist> syntax; in your example you should type:
collapse (count) GDP_N=GDP (mean) GDP_mea=GDP, by(sftgcode decade)
Keep in mind that older versions of Stata (before Stata 7) allow variable names
of at most 8 characters (that's why GDP_mea and not GDP_mean); later versions
allow up to 32 characters.
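
As for your second question, one possible approach (a sketch, not the only way) is to collapse first and then set the mean to missing wherever the count falls below a cutoff. The toy data and the cutoff of 2 nonmissing observations below are made up for illustration; only the variable names come from your example.

```stata
* Toy data (hypothetical values): two countries, one decade each
clear
input sftgcode decade GDP
1 1960 10
1 1960 20
1 1960 .
2 1960 30
2 1960 .
2 1960 .
end

* Count and mean of the same variable, stored under distinct names
collapse (count) GDP_N=GDP (mean) GDP_mea=GDP, by(sftgcode decade)

* Assumed cutoff: keep a mean only if at least 2 observations are nonmissing
local minobs 2
replace GDP_mea = . if GDP_N < `minobs'

list
```

Because (count) counts only nonmissing observations, GDP_N can be compared directly with the cutoff. If you would rather drop those country-decades entirely, drop if GDP_N < `minobs' in place of the replace line does that.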
> . collapse (count) GDP (mean) GDP, by(sftgcode decade)
> GDP = (count) GDP
> GDP = (mean) GDP
> name conflict
> Does this mean that I cannot get means and counts for the same variable at
> the same time when using collapse? Moreover, is there any way to directly
> aggregate annual obs to decadal country averages while omitting those
> averages for which a pre-specified number of obs. is missing per country?
> I wasn't able to find any solution to this on the archives, although I
> assume it's a rather common problem. Thank you very much for your help.
> Jens Hainmueller