# Re: st: computing means and quantiles for groups using weights

From: n j cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: computing means and quantiles for groups using weights
Date: Mon, 16 Jul 2007 19:47:16 +0100

Paolo Naticchioni asked something like this on various occasions:

-------------------------------------------------------------------
I want to create a new variable including the means for some groups
(strata) of the population.
For instance, assume my strata are year, gender, education.

Using egen I can compute the average of some outcome variable, for
instance wage, using the following:

bysort year gender education: egen average=mean(wage)

However, egen does not let me use weights of any type. I would like to
use weights (weights with average 1). Do you know how I can introduce
weights in an easy way? Probably it is very easy, but I do not know the
command.

I would also like to compute some quantiles of the distribution, like
the 10th percentile, the median, the 90th percentile, etc.
In this case too, I can use egen:
bysort year gender education: egen p90=pctile(wage), p(90)

However, in this case too it is not possible to use weights to
consider the weighted distribution.
---------------------------------------------------------------------

Paolo is correct. -egen-, as documented by StataCorp, does not support weights. I guess there are three possible reasons:

1. StataCorp started out with a relatively simple syntax for
-egen- and never saw grounds for/never got round to complicating
it.

2. -egen- is a wrapper for a call to some -egen- function. In
general, some functions could sensibly be called with weights
and some not. -egen- as such has no way of knowing which is which.
That leaves the responsibility for coping with weights to individual
functions. It's messy to have the innermost function do most of the
checking and the extent of that is best reduced.

3. Something else I haven't thought of.

No matter. There are user-written -egen- functions on SSC that do what you want. Weights are just specified in a non-standard way, via options. David Kantor's -_gwtmean- package provides a weighted mean function, -wtmean()-, for -egen-. Ulrich Kohler's function -wpctile()- is in the -egenmore- package.
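
As a rough sketch of how the two functions might be called for the strata in the question: the weight variable name wt is hypothetical, as are the names of the new variables, and the option names shown (weight() for -wtmean()-, weights() for -wpctile()-) should be checked against each package's help file after installation.

```stata
* Install the user-written packages from SSC (needed once)
ssc install _gwtmean
ssc install egenmore

* Weighted mean of wage within year-gender-education cells
* (wt is a hypothetical name for the weight variable)
egen wmean = wtmean(wage), weight(wt) by(year gender education)

* Weighted 10th percentile, median and 90th percentile within the same cells
bysort year gender education: egen wp10 = wpctile(wage), p(10) weights(wt)
bysort year gender education: egen wp50 = wpctile(wage), p(50) weights(wt)
bysort year gender education: egen wp90 = wpctile(wage), p(90) weights(wt)
```

Note that the weights go in an option rather than in the usual [weight] position, which is the non-standard syntax mentioned above.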

If they didn't exist, other solutions would be possible, but I will not
spell any out at this point.
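
For what it is worth, one such alternative for the weighted mean can be sketched with official -egen- functions alone, as a ratio of two group totals. This is only an illustration under assumptions: wt is a hypothetical weight variable, the temporary variable names are made up, and -total()- is assumed to be available (older Stata versions call it -sum()-).

```stata
* Numerator: sum of weight * wage within each cell
* (total() treats missing products as 0)
egen double wsum = total(wt * wage), by(year gender education)

* Denominator: sum of weights over observations with nonmissing wage
egen double wtot = total(wt * (wage < .)), by(year gender education)

* Weighted mean, constant within each cell
gen double wmean2 = wsum / wtot

drop wsum wtot
```

Weighted quantiles need more work than this, since they depend on cumulating the weights within each cell after sorting on wage.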

Nick
n.j.cox@durham.ac.uk
