
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


[no subject]

1. In a well-designed probability sample, the sum of non-normalized
probability weights is an estimate of population
size. In the sampling literature, the standard symbol for
population size is "N", and the symbol for sample size is "n". So, Rita's
usage is correct.

2. -aweights- are _not_ restricted to integers.

See for an
interesting thread about the relationship of analytic weights and
probability weights.


On Oct 29, 2012, at 6:15 PM, JVerkuilen (Gmail) wrote:

On Mon, Oct 29, 2012 at 5:44 PM, Rita Luk <> wrote:
> Hi JVerkuilen,
> I am slow to pick up what you say. For example, my data has a sample size of only 5 obs, with the sampling weight variable wtpp:
> caseid       wtpp
> 1.          60.74
> 2.         700.38
> 3.         139.64
> 4.        9671.57
> 5.        1545.32
> Sum of wtpp = N = 12117.65
> According to what you said, what does the analytical weight look like? In addition to being normalized to sum to N, does the aweight need to be an integer?

In the case you discuss I think what would happen is that each of the
wtpp numbers would be divided by 12117.65 and then multiplied by 5.
The number you list as N is not N. N is 5.

Usually aweights are used when you have several means and their
sampling variances and want to generate an average mean in which each
mean is weighted by the inverse of its sampling variance. The sampling
variances have each mean's N built in.
Hopefully if I'm wrong someone will chime in.
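
As an illustration of combining several means using their sampling variances, here is a minimal Python sketch of inverse-variance weighting, which is one common reading of that description. The three means and variances are made-up numbers, not data from this thread, and the variable names are hypothetical.

```python
# Hypothetical group means and their sampling variances (illustrative only).
means = [10.0, 12.0, 11.0]
variances = [4.0, 1.0, 2.0]

# Assumption: each mean is weighted by the inverse of its sampling variance,
# so more precisely estimated means count for more.
weights = [1.0 / v for v in variances]

# Weighted average of the means.
pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)

# Multiplying every weight by the same constant leaves the average unchanged,
# which is why only the relative sizes of analytic weights matter here.
scaled = [1000.0 * w for w in weights]
pooled_scaled = sum(w * m for w, m in zip(scaled, means)) / sum(scaled)

print(pooled, pooled_scaled)
```

The second calculation shows that only the ratios between the weights affect the point estimate.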

Date: Mon, 29 Oct 2012 17:11:59 -0400
From: "JVerkuilen (Gmail)" <>
Subject: Re: st: aweight

I have only what you saw but my guess is as follows:

(1) Start with a vector a which gives the analytic weights, and n from
the sample.
(2) Generate a vector w = a/sum(a) , which is normalized to sum to 1.
(3) Generate a vector f = n*w, which has the weighting structure of
vector a, but is rescaled to sum to n and thus can be used as if they
were frequencies.
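
The three steps above can be sketched in Python using the five wtpp values quoted elsewhere in this thread. This follows the guess in steps (1) to (3), not StataCorp's documented internals, and the function name rescale_to_n is hypothetical.

```python
def rescale_to_n(a):
    """Rescale weights a so they sum to the number of observations n."""
    n = len(a)                    # step (1): sample size n
    total = sum(a)
    w = [x / total for x in a]    # step (2): w = a / sum(a), sums to 1
    f = [n * x for x in w]        # step (3): f = n * w, sums to n
    return f

# Sampling weights from the five-observation example in the thread.
wtpp = [60.74, 700.38, 139.64, 9671.57, 1545.32]
f = rescale_to_n(wtpp)

for weight, rescaled in zip(wtpp, f):
    print(f"{weight:10.2f} -> {rescaled:.6f}")
print("sum of rescaled weights:", sum(f))
```

The rescaled values keep the same ratios as wtpp; they sum to 5 up to floating-point rounding and need not be integers.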

On Mon, Oct 29, 2012 at 4:47 PM, Rita Luk <> wrote:
> Hi Statalist,
> Where can I find the computational details of analytic weights (aweight)?
> In User's Guide 20.22.2, it says: If you specify aweights, they are: 1. Normalized to sum to N and then 2. Inserted in the ... as fweights.
> What does it mean (as a formula) to normalize the weights to sum to N? Where can I find the formula for the normalization?
> I am working on point estimates of descriptive statistics (mean, median and histogram) using survey data and am not concerned with the variance at the moment. In particular, I want to use non-svy commands with weights (let's leave the svy commands for now). I read comments from Steven Joel Hirsch Samuels and Austin Nichols and know that I will arrive at the same weighted mean, median or histogram using either of the following 2 methods:
>        Suppose I have a survey data set with sampling weight variable wtpp (the sum of wtpp over the entire sample equals the total size of the target population)
> 1.  Convert the sampling weight to an integer and use it as a frequency weight:  gen double myfw=round(wtpp*100),  then tabstat xvar [fw=myfw], s(mean median)
> 2.  Use aweight:  tabstat xvar [aw=wtpp], s(mean median)
> I know these methods give the same estimates. But why do they give the same estimates? Hence my question about the formula for normalizing the weights given wtpp.
> I would appreciate it if anyone can help me with this.
> Rita Luk|University of Toronto|33 Russell St. T5. Toronto,ON. M5S2S1. Canada|T: (416) 535-8501 x4727
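
On the question of why the two methods agree, here is a minimal Python sketch showing that a weighted mean and a weighted median depend only on the relative sizes of the weights, so multiplying every weight by 100 and rounding changes nothing when the weights have two decimal places. The xvar values are invented for illustration, and the weighted median below uses a simple cumulative-weight definition that is an assumption, not necessarily the exact rule Stata applies.

```python
def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)

def weighted_median(x, w):
    # Assumed definition: smallest x whose cumulative weight reaches
    # at least half of the total weight.
    total = sum(w)
    cumulative = 0
    for xi, wi in sorted(zip(x, w)):
        cumulative += wi
        if 2 * cumulative >= total:
            return xi

# wtpp is taken from the thread; xvar is hypothetical.
wtpp = [60.74, 700.38, 139.64, 9671.57, 1545.32]
xvar = [3.0, 7.0, 5.0, 2.0, 9.0]

# Method 1: integer frequency weights, as in round(wtpp*100).
myfw = [round(wi * 100) for wi in wtpp]

# Method 2: the original weights used directly.
print(weighted_mean(xvar, myfw), weighted_mean(xvar, wtpp))
print(weighted_median(xvar, myfw), weighted_median(xvar, wtpp))
```

Any constant factor cancels between the numerator and the denominator of the mean, and the median depends only on cumulative shares of the total weight, so the normalization constant does not affect these point estimates.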



© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index