 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Weighted Averages

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Weighted Averages
Date: Mon, 17 Jan 2011 14:07:46 -0500

You are welcome, Christopher. You will find communication easier if you use standard terms. In survey usage, "N" is population size; "n" is sample size. (In some data sets, weights have been normalized to sum to "n", a practice I don't like.) Also, the proper term for Stata's default variance estimate in regression is "linearized". Its publication (Woodruff, 1971) preceded White's publication by over 10 years. And White's estimate, I believe, applied only to a standard generated model (mean + error term), not to a finite population sampling design with weights (not to mention clusters and strata).
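For readers who want to see the arithmetic, here is a minimal Python sketch of a linearized (Taylor series) standard error for a weighted mean. It assumes the simplest design only: probability weights, a single stratum, every observation its own PSU, and no finite population correction. The data, the function names, and the variable names are hypothetical and chosen for illustration; this is not a reproduction of Stata's code. It also shows that rescaling the weights to sum to "n" leaves both the mean and this standard error unchanged, although the rescaled weights no longer estimate "N".

```python
import math

def weighted_mean(y, w):
    """Weighted mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def linearized_se(y, w):
    """Linearized (Taylor series) SE of the weighted mean.

    Assumes a single stratum, each observation its own PSU, and
    with-replacement sampling (no finite population correction).
    """
    n = len(y)
    total_w = sum(w)
    ybar = weighted_mean(y, w)
    # Linearized score for each observation; the scores sum to zero.
    z = [wi * (yi - ybar) / total_w for wi, yi in zip(w, y)]
    variance = n / (n - 1) * sum(zi ** 2 for zi in z)
    return math.sqrt(variance)

# Hypothetical data: five responses with unequal probability weights.
y = [12.0, 15.0, 9.0, 20.0, 11.0]
w = [100.0, 250.0, 80.0, 400.0, 170.0]  # the sum estimates N

# The same weights rescaled so that they sum to n.
n = len(y)
w_norm = [wi * n / sum(w) for wi in w]

print(weighted_mean(y, w), linearized_se(y, w))
print(weighted_mean(y, w_norm), linearized_se(y, w_norm))
```

With equal weights the function reduces to the familiar s/sqrt(n), which is a quick sanity check on the formula.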

Steve

Woodruff, RS. 1971. A simple method for approximating the variance of a complicated estimate. Journal of the American Statistical Association : 411-414.
```

On Jan 17, 2011, at 1:34 PM, Christopher Steiner wrote:

Thanks Steve!

I noticed that with binary data, the discrepancy I received was near
the sqrt(DEFF), which I didn't realize Stata was accounting for.
Also, I did not realize that the formula I posted was the frequency
weight formula, so thanks for pointing that out.

In this particular application, the average of the weights is 1, so
summing them is equivalent to N.

This is my first time with survey data, so I'm learning fast.  I'm
confident now that Stata's doing things "correctly," and did a little
reading of the survey book last night.

Thanks,
Christopher Paul Steiner

```
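The sqrt(DEFF) remark can be illustrated with a small Python sketch that takes the ratio of a linearized standard error to a simple-random-sample benchmark (often called DEFT). It assumes one stratum, no clusters, and no finite population correction, and it uses one common form of the SRS benchmark, which may not match the exact expression Stata uses. The data and the function name `deft_for_mean` are hypothetical.

```python
import math

def deft_for_mean(y, w):
    """Ratio of the linearized SE to a simple-random-sample SE."""
    n = len(y)
    total_w = sum(w)
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearized variance (one stratum, no clusters, no FPC).
    z = [wi * (yi - ybar) / total_w for wi, yi in zip(w, y)]
    v_lin = n / (n - 1) * sum(zi ** 2 for zi in z)

    # SRS benchmark: weighted estimate of the population variance,
    # divided by the sample size n (not by the sum of the weights).
    s2 = n / (n - 1) * sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y)) / total_w
    v_srs = s2 / n

    return math.sqrt(v_lin / v_srs)

# Hypothetical binary outcome with weights that average 1.
y = [1, 0, 0, 1, 1, 0, 1, 0]
w = [0.4, 1.6, 0.7, 1.3, 0.5, 1.5, 0.9, 1.1]

print(deft_for_mean(y, w))
print(deft_for_mean(y, [1.0] * len(y)))  # equal weights
```

With equal weights the two variances coincide and the ratio is 1 up to rounding; unequal weights move it away from 1.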
On Mon, Jan 17, 2011 at 8:04 AM, Steven Samuels <sjsamuels@gmail.com> wrote:
```
Christopher:

```
After looking more closely at your formulas, typos aside, I think that you were trying to estimate the variance of the mean as:
```

(Estimated Population Variance)/(sum of weights)

```
This would be true only if you had a simple random sample with replacement and your weights were frequency weights, not probability weights. The sum of probability weights is an estimate of N. Dividing a variance by N would ordinarily make the standard error of the mean much too small. If yours are sometimes larger than the linearized variance estimates, you probably also made other mistakes in the formula or your calculations.
```

Steve
sjsamuels@gmail.com

On Jan 16, 2011, at 3:46 PM, Steven Samuels wrote:

Christopher:

```
The variance formula you present has little relation to the true formula, whether for sampling with or without replacement. See for example page 230 of Sharon Lohr. 2009. Sampling: Design and Analysis. Boston, MA: Cengage Brooks/Cole.
```

On Jan 15, 2011, at 8:02 PM, Christopher Steiner wrote:

Hello everyone:

I am computing some basic summary statistics with weighted means from
a weighted, but otherwise simple design survey.  When I use the
following commands:

svyset [pweight=weight2]
svy: reg fcost_1

I get a weighted average of "fcost_1" that matches my hand
calculation.  I also receive White "robust" standard errors, which is
fine.  However, when I do a hand calculation of regular standard
errors using the formula:

sigma^2 = [sum(weights*(x-xbar))/sum(weights)] * (N/N-1)

and then divide by sum(weights) to get the standard error, I often
receive *larger* standard errors than the robust estimate.  Is this a
function of the pweights?  Around 10% of the values are also missing,
so is it a function of this?  Or am I doing something incorrectly?

Thanks so much,
Christopher Paul Steiner

--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


```
```

--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
University of Illinois Alumnus, BS Mathematics & Economics
cpsteiner@gmail.com | cpsteiner@alumni.illinois.edu | c1steiner@ucsd.edu
(Note the number "1" instead of the "p" in the UCSD email address.)
<3!

```