 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Weighted Averages

| From | To | Subject | Date |
|---|---|---|---|
| Christopher Steiner | statalist@hsphsun2.harvard.edu | Re: st: Weighted Averages | Mon, 17 Jan 2011 10:34:41 -0800 |

```
Thanks Steve!

I noticed that with binary data, the discrepancy I received was near
the sqrt(DEFF), which I didn't realize Stata was accounting for.
Also, I did not realize that the formula I posted was the frequency
weight formula, so thanks for pointing that out.

In this particular application, the average of the weights is 1, so
their sum equals N.

This is my first time with survey data, so I'm learning fast.  I'm
confident now that Stata's doing things "correctly," and did a little
reading of the survey book last night.

Thanks,
Christopher Paul Steiner

On Mon, Jan 17, 2011 at 8:04 AM, Steven Samuels <sjsamuels@gmail.com> wrote:
> Christopher:
>
> After looking more closely at your formulas, typos aside, I think that you
> were trying to estimate the variance of the mean as:
>
> (Estimated Population Variance)/(sum of weights)
>
> This would be true only if you had a simple random sample with replacement
> and your weights were frequency weights, not probability weights.  The sum
> of probability weights is an estimate of N. Dividing a variance by N would
> ordinarily make the standard error of the mean much too small. If yours are
> sometimes larger than the linearized variance estimates, you probably also
>
> Steve
> sjsamuels@gmail.com
>
> On Jan 16, 2011, at 3:46 PM, Steven Samuels wrote:
>
> Christopher:
>
> The variance formula you present has little relation to the true formula,
> whether for sampling with or without replacement.  See for example page 230
> of Sharon Lohr. 2009. Sampling: Design and Analysis. Boston, MA: Cengage
> Brooks/Cole.
>
>
> On Jan 15, 2011, at 8:02 PM, Christopher Steiner wrote:
>
> Hello everyone:
>
> I am computing some basic summary statistics with weighted means from
> a weighted, but otherwise simple design survey.  When I use the
> following commands:
>
> svyset [pweight=weight2]
> svy: reg fcost_1
>
> I get a weighted average of "fcost_1" that matches my hand
> calculation.  I also receive White "robust" standard errors, which is
> fine.  However, when I do a hand calculation of regular standard
> errors using the formula:
>
> sigma^2 = [sum(weights*(x-xbar)^2)/sum(weights)] * (N/(N-1))
>
> and then divide by sum(weights) to get the standard error, I often
> receive *larger* standard errors than the robust estimate.  Is this a
> function of the pweights?  Around 10% of the values are also missing,
> so is it a function of this?  Or am I doing something incorrectly?
>
> Thanks so much,
> Christopher Paul Steiner
>
> --
> Christopher Paul Steiner
> Third Year Grad Student, Ph.D. Economics
> University of California, San Diego
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
University of Illinois Alumnus, BS Mathematics & Economics
cpsteiner@gmail.com | cpsteiner@alumni.illinois.edu | c1steiner@ucsd.edu
(Note the number "1" instead of the "p" in the UCSD email address.)
<3!

```
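The distinction Steve draws in the thread can be checked by hand. Below is a minimal sketch in Python (not Stata, so that it runs standalone) comparing two standard errors for a weighted mean: a with-replacement linearized estimator, and the frequency-weight-style formula from the original question. It assumes the simplest design discussed in the thread: probability weights only, one stratum, every observation its own PSU, no finite population correction. The function names and the data are made up for illustration, and the linearized formula is my understanding of what a `svy` mean reduces to under those assumptions, not a statement of Stata's internals.

```python
import math


def weighted_mean(y, w):
    """Weighted mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def linearized_se(y, w):
    """Linearized (Taylor-series) SE of the weighted mean.

    Assumes with-replacement sampling, a single stratum, and each
    observation as its own PSU. The linearized values are
    z_i = w_i * (y_i - ybar) / sum(w), which sum to zero, so the
    variance estimate is n/(n-1) * sum(z_i^2).
    """
    n = len(y)
    total_w = sum(w)
    ybar = weighted_mean(y, w)
    z = [wi * (yi - ybar) / total_w for yi, wi in zip(y, w)]
    return math.sqrt(n / (n - 1) * sum(zi ** 2 for zi in z))


def frequency_style_se(y, w):
    """SE from the formula in the original question.

    sigma^2 = [sum(w*(y-ybar)^2) / sum(w)] * n/(n-1), then divided by
    sum(w). This treats the weights as if they were frequency weights.
    """
    n = len(y)
    total_w = sum(w)
    ybar = weighted_mean(y, w)
    sigma2 = sum(wi * (yi - ybar) ** 2 for yi, wi in zip(y, w)) / total_w
    sigma2 *= n / (n - 1)
    return math.sqrt(sigma2 / total_w)


# Hypothetical data: six observations whose weights average 1,
# so sum(w) equals n, as in the application described in the thread.
y = [12.0, 15.0, 9.0, 30.0, 22.0, 18.0]
w = [0.5, 0.5, 0.5, 2.0, 1.5, 1.0]

print("weighted mean:       ", weighted_mean(y, w))
print("linearized SE:       ", linearized_se(y, w))
print("frequency-style SE:  ", frequency_style_se(y, w))
```

With all weights equal to 1 the two functions return the same value; they diverge once the weights vary, because the linearized form involves the squared weights while the frequency-style form involves the weights to the first power. Which of the two is larger depends on how the weights relate to the deviations from the mean.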