Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Weighted Averages

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: Weighted Averages
Date	Mon, 17 Jan 2011 14:07:46 -0500

You are welcome, Christopher. You will find communication easier if you use standard terms. In survey usage, "N" is population size; "n" is sample size. (In some data sets, weights have been normalized to sum to "n", a practice I don't like.) Also, the proper term for Stata's default variance estimate in regression is "linearized". Its publication (Woodruff, 1971) preceded White's publication by over 10 years. And White's estimate, I believe, applied only to a standard generated model (mean + error term), not to a finite population sampling design with weights (not to mention clusters and strata).


 Steve

Woodruff, RS. 1971. A simple method for approximating the variance of a complicated estimate. Journal of the American Statistical Association: 411-414.
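The remark about weights normalized to sum to "n" can be checked by hand. The following is a minimal Python sketch, not part of the original thread, with made-up data. It assumes the simplest design: with-replacement sampling, a single stratum, every observation its own PSU, and no finite population correction. The function name `linearized_se` is my own. Rescaling the weights changes what their sum estimates (N versus n) but leaves the weighted mean and the linearized standard error unchanged.

```python
import math

def weighted_mean(y, w):
    """Weighted mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def linearized_se(y, w):
    """Linearized (Taylor series) SE of the weighted mean.

    Assumes with-replacement sampling, one stratum, each observation
    its own PSU, and no finite population correction.
    """
    n = len(y)
    W = sum(w)
    ybar = weighted_mean(y, w)
    # Weighted linearized residuals; they sum to zero by construction.
    z = [wi * (yi - ybar) / W for wi, yi in zip(w, y)]
    return math.sqrt(n / (n - 1) * sum(zi ** 2 for zi in z))

# Made-up data: probability weights that sum to an estimated N of 10,000.
y = [10.0, 12.0, 9.0, 15.0, 11.0]
w = [2000.0, 1500.0, 2500.0, 1000.0, 3000.0]

# Normalized weights: same relative sizes, rescaled to sum to n.
n = len(y)
w_norm = [wi * n / sum(w) for wi in w]

print("sum of raw weights (estimate of N):", sum(w))
print("sum of normalized weights (equals n):", sum(w_norm))
print("means:", weighted_mean(y, w), weighted_mean(y, w_norm))
print("linearized SEs:", linearized_se(y, w), linearized_se(y, w_norm))
```

Because the mean and its linearized SE do not depend on the scale of the weights, normalization only destroys the information that the weights sum to an estimate of N.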




On Jan 17, 2011, at 1:34 PM, Christopher Steiner wrote:

Thanks Steve!

I noticed that with binary data, the discrepancy I received was near
the sqrt(DEFF), which I didn't realize Stata was accounting for.
Also, I did not realize that the formula I posted was the frequency
weight formula, so thanks for pointing that out.
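For readers who want a rough feel for a design effect caused only by unequal weights, one common rule of thumb is Kish's approximation, deff = n * sum(w^2) / (sum(w))^2. This is an assumption brought in for illustration: it is not stated anywhere in the thread, and it is not how Stata computes the DEFF it reports. The Python sketch below uses made-up weights, and the function name `kish_deff` is hypothetical.

```python
import math

def kish_deff(w):
    """Kish's approximate design effect from unequal weighting.

    Equals 1 + (coefficient of variation of the weights)^2, and is 1
    when all weights are equal. It ignores clustering, stratification,
    and any relationship between the weights and the outcome.
    """
    n = len(w)
    return n * sum(wi ** 2 for wi in w) / sum(w) ** 2

# Made-up probability weights.
w = [2000.0, 1500.0, 2500.0, 1000.0, 3000.0]

deff = kish_deff(w)
print("approximate DEFF:", deff)
print("approximate sqrt(DEFF):", math.sqrt(deff))
```

The square root is the factor by which this approximation inflates a simple-random-sampling standard error.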

In this particular application, the average of the weights is 1, so
summing them is equivalent to N.

This is my first time with survey data, so I'm learning fast.  I'm
confident now that Stata's doing things "correctly," and did a little
reading of the survey book last night.

Thanks,
Christopher Paul Steiner

On Mon, Jan 17, 2011 at 8:04 AM, Steven Samuels <[email protected]> wrote:

Christopher:

After looking more closely at your formulas, typos aside, I think that you were trying to estimate the variance of the mean as:

(Estimated Population Variance)/(sum of weights)

This would be true only if you had a simple random sample with replacement and your weights were frequency weights, not probability weights. The sum of probability weights is an estimate of N. Dividing a variance by N would ordinarily make the standard error of the mean much too small. If yours are sometimes larger than the linearized variance estimates, you probably also made other mistakes in the formula or your calculations.
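To make the point about dividing by the sum of the weights concrete, here is a minimal Python sketch with made-up data; it is not part of the original thread. It compares the "variance / (sum of weights)" calculation with a linearized standard error for the weighted mean. The linearized formula assumes with-replacement sampling, a single stratum, every observation its own PSU, and no finite population correction. The function names are my own.

```python
import math

def weighted_mean(y, w):
    """Weighted mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def naive_se(y, w):
    """sqrt of (estimated population variance) / (sum of weights).

    Appropriate only for frequency weights from a simple random
    sample with replacement, not for probability weights.
    """
    W = sum(w)
    ybar = weighted_mean(y, w)
    s2 = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y)) / W
    return math.sqrt(s2 / W)

def linearized_se(y, w):
    """Linearized (Taylor series) SE of the weighted mean under the
    assumptions stated above."""
    n = len(y)
    W = sum(w)
    ybar = weighted_mean(y, w)
    # Weighted linearized residuals; they sum to zero by construction.
    z = [wi * (yi - ybar) / W for wi, yi in zip(w, y)]
    return math.sqrt(n / (n - 1) * sum(zi ** 2 for zi in z))

# Made-up data: n = 5 observations whose probability weights sum to
# an estimated population size of 10,000.
y = [10.0, 12.0, 9.0, 15.0, 11.0]
w = [2000.0, 1500.0, 2500.0, 1000.0, 3000.0]

print("weighted mean:", weighted_mean(y, w))
print("naive SE (divide by sum of weights):", naive_se(y, w))
print("linearized SE:", linearized_se(y, w))
```

With these numbers the naive figure is far smaller than the linearized one, because the divisor is an estimate of N (10,000) rather than anything related to the sample size of 5.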

Steve
[email protected]

On Jan 16, 2011, at 3:46 PM, Steven Samuels wrote:

Christopher:

The variance formula you present has little relation to the true formula, whether for sampling with or without replacement. See for example page 230 of Sharon Lohr. 2009. Sampling: Design and Analysis. Boston, MA: Cengage Brooks/Cole.


On Jan 15, 2011, at 8:02 PM, Christopher Steiner wrote:

Hello everyone:

I am computing some basic summary statistics with weighted means from
a weighted, but otherwise simple design survey.  When I use the
following commands:

svyset [pweight=weight2]
svy: reg fcost_1

I get a weighted average of "fcost_1" that matches my hand
calculation.  I also receive White "robust" standard errors, which is
fine.  However, when I do a hand calculation of regular standard
errors using the formula:

sigma^2 = [sum(weights*(x-xbar))/sum(weights)] * (N/N-1)

and then divide by sum(weights) to get the standard error, I often
receive *larger* standard errors than the robust estimate.  Is this a
function of the pweights?  Around 10% of the values are also missing,
so is it a function of this?  Or am I doing something incorrectly?

Thanks so much,
Christopher Paul Steiner

--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/






--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
University of Illinois Alumnus, BS Mathematics & Economics
[email protected] | [email protected] | [email protected]
(Note the number "1" instead of the "p" in the UCSD email address.)
<3!

