Steve Samuels <sjsamuels@gmail.com>

statalist@hsphsun2.harvard.edu

Re: st: Standard error of the weighted mean

Sat, 4 May 2013 19:04:07 -0400

Nick:

On May 3, 2013, at 1:03 PM, nick bungy wrote:

> I am having trouble finding the correct way to calculate the standard error of a probability-weighted mean. I can successfully calculate the sample standard deviation, as taken from:
>
> s^2 = {n/[W(n - 1)]} sum w_i (x_i - xbar)^2
>
> The Stata article suggests that the standard error of the weighted mean is then simply:
>
> V_srs = s^2/n
>
> where V_srs is an 'estimate of the variance of muhat [mean] assuming a simple random sample of the same number of observations'.

The article doesn't suggest that at all. The first formula shows how to estimate the population variance from a weighted sample. The second formula gives the variance (squared SE) for the mean of a simple random sample without replacement, a mean that is not weighted. As the article states, the standard error for the weighted mean "is pretty complicated; see [SVY] variance estimation for details." Formula 1 (p. 180) of the Stata 12 Survey manual, specialized to a single stratum, shows that the formula involves the square of the weights and has no connection to the population variance or to the formula for s^2 above.

Steve

> However, when I calculate V_srs and compare it to the linearized standard error as taken from the svy: commands, the linearized standard error is significantly larger.
>
> Since my calculations and Stata agree on s^2, presumably I must be calculating V_srs (variance of the mean) incorrectly.
>
> Can anyone shed some light on where I am going wrong? Here's an example of the disparity between my mata code and the official svy command.
>
> sysuse auto
> gen pweight = trunk^2 + 1
>
> mata
> st_view(y = ., ., "mpg")
> st_view(weight = ., ., "pweight")
>
> Weighted_mean = colsum(y:*weight) / colsum(weight)
> Weighted_var = rows(y)/(colsum(weight)*(rows(y)-1))*colsum(weight:*(y :- Weighted_mean):^2)
> Weighted_sd = sqrt(Weighted_var)
> SD_Mean = sqrt(Weighted_var / rows(y))
>
> Weighted_sd
> SD_Mean
>
> end
>
> svyset [pweight = pweight]
> svy: mean mpg
> estat sd

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
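
To make the role of the squared weights concrete, here is a minimal Mata sketch of the linearized variance of the weighted mean. It assumes a single stratum, every observation treated as its own PSU, and no finite population correction, which is the design that -svyset [pweight = pweight]- declares. The names w, z, V_lin and SE_lin are illustrative and are not taken from the manual. Under those assumptions SE_lin should agree with the linearized standard error reported by -svy: mean mpg-.

    sysuse auto, clear
    gen pweight = trunk^2 + 1

    mata
    st_view(y = ., ., "mpg")
    st_view(w = ., ., "pweight")

    n    = rows(y)
    W    = colsum(w)
    ybar = colsum(w :* y) / W

    // weighted score for the ratio sum(w*y)/sum(w); these sum to zero
    z = w :* (y :- ybar) / W

    // assumption: one stratum, observations are the PSUs, no fpc
    V_lin  = n / (n - 1) * colsum(z:^2)
    SE_lin = sqrt(V_lin)
    SE_lin
    end

    svyset [pweight = pweight]
    svy: mean mpg

Because z contains w, the variance depends on the sum of w^2 (y - ybar)^2, so it cannot be obtained by dividing the weighted s^2 by n.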

