Re: st: Aggregated Weighted Summary Statistics Using Probability Weights

From: Steve Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Aggregated Weighted Summary Statistics Using Probability Weights
Date: Tue, 13 Jul 2010 22:58:32 -0400

```
Lindsay,

You do not show us actual commands or results, as requested by the FAQ:

"Say exactly what you typed and exactly what Stata typed (or did) in
response. N.B. exactly! If you can, reproduce the error with one of
Stata's provided datasets or a simple concocted dataset that you
include in your posting."
In the following example, the discrepancy between #1 and #2, #3, and
#4 is absent.
Steve

*****************************************
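sysuse auto, clear
* (1) weighted summary by group; rep78 stands in for the weight
bys foreign: sum mpg [aw=rep78]       //1
* (2) the same summary restricted to one group
sum mpg if foreign==0 [aw=rep78]      //2

sysuse auto, clear
svyset _n [pweight=rep78]
* (3) survey mean for the same group, then its standard deviation
svy: mean mpg if foreign==0           //3
estat sd

* (4) standard deviation computed by hand from the SRS variance
di sqrt(e(N) * el(e(V_srs),1,1))      //4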
*****************************************

On Tue, Jul 13, 2010 at 11:17 AM, Lindsay Newman <lrshorr@gmail.com> wrote:
> I am using survey data with probability weights.  I want to compute
> various summary statistics, including the mean and standard deviation,
> of the data at an aggregated level.  In particular, I want to use
> individual responses to certain questions to calculate the country
> year weighted mean and standard deviation of the response.  For
> instance, if 200 individuals responded to a particular question, what
> is the weighted average response for that country year?  What is the
> weighted standard deviation of the responses for that country year?
>
> When I sort by country year and use the following code:
>
> (1) by countryyear: summarize (response variable) [aw=weight variable]
>
> I get different results for the standard deviations than when I either run:
>
> (2) summarize (response variable) if countryyear ==x [aw=weight variable]
>
> or when I calculate the standard deviation manually using:
>
> (3)  di sqrt(e(N) * el(e(V_srs),1,1))
>
>
> When I analyze the responses for just one country year (i.e. deleting
> all but responses from a single country year) using:
>
> (4) svy: mean (response variable), followed by estat sd,
>
> the standard deviations match 2 and 3 but not 1.  Why is this?
>
> Thank you.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783

```
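For reference, here is a sketch of the weighted mean and standard deviation that `summarize` is generally documented to report with analytic weights. The symbols x_i, w_i, and n are notation introduced here, not taken from the post, and the formula assumes the weights are rescaled to sum to the number of observations in the group being summarized.

```latex
% Notation introduced here (not from the post): x_i = response,
% w_i = weight, n = number of observations in the group summarized.
\[
  \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i},
  \qquad
  s_w^2 = \frac{n}{n-1}\,
          \frac{\sum_{i=1}^{n} w_i \,(x_i - \bar{x}_w)^2}{\sum_{i=1}^{n} w_i}
\]
```

Under this assumption, every sum runs only over the observations in the group, so a `by` group and the matching `if` restriction should give the same standard deviation, as in the auto example. A discrepancy would then suggest that the two commands are not using the same observations, for example because of missing weights or a different selection of rows.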