# Re: st: Aggregated Weighted Summary Statistics Using Probability Weights

From: Steve Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Aggregated Weighted Summary Statistics Using Probability Weights
Date: Tue, 13 Jul 2010 22:59:37 -0400

```
Please remove the "and" from the example before running!

On Tue, Jul 13, 2010 at 10:58 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
>
> Lindsay Newman--
>
> You do not show us actual commands or results, as requested by the FAQ:
>
> "Say exactly what you typed and exactly what Stata typed (or did) in
> response. N.B. exactly! If you can, reproduce the error with one of
> Stata's provided datasets or a simple concocted dataset that you
> include in your posting."
>
> In the following example, the discrepancy between #1 and #2, #3, and
> #4 is absent.
> Steve
>
> *****************************************
> sysuse auto, clear
> bys foreign: sum mpg [aw=rep78]   //1
>
> sum mpg if foreign==0 [aw=rep78]  //2
>
> sysuse auto, clear
> svyset _n [pweight=rep78]
> svy: mean mpg if foreign==0       //3
> estat sd
>
> di sqrt(e(N) * el(e(V_srs),1,1))     //4
> *****************************************
>
>
>
> On Tue, Jul 13, 2010 at 11:17 AM, Lindsay Newman <lrshorr@gmail.com> wrote:
>> I am using survey data with probability weights.  I want to compute
>> various summary statistics, including the mean and standard deviation,
>> of the data at an aggregated level.  In particular, I want to use
>> individual responses to certain questions to calculate the country
>> year weighted mean and standard deviation of the response.  For
>> instance, if 200 individuals responded to a particular question, what
>> is the weighted average response for that country year?  What is the
>> weighted standard deviation of the responses for that country year?
>>
>> When I sort by country year and use the following code:
>>
>> (1) by countryyear: summarize (response variable) [aw=weight variable]
>>
>> I get different results for the standard deviations than when I either run:
>>
>> (2) summarize (response variable) if countryyear ==x [aw=weight variable]
>>
>> or when I calculate the standard deviation manually using:
>>
>> (3)  di sqrt(e(N) * el(e(V_srs),1,1))
>>
>>
>> When I analyze the responses for just one country year (i.e. deleting
>> all but responses from a single country year) using:
>>
>> (4) svy: mean (response variable), followed by estat sd,
>>
>> the standard deviations match 2 and 3 but not 1.  Why is this?
>>
>> Thank you.
>>
>
>
>
> --
> Steven Samuels
> sjsamuels@gmail.com
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> Voice: 845-246-0774
> Fax:    206-202-4783
>
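
* ---------------------------------------------------------------
* A minimal sketch, added for illustration and not part of the posts
* above, of how the standard deviation in comparisons //1 and //2 can
* be reproduced by hand. It assumes the usual aweight convention for
* -summarize-: the weights are rescaled to sum to the number of
* observations n, so that
*     Var = (n/(n-1)) * sum(w*(x - xbar)^2) / sum(w)
* The names sd_stata, sd_manual, n, sw, xbar, and wdev2 are
* illustrative choices, not anything used in the thread.
* ---------------------------------------------------------------
sysuse auto, clear
keep if foreign == 0 & !missing(rep78, mpg)

* Reference value from -summarize- with analytic weights
quietly summarize mpg [aw=rep78]
scalar sd_stata = r(sd)
scalar n        = r(N)
scalar sw       = r(sum_w)
scalar xbar     = r(mean)

* Weighted sum of squared deviations about the weighted mean
generate double wdev2 = rep78 * (mpg - scalar(xbar))^2
quietly summarize wdev2
scalar sd_manual = sqrt((scalar(n) / (scalar(n) - 1)) * r(sum) / scalar(sw))

* The two values should agree if the assumed formula is the one in use
display "summarize: " sd_stata "   by hand: " sd_manual
assert reldif(sd_stata, sd_manual) < 1e-6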

