



Re: st: Aggregated Weighted Summary Statistics Using Probability Weights


From   Steve Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Aggregated Weighted Summary Statistics Using Probability Weights
Date   Tue, 13 Jul 2010 22:58:32 -0400

Lindsay,

You do not show us your actual commands or results, as the FAQ requests:

"Say exactly what you typed and exactly what Stata typed (or did) in
response. N.B. exactly! If you can, reproduce the error with one of
Stata's provided datasets or a simple concocted dataset that you
include in your posting."

In the following example, the discrepancy you describe does not appear:
#1 agrees with #2, #3, and #4.
Steve

*****************************************
sysuse auto, clear
bys foreign: sum mpg [aw =rep78]  //1
* and
sum mpg if foreign==0 [aw=rep78]  //2

sysuse auto, clear
svyset _n [pweight=rep78]
svy: mean mpg if foreign==0       //3
estat sd

di sqrt(e(N) * el(e(V_srs),1,1))     //4
*****************************************
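To put the numbers side by side rather than reading them off separate outputs, the example can be extended along these lines. This is only a sketch: the scalar names (sd_aw, sd_svy, sd_manual) and the matrix name S are my own, and it assumes that -estat sd- leaves the standard deviation in the matrix r(sd).

```stata
* Sketch: collect the standard deviations from the example above and show them together.
* Scalar and matrix names are illustrative, not part of the original post.
sysuse auto, clear

* #1: by-group summarize; compare the Domestic block with the scalars below
bysort foreign: summarize mpg [aw=rep78]

* #2: summarize on the subset with analytic weights; r(sd) is the weighted SD
quietly summarize mpg if foreign==0 [aw=rep78]
scalar sd_aw = r(sd)

* #3 and #4: survey mean, then the manual formula and -estat sd-
quietly svyset _n [pweight=rep78]
quietly svy: mean mpg if foreign==0
scalar sd_manual = sqrt(e(N) * el(e(V_srs),1,1))   // #4, computed from e() results
quietly estat sd
matrix S = r(sd)                                   // assumed: 1 x 1 matrix of the SD
scalar sd_svy = S[1,1]                             // #3

display "aw: " sd_aw "   svy: " sd_svy "   manual: " sd_manual
```

If the same comparison on your own data still shows #1 differing from the others, posting that exact log would let us see where the results diverge.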



On Tue, Jul 13, 2010 at 11:17 AM, Lindsay Newman <lrshorr@gmail.com> wrote:
> I am using survey data with probability weights.  I want to compute
> various summary statistics, including the mean and standard deviation,
> of the data at an aggregated level.  In particular, I want to use
> individual responses to certain questions to calculate the country
> year weighted mean and standard deviation of the response.  For
> instance, if 200 individuals responded to a particular question, what
> is the weighted average response for that country year?  What is the
> weighted standard deviation of the responses for that country year?
>
> When I sort by country year and use the following code:
>
> (1) by countryyear: summarize (response variable) [aw=weight variable]
>
> I get different results for the standard deviations than when I either run:
>
> (2) summarize (response variable) if countryyear ==x [aw=weight variable]
>
> or when I calculate the standard deviation manually using:
>
> (3)  di sqrt(e(N) * el(e(V_srs),1,1))
>
>
> When I analyze the responses for just one country year (i.e. deleting
> all but responses from a single country year) using:
>
> (4) svy: mean (response variable) estat sd,
>
> the standard deviations match 2 and 3 but not 1.  Why is this?
>
> Thank you.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
Voice: 845-246-0774
Fax:    206-202-4783


