Stata The Stata listserver

st: Re: Weighted number of observations


From   "Michael Blasnik" <[email protected]>
To   <[email protected]>
Subject   st: Re: Weighted number of observations
Date   Fri, 30 Jul 2004 13:58:25 -0400

This behavior does seem a little unexpected, but the result shown matches
what -summarize- returns in r(sum) after a weighted summarize command:
the weights are not normalized, so the sum is inflated by a factor equal to
the mean of the weight. One work-around is to divide the weight by its mean
(either by transforming the variable or by just calculating its mean). For
example:

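* mean of the weight variable over the sample
summarize weight
scalar meanw = r(mean)
* rescale the weight to mean 1 before asking for the weighted sum
tabstat foreign [aw=weight/meanw], stat(sum)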

Of course, in real problems, you would need to make sure that the mean of
the weight is computed over the same set of observations that -tabstat-
uses (for example, after any -if- restriction or missing values).  You could
probably write a short wrapper program to normalize the weight.
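
A wrapper along those lines might look like the sketch below. The program
name -nsum- and the returned r(sum) are made up for illustration, and the
code is untested here; it assumes analytic weights only and uses -marksample-
so that the mean weight and the weighted sum come from the same observations.

```stata
* Hypothetical wrapper: -nsum- is an illustrative name, not a Stata command.
* It rescales the analytic weight to mean 1 over the sample actually used,
* then reports the weighted sum of one numeric variable.
capture program drop nsum
program define nsum, rclass
    version 8
    syntax varname(numeric) [if] [in] [aweight]

    * mark the sample: drops obs. excluded by if/in, with a missing value
    * of the variable, or with a missing or zero weight
    marksample touse

    tempvar w
    if "`weight'" == "" {
        * no weight given: every observation counts once
        quietly generate double `w' = 1 if `touse'
    }
    else {
        * `exp' holds "= <weight expression>" as typed by the user
        quietly generate double `w' `exp' if `touse'
    }

    * normalize the weight so that its mean over the sample is 1
    quietly summarize `w' if `touse', meanonly
    quietly replace `w' = `w' / r(mean)

    * weighted sum with the normalized weight
    quietly summarize `varlist' [aweight = `w'] if `touse'
    display as text "normalized weighted sum of `varlist' = " as result r(sum)
    return scalar sum = r(sum)
end
```

With the auto data, typing

sysuse auto, clear
nsum foreign [aw=weight]

should give roughly the 16.87 that -tab- and -collapse- report, if the
sketch behaves as intended.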

Michael Blasnik
[email protected]


----- Original Message ----- 
From: "Friedrich Huebler" <[email protected]>
To: <[email protected]>
Sent: Friday, July 30, 2004 1:35 PM
Subject: st: Weighted number of observations


> I am working with binary variables that take the values 0 or 1. I
> would like to generate the weighted number of observations that are
> equal to 1. The number obtained with -tabstat- is different from that
> obtained with -tab- or -collapse-. How can this difference be
> explained?
>
> . sysuse auto, clear
> . label drop origin
> . tab foreign
>
>    Car type |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |         52       70.27       70.27
>           1 |         22       29.73      100.00
> ------------+-----------------------------------
>       Total |         74      100.00
>
> . tabstat foreign, stat(sum)
>
>     variable |       sum
> -------------+----------
>      foreign |        22
> ------------------------
>
> So far, so good, the unweighted number of observations=1 is the same.
> Now let's add a weight.
>
> . tab foreign [aw=weight]
>
>    Car type |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 | 57.1261189       77.20       77.20
>           1 | 16.8738811       22.80      100.00
> ------------+-----------------------------------
>       Total |         74      100.00
>
> . tabstat foreign [aw=weight], stat(sum)
>
>     variable |       sum
> -------------+----------
>      foreign |     50950
> ------------------------
>
> According to -tab-, the weighted number of observations=1 is about
> 16.9. According to -tabstat-, the number is 50950. With -collapse- we
> get the same result as with -tab-.
>
> . collapse (sum) foreign [aw=weight]
> . clist, noobs
>
>  foreign
>  16.8739
>
> Should the results obtained with -tab-, -tabstat-, and -collapse- not
> be the same? Many thanks for your comments.
>
> Friedrich Huebler




