
st: RE: Re: Weighted number of observations


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Re: Weighted number of observations
Date   Tue, 3 Aug 2004 15:34:00 +0100

This isn't a -tabstat- issue as such. -tabstat- 
just passes the buck to -summarize-, which 
behaves in the same way. 

The issue is delicate, but hinges, I surmise, 
on this distinction. When -summarize- (e.g.) _uses_ 
the sum of the weights, it rescales first. 
As quoted, [U] 14.1.6 includes the expression 
"when it uses them", which thus appears not 
ornamental, but crucial. 

When -summarize- _displays_ the sum of the 
weights, it displays the unscaled sum.
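
The distinction can be checked directly. What follows is only a
minimal sketch, assuming the auto data shipped with Stata; the
variable name w10 and the factor 10 are arbitrary choices made for
illustration. If the rescaling works as described, the reported sum
of weights should change with the scale of the weights, while the
mean and standard deviation should not.

```stata
* Sketch: what -summarize- displays versus what it uses under aweights.
sysuse auto, clear

* Original weights: r(sum_w) is the sum of weights as recorded.
summarize foreign [aw=weight]
display "N              = " r(N)
display "sum of weights = " r(sum_w)
display "mean           = " r(mean)
display "sd             = " r(sd)

* Same weights multiplied by an arbitrary constant (w10 is hypothetical).
generate double w10 = 10*weight
summarize foreign [aw=w10]
display "N              = " r(N)
display "sum of weights = " r(sum_w)
display "mean           = " r(mean)
display "sd             = " r(sd)
```

r(sum_w) is the saved result that corresponds to the Weight column in
the -summarize- output, so comparing it with r(N) shows the unscaled
sum next to the number of observations.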

Nick 
n.j.cox@durham.ac.uk 

Friedrich Huebler wrote:
 
> Dear Toyoto,
> 
> The auto data has 22 observations with foreign=1, not 50950. In [R]
> tabstat we read: "aweights and fweights are allowed; see [U] 14.1.6
> weight." [U] 14.1.6 states that most Stata commands rescale the
> aweights to sum to N. However, -tabstat- does not rescale the
> weights.  The fact that -tabstat- treats aweights the same way as
> fweights is not clearly documented. One could also argue that this
> behavior is inconsistent.
> 
> Friedrich Huebler
> 
> --- Toyoto Iwata <iwata@med.akita-u.ac.jp> wrote:
> > Dear Friedrich Huebler
> > 
> > You wrote about
> > 
> > > . tabstat foreign [aw=weight], stat(sum)
> > >
> > > variable | sum
> > > -------------+----------
> > > foreign | 50950
> > > ------------------------
> > 
> > Perhaps I miss the point, but,
> > 
> > . gen eachmean = foreign*weight
> > 
> > . gen sumeachmean = sum(eachmean)  /* I don't know the mean of this. */
> > 
> > . list sumeachmean in l
> > 
> >      +----------+
> >      | sumeac~n |
> >      |----------|
> >  74. |    50950 |
> >      +----------+
> > 
> > This seems to agree with the definition of the aweight.
> > 
> > [Online help says,]
> > 
> > aweights, or analytic weights, are weights that are inversely 
> > proportional to the variance of an observation; i.e., 
> > the variance of the j-th observation is assumed to be sigma^2/w_j, 
> > where w_j are the weights.  
> > Typically, the observations represent averages and the weights are 
> > the number of elements that gave rise to the average.  
> > 
> > 
> > > The Stata User's Guide states in section 14.1.6: "For most Stata
> > > commands, the recorded scale of aweights is irrelevant; Stata
> > > internally rescales them to sum to N, the number of observations in
> > > your data, when it uses them." It would be useful if the Stata
> > > documentation could make clear which commands don't use aweights
> > > in this manner.
> > > 
