


AW: st: Spss's aggregate vs stata's collapse.

From   "Kaulisch, Marc" <[email protected]>
To   <[email protected]>
Subject   AW: st: Spss's aggregate vs stata's collapse.
Date   Wed, 13 Apr 2011 17:51:49 +0200


SPSS looks inconsistent here. A look at the tutorial gives a clear insight into the mess:
"SPSS automatically rounds weighted frequencies to the nearest integer. This rounding is done by default on the total weighted frequency, not on individual weights." (p. 15)
"Rounding off these decimals is not an indifferent matter. Both 0.80 and 1.45 will be rounded to 1,
which kills the very purpose of weighting. Suppose the weight of some cases is 0.30. If one case
with weight 0.30 appears in a cell, the weighted (and rounded) total will be zero cases in that cell.
If a second cell includes two cases from the same stratum, their weighted total will be 0.60,
rounded to one case." (p. 16)
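
The arithmetic in the quoted passage can be checked with a minimal Python sketch. The function names are hypothetical, and the tie rule (halves round upward) is an assumption, since the tutorial excerpt does not say how exact halves are handled. This only reproduces the numbers in the quote; it is not a description of SPSS internals.

```python
import math

def round_half_up(x: float) -> int:
    """Round to the nearest integer, ties upward (assumed tie rule)."""
    return math.floor(x + 0.5)

def cell_count_round_total(weights):
    """Sum the case weights in a cell, then round the total once."""
    return round_half_up(sum(weights))

def cell_count_round_each(weights):
    """Round every case weight first, then sum (shown for contrast)."""
    return sum(round_half_up(w) for w in weights)

one_case = [0.30]          # one case with weight 0.30 in a cell
two_cases = [0.30, 0.30]   # two cases from the same stratum

print(cell_count_round_total(one_case))    # weighted, rounded total for one case
print(cell_count_round_total(two_cases))   # weighted, rounded total for two cases
print(cell_count_round_each(two_cases))    # what rounding individual weights would give
print(round_half_up(0.80), round_half_up(1.45))  # both totals collapse to the same count
```

Under these assumptions the single case vanishes from its cell, while the two-case cell counts as one case, which matches the tutorial's description.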

And in some procedures, such as CROSSTABS (the counterpart of Stata's tab), you can turn off the rounding of the weights ;-)

Really strange, this...


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Ulrich Kohler
Sent: Wednesday, 13 April 2011 15:20
To: [email protected]
Subject: Re: st: Spss's aggregate vs stata's collapse.

On Wednesday, 13.04.2011, at 12:13 +0100, Brendan Halpin wrote:
> On Wed, Apr 13 2011, Amadou DIALLO wrote:
> > Brendan, Uli,
> > Thanks for the answers. Yes, it has to do with the weights. Removing
> > them yields the same results. Apparently SPSS rounds non-integer
> > weights to the nearest integer (the total weighted frequency, not the
> > individual weights (sic!)):
> >
> SPSS is doing the wrong thing here, then. 
> > I've tried Brendan's solution but this is not working. So far, I 
> > can't duplicate results and am stuck. Will continue checking.
> If you really need to duplicate the results, you need to replicate 
> SPSS's "error". It may be enough to round the weight yourself.
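
As a rough illustration of the suggestion to round the weight yourself, the following Python sketch compares a weighted mean by group computed with the weights as given against one computed with weights rounded beforehand. The data, the function names, and the round-half-up rule are all hypothetical, and nothing here is claimed to match what SPSS's AGGREGATE actually does.

```python
import math
from collections import defaultdict

def round_half_up(x: float) -> int:
    """Round to the nearest integer, ties upward (assumed tie rule)."""
    return math.floor(x + 0.5)

def weighted_means(rows, round_weights=False):
    """rows: iterable of (group, value, weight).

    Returns {group: weighted mean}, or None for a group whose
    weights sum to zero after rounding.
    """
    num = defaultdict(float)  # sum of weight * value per group
    den = defaultdict(float)  # sum of weights per group
    for group, value, weight in rows:
        w = round_half_up(weight) if round_weights else weight
        num[group] += w * value
        den[group] += w
    return {g: (num[g] / den[g] if den[g] else None) for g in den}

# Made-up data: (group, value, weight)
rows = [("a", 10.0, 0.4), ("a", 20.0, 1.6),
        ("b", 5.0, 0.3), ("b", 7.0, 0.3)]

raw = weighted_means(rows)                          # weights used as given
rounded = weighted_means(rows, round_weights=True)  # weights rounded first
print(raw)
print(rounded)
```

In this toy data, rounding changes the mean of group "a" and leaves group "b" with no usable cases at all, which is one way results could diverge from an analysis that keeps fractional weights.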

I know that this is not an SPSS list; however, I'm still puzzled about what "rounding to the nearest integer" really means here. If a sampling weight has been rescaled such that the sum of the weights equals the number of observations, there will be quite a number of weights below 0.5. Are they then "rounded" to 0, meaning that those cases are dropped from the analysis? Or is zero not an integer value? Or do we use the geometric mean or the harmonic mean of two consecutive integers as the threshold for rounding, or what?
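
The scenario in this question can be made concrete with a small Python sketch. It rescales some made-up sampling weights so that they sum to the number of observations and then compares rounding each weight with rounding only the total, which is what the tutorial excerpt quoted at the top of this message describes. The weights, the function names, and the round-half-up rule are assumptions for illustration only.

```python
import math

def round_half_up(x: float) -> int:
    """Round to the nearest integer, ties upward (assumed tie rule)."""
    return math.floor(x + 0.5)

def rescale(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

raw_weights = [150.0, 200.0, 150.0, 1000.0, 1000.0]  # made-up sampling weights
w = rescale(raw_weights)                             # roughly 0.3, 0.4, 0.3, 2.0, 2.0

rounded_each = [round_half_up(x) for x in w]         # round every weight separately
dropped = sum(1 for r in rounded_each if r == 0)     # cases that would get weight 0
rounded_total = round_half_up(sum(w))                # round only the weighted total

print(rounded_each, dropped, rounded_total)
```

If individual weights were rounded, every case with a rescaled weight below 0.5 would effectively drop out; rounding only the total keeps the overall case count but, as the tutorial notes, can still zero out sparsely populated cells.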

SPSS = Some petty Statistical Software?

