From
"Kaulisch, Marc" <kaulisch@forschungsinfo.de>

To
<statalist@hsphsun2.harvard.edu>

Subject
Re: st: Spss's aggregate vs stata's collapse.

Date
Wed, 13 Apr 2011 17:51:49 +0200

Uli,

SPSS looks inconsistent here. A look at the tutorial gives a clear insight into the mess:

"SPSS automatically rounds weighted frequencies to the nearest integer. This rounding is done by default on the total weighted frequency, not on individual weights." (p. 15)

"Rounding off these decimals is not an indifferent matter. Both 0.80 and 1.45 will be rounded to 1, which kills the very purpose of weighting. Suppose the weight of some cases is 0.30. If one case with weight 0.30 appears in a cell, the weighted (and rounded) total will be zero cases in that cell. If a second cell includes two cases from the same stratum, their weighted total will be 0.60, rounded to one case." (p. 16)

And in some procedures like CROSSTAB (Stata tab) you can turn off rounding the weights ;-)

Really strange, this...

Marc

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Ulrich Kohler
Sent: Wednesday, 13 April 2011 15:20
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Spss's aggregate vs stata's collapse.

On Wednesday, 13.04.2011, 12:13 +0100, Brendan Halpin wrote:
> On Wed, Apr 13 2011, Amadou DIALLO wrote:
>
> > Brendan, Uli,
> > Thanks for answers. Yes, it has to do with weights. Removing it
> > yields same results. Apparently SPSS rounds non-integer weight to
> > the nearest integer (the total weighted frequency, not individual weights (sic!):
> > www.spsstools.net/Tutorials/WEIGHTING.pdf
>
> SPSS is doing the wrong thing here, then.
>
> > I've tried Brendan's solution but this is not working. So far, I
> > can't duplicate results and am stuck. Will continue checking.
>
> If you really need to duplicate the results, you need to replicate
> SPSS's "error". It may be enough to round the weight yourself.

I know that this is not an SPSS list, however I'm still puzzled about what "rounding to the nearest integer" here really means.
If a sampling weight has been rescaled such that the sum of weights is equal to the number of observations, there will be quite a number of weights below 0.5. Are they "rounded" to 0 then, meaning they are dropped from the analysis? Or is zero not an integer value? Or do we use the geometric mean or harmonic mean between two subsequent numbers as the threshold for rounding, or what?

SPSS = Some petty Statistical Software?

Uli

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
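
To make the two readings of "rounding to the nearest integer" discussed in this thread concrete, here is a minimal Python sketch. It is not SPSS's actual implementation: the function names are hypothetical, the half-up tie rule is an assumption, and the rescaled weights at the end are made-up illustration data. It only reproduces the arithmetic quoted from the tutorial (0.30 and 0.60 per cell, 0.80 and 1.45) and contrasts rounding the cell total with rounding each weight.

```python
import math


def round_half_up(x):
    # Nearest integer, with ties going up (assumed tie rule, not confirmed for SPSS).
    return math.floor(x + 0.5)


def cell_count_round_total(weights):
    # Reading 1: sum the weights in a cell, then round the total
    # (what the quoted tutorial says happens by default).
    return round_half_up(sum(weights))


def cell_count_round_each(weights):
    # Reading 2: round every individual weight first, then sum.
    return sum(round_half_up(w) for w in weights)


# The tutorial's examples: one case vs. two cases with weight 0.30 in a cell.
print(cell_count_round_total([0.30]))        # 0
print(cell_count_round_total([0.30, 0.30]))  # 1
print(round_half_up(0.80), round_half_up(1.45))  # 1 1

# Hypothetical weights rescaled so that they sum to the number of observations (5).
rescaled = [0.4, 0.4, 0.4, 0.4, 3.4]
print(cell_count_round_total(rescaled))  # 5: all cases contribute to the total
print(cell_count_round_each(rescaled))   # 3: the four weights below 0.5 vanish
```

Under the first reading, weights below 0.5 are not dropped as such; they only disappear when the total of a cell happens to fall below 0.5. Under the second reading, every case with a weight below 0.5 would effectively be removed from the analysis.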

