"Martin Weiss" <martin.weiss1@gmx.de>

<statalist@hsphsun2.harvard.edu>

AW: AW: AW: st: AW: Diffrence between sum and tabsum

Tue, 15 Sep 2009 15:30:17 +0200

<>

" I think -tabsum- should produce the same solution as summarize"

Maybe this effect is related both to the "close-to-zero" thing and to the
OS issue. As both commands are built-in, it is difficult to look under
the hood. Still, I hope you get the same answer from this more
straightforward example. I certainly do...

*************
sysuse auto, clear
bys for: su mpg
tab for, sum(mpg)
*************

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Ulrich Kohler
Sent: Tuesday, 15 September 2009 15:22
To: statalist@hsphsun2.harvard.edu
Subject: Re: AW: AW: st: AW: Diffrence between sum and tabsum

I asked a colleague of mine to run the do-file. He got the same value as
you. Seemingly, you have the Windows solution, while I have the Linux
solution. I would have expected both to be the same.

Leaving that aside, I think -tabsum- should produce the same solution as
summarize, which is zero (or something that is so close to zero that
Stata doesn't see the difference).

uli

On Tuesday, 15.09.2009, at 14:54 +0200, Martin Weiss wrote:
> <>
>
> For instance, the line
>
> *************
> tab dummy, sum(firstweight)
> *************
>
> returns
>
>             Summary of firstweight
>       dummy        Mean   Std. Dev.       Freq.
>
>           1   13949.146   .00021143           3
>
>       Total   13949.146   .00021143           3
>
> for me. The sd differs from yours...
>
> HTH
> Martin
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Ulrich Kohler
> Sent: Tuesday, 15 September 2009 14:43
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: AW: st: AW: Diffrence between sum and tabsum
>
> On Tuesday, 15.09.2009, at 13:45 +0200, Martin Weiss wrote:
> > <>
> >
> > Also note http://www.stata.com/support/faqs/data/float.html
>
> Well, yes, I know that. But I still don't think that this satisfactorily
> explains the values.
>
> > I get answers different from yours, both in Stata 10.1 and 11,
>
> What do you mean by getting different answers? Do you get a zero?
>
> > possibly b/c your dataset is not created by the code that you showed in
> > your reply?
> > I assume this is the case as your -list- command shows the "dummy"
> > although it is -generate-d in the following line...
>
> The dataset is created as described. I just produced the output of
> -list- later and copied it to a place where I (mistakenly) thought it
> would make things clearer.
>
> Here is one more attempt at that. The first lines of the following
> do-file reproduce the behavior one more time. The second part calculates
> the standard deviation "by hand", showing that one could circumvent the
> problem described in the FAQ. The third part changes one bit in the
> hand-made solution, which brings it quite close to the -tabsum-
> solution.
>
> However, I had expected that -tabsum- works like the second part.
>
> ---------------------------------------------------------11.do
> clear
> set obs 3
> gen byte coreweight1968 = 18
> gen double firstweight = coreweight1968 * 60468000/78028
> gen dummy = 1
> tab dummy, sum(firstweight)
>
> gen double sum = sum(firstweight)
> gen double mean = sum[_N]/3
> gen double diff = sum((firstweight - mean)^2)
> di %17.16f sqrt(diff[_N]/2)
>
> drop sum-diff
> gen float sum = sum(firstweight)
> gen double mean = sum[_N]/3
> gen double diff = sum((firstweight - mean)^2)
> di %17.16f sqrt(diff[_N]/2)
> ----------------------------------------------------------------
>
> Here is the output of the do-file without any modifications.
>
> . do 11
>
> . clear
>
> . set obs 3
> obs was 0, now 3
>
> . gen byte coreweight1968 = 18
>
> . gen double firstweight = coreweight1968 * 60468000/78028
>
> . gen dummy = 1
>
> . tab dummy, sum(firstweight)
>
>             |       Summary of firstweight
>       dummy |        Mean   Std. Dev.       Freq.
> ------------+------------------------------------
>           1 |   13949.146   .00024815           3
> ------------+------------------------------------
>       Total |   13949.146   .00024815           3
>
> .
> . gen double sum = sum(firstweight)
>
> . gen double mean = sum[_N]/3
>
> . gen double diff = sum((firstweight - mean)^2)
>
> . di %17.16f sqrt(diff[_N]/2)
> 0.0000000000022278
>
> .
> . drop sum-diff
>
> . gen float sum = sum(firstweight)
>
> . gen double mean = sum[_N]/3
>
> . gen double diff = sum((firstweight - mean)^2)
>
> . di %17.16f sqrt(diff[_N]/2)
> 0.0007678068963647
>
> .
> end of do-file

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
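For readers without Stata at hand, the second and third parts of the
do-file can be re-created roughly in Python. This is only a sketch under
stated assumptions: it assumes that storing the total as a Stata float
amounts to rounding the double total to an IEEE 4-byte float, and the
helper names (to_float32, running_total, hand_sd) are made up for this
illustration. It says nothing about how -tabsum- computes its result
internally.

```python
import math
import struct


def to_float32(value):
    """Round an IEEE double to the nearest 4-byte float (assumed to mimic
    what happens when Stata stores a value in a float variable)."""
    return struct.unpack("f", struct.pack("f", value))[0]


def running_total(values):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


def hand_sd(values, total):
    """Sample standard deviation "by hand" from a precomputed total."""
    n = len(values)
    mean = total / n
    squared_dev = running_total([(v - mean) ** 2 for v in values])
    return math.sqrt(squared_dev / (n - 1))


# Same data as the do-file: three identical observations held as doubles.
firstweight = [18 * 60468000 / 78028] * 3

total_double = running_total(firstweight)   # like: gen double sum = sum(firstweight)
total_float = to_float32(total_double)      # like: gen float sum = sum(firstweight)

sd_double = hand_sd(firstweight, total_double)
sd_float = hand_sd(firstweight, total_float)

print(f"{sd_double:.16f}")
print(f"{sd_float:.16f}")
```

With the total kept in double precision the result should be zero or a
few units in the twelfth decimal place. With the total rounded to float
it should land near the 0.00077 printed by the third part of the
do-file: the float grid near 41847 has a spacing of 1/256, so the mean
is off by about 0.0006 and every deviation inherits that error.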

