  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   13.0   Copyright 1985-2013 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)

3-user Stata compute server perpetual license:
       Serial number:  999
         Licensed to:  Brian Poi
                       StataCorp LP

Notes:
      1.  Command line editing disabled
      2.  Stata running in batch mode

running /home/bpp/bin/profile.do ...

. do sl9b.do

.
. /* NIST StRD benchmark from http://www.nist.gov/itl/div898/strd/
> ***     MODIFIED by William Gould, StataCorp.
> ***
> ***     The dependent variable in the original test had values such as
> ***     1000000000000.4, 1000000000000.3, etc.  These numbers cannot be
> ***     stored on a binary computer with more than 4 digits of accuracy
> ***     after the decimal point.  E.g., in double precision,
> ***
> ***          1000000000000.4 - 1000000000000 = .40002441...
> ***
> ***     That is, you might enter 1000000000000.4, but a double-precision
> ***     binary computer actually stores the number
> ***     1000000000000.40002441...
> ***
> ***     Thus, even a perfectly accurate ANOVA routine could not obtain
> ***     the results the authors intended because, at the instant the data
> ***     was stored, numbers were rounded.  The test, as given, amounts to a
> ***     test of whether data is being stored in better than binary double
> ***     precision.
> ***
> ***     The intention of the test, I believe, was to determine if the ANOVA
> ***     routine could deal with numbers that varied only in their trailing
> ***     digits.
> ***
> ***     Thus, the test is modified as follows:
> ***
> ***          1)  y is multiplied by 10.  What was previously 1000000000000.4
> ***              now becomes 10000000000004, a number that a digital
> ***              computer can store exactly in double precision.
> ***
> ***          2)  All results should now be as the authors originally
> ***              expected, except that sums of squares of y will be
> ***              multiplied by 100 and the root mean square error will be
> ***              multiplied by 10.
> ***
> ***     The remaining comments below, from the NIST data, are not modified.
>
>
> ANOVA
>
> Difficulty=Higher  n_i=2001 k=9  Generated
>
> Dataset Name:  Simon-Lesage9   (Simon-Lesage9.dat)
>
>
> Procedure:     Analysis of Variance
>
>
> Reference:     Simon, Stephen D. and Lesage, James P. (1989).
>                "Assessing the Accuracy of ANOVA Calculations in
>                Statistical Software".
>                Computational Statistics & Data Analysis, 8, pp. 325-332.
>
>
> Data:          1 Factor
>                9 Treatments
>                2001 Replicates/Cell
>                18009 Observations
>                13 Constant Leading Digits
>                Higher Level of Difficulty
>                Generated Data
>
>
> Model:         10 Parameters (mu,tau_1, ... , tau_9)
>                y_{ij} = mu + tau_i + epsilon_{ij}
>
>
> Certified Values:
>
> Source of                   Sums of               Mean
> Variation           df      Squares              Squares           F Statistic
>
> Between Treatment     8 1.60080000000000E+02 2.00100000000000E+01 2.00100000000000E+03
> Within Treatment  18000 1.80000000000000E+02 1.00000000000000E-02
>
>                   Certified R-Squared 4.70712773465067E-01
>
>                   Certified Residual
>                   Standard Deviation  1.00000000000000E-01
> */
.
. clear

.
. scalar N       = 18009

. scalar df_r    = 18000

. scalar df_m    = 8

.
. scalar mss     = 16008

. scalar F       = 2001

. scalar rss     = 18000

. scalar r2      = 4.70712773465067E-01

. scalar rmse    = 1

.
. qui input byte treat double resp

.
.
. anova resp treat

                           Number of obs =   18009     R-squared     =  0.4707
                           Root MSE      =       1     Adj R-squared =  0.4705

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |       16008     8        2001    2001.00     0.0000
                         |
                   treat |       16008     8        2001    2001.00     0.0000
                         |
                Residual |       18000 18000           1
              -----------+----------------------------------------------------
                   Total |       34008 18008    1.888494

.
. assert N    == e(N)

. assert df_r == e(df_r)

. assert df_m == e(df_m)

.
. lrecomp e(F) F e(rmse) rmse e(r2) r2 e(mss) mss e(rss) rss

e(F)
e(rmse)
e(r2)
e(mss)
e(rss)

.
end of do-file
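
The rounding argument in the comment at the top of this log can be checked outside Stata. The following is a minimal sketch in Python rather than Stata, on the assumption that both use IEEE 754 double precision for these values. It is not part of sl9b.do, and the names original, scaled, and stored_value are hypothetical, chosen only for this illustration.

```python
from decimal import Decimal


def stored_value(x: float) -> Decimal:
    """Return the exact decimal expansion of the double that holds x."""
    return Decimal(x)


# A response value as given in the original NIST data (illustrative name).
original = 1000000000000.4
# The same value after y is multiplied by 10, as in the modified test.
scaled = 10000000000004.0

# The double nearest to the original value carries a rounding error.
print(stored_value(original))        # 1000000000000.4000244140625
print(original - 1000000000000.0)    # 0.4000244140625

# The scaled value is an integer below 2**53, so it is stored exactly.
print(stored_value(scaled))          # 10000000000004
print(scaled < 2**53 and scaled.is_integer())
```

The first two printed values correspond to the .40002441... figure quoted in the comment. The last two show why multiplying y by 10 removes the storage error: the scaled value is an integer well inside the range a double represents exactly.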