
- **Explanation**
- **Summary of results**
- **Certification results: univariate summary statistics**
- **Certification results: linear regression**
- **Certification results: analysis of variance**
- **Certification results: nonlinear regression**

The National Institute of Standards and Technology (NIST) writes,

In response to industrial concerns about the numerical accuracy of computations from statistical software, the Statistical Engineering and Mathematical and Computational Sciences Divisions of NIST's Information Technology Laboratory are providing datasets with certified values for a variety of statistical methods.

These datasets are known as the NIST StRD—Standard Reference Data.
See the **NIST StRD web page** for detailed descriptions of these datasets and tests.

The results of running these tests on Stata are presented below.

In reporting comparisons, it is popular to report the LRE—the log
relative error. Let *c* represent a calculated result and *t* the
answer supplied by NIST. The formal definition of this comparison is

- LRE = min(15, -log10(|*c* - *t*| / |*t*|)) if *t* ≠ 0
- LRE = min(15, -log10(|*c* - *t*|)) otherwise

The result of this calculation is then called "Digits of Accuracy" or, more precisely, "Decimal Digits of Accuracy"; it counts the number of digits in common with the true value (higher values are obviously better). Note that LRE cannot exceed 15.
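The definition above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used to produce the results on this page; the function name `lre` and the `cap` parameter are assumptions introduced here. The relative error is taken against |*t*| so that the logarithm stays defined when the certified value is negative.

```python
import math

def lre(c: float, t: float, cap: float = 15.0) -> float:
    """Log relative error of a computed value c against a certified value t."""
    if c == t:
        # Exact agreement: -log10(0) is infinite, so the cap applies.
        return cap
    if t != 0:
        err = abs(c - t) / abs(t)   # relative error when t is nonzero
    else:
        err = abs(c - t)            # absolute error when t is zero
    return min(cap, -math.log10(err))

# A computed value that differs from the certified one in the fourth
# significant digit scores roughly 3 digits of accuracy.
print(lre(1.001, 1.0))
```

As in the definition, the result is capped at 15 but is not floored, so a badly wrong value can produce a negative LRE.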

Results were obtained May 2, 2005, running Stata 9 for Linux (console version) on a computer with an AMD Opteron processor and the Fedora Core 2 Linux operating system. Results will differ slightly on other platforms because of compiler and hardware differences; Stata runs the same numerical code on all platforms.

**Univariate summary statistics:** Stata completed all tests. Means were estimated with never fewer than 15 digits of accuracy. Standard deviations averaged 13.3 correct digits, ranging from 8.3 to 15 digits. The lag-1 autocorrelation averaged 13.8 correct digits, ranging from 10.7 to 15 digits.

**Linear regression:** Stata completed all tests except one, the Filippelli test.

For the other tests, coefficients averaged 10.2 correct digits and never had fewer than 6.4 correct digits. Standard errors averaged 13.2 correct digits (minimum 10.8), and residual sums of squares averaged 14.3 correct digits (minimum 12.7).

In the Filippelli test, Stata found two coefficients so collinear that it dropped them from the analysis. Most other statistical software packages have done the same thing, and most authors have interpreted this result as acceptable for this test.

**Analysis of variance:** Stata completed all tests. The F statistic averaged 12.8 correct digits and never had fewer than 10.2 correct digits.

The above results include a correction made by us to three of the tests. An error in the construction of these three tests makes ANOVA routines implemented in binary double precision appear less precise than they are. The data, as originally presented, are accurate to only a few digits with the result that F statistics can be calculated only to a few digits. The correction is described below.

**Nonlinear regression:** Stata completed all tests. Coefficients averaged 7.8 correct digits and never had fewer than 4.7 correct digits. Standard errors averaged 5.8 correct digits and never had fewer than 3.3 correct digits. Residual sums of squares averaged 10.9 correct digits and never had fewer than 3.0 correct digits.

Detailed results for each of the tests are provided below.

**Certification results: univariate summary statistics** (Stata digits of accuracy)

| Test | Difficulty | Mean | S.D. | Lag-1 autocorr. |
|---|---|---|---|---|
| PiDigits | lower | 15.0 | 15.0 | 14.9 |
| Lottery | lower | 15.0 | 15.0 | 15.0 |
| Lew | lower | 15.0 | 15.0 | 14.8 |
| Mavro | lower | 15.0 | 13.1 | 13.7 |
| Michelson | lower | 15.0 | 13.8 | 13.4 |
| NumAcc-1 | lower | 15.0 | 15.0 | 15.0 |
| NumAcc-2 | average | 15.0 | 15.0 | 15.0 |
| NumAcc-3 | average | 15.0 | 9.5 | 11.9 |
| NumAcc-4 | higher | 15.0 | 8.3 | 10.7 |
| Average | | 15.0 | 13.3 | 13.8 |
| Minimum | | 15.0 | 8.3 | 10.7 |
| Maximum | | 15.0 | 15.0 | 15.0 |

**Certification results: linear regression** (Stata digits of accuracy)

| Test | Difficulty | Coef. | S.E. | RSS |
|---|---|---|---|---|
| Norris | lower | 12.8 | 13.5 | 13.3 |
| Pontius | lower | 11.5 | 13.0 | 12.7 |
| NoInt-1 | average | 14.7 | 15.0 | 14.9 |
| NoInt-2 | average | 15.0 | 15.0 | 14.7 |
| Filippelli | higher | no full solution (*) | | |
| Longley | higher | 12.1 | 12.9 | 13.2 |
| Wampler-1 | higher | 6.9 | 15.0 | 15.0 |
| Wampler-2 | higher | 9.7 | 15.0 | 15.0 |
| Wampler-3 | higher | 6.5 | 10.8 | 14.1 |
| Wampler-4 | higher | 6.5 | 10.8 | 15.0 |
| Wampler-5 | higher | 6.4 | 10.8 | 15.0 |
| Average | | 10.2 | 13.2 | 14.3 |
| Minimum | | 6.4 | 10.8 | 12.7 |
| Maximum | | 15.0 | 15.0 | 15.0 |

Each test involved multiple independent variables. Reported under Coef. and S.E. is the minimum LRE for all regressors, including the intercept, if any. RSS reports the LRE for the residual (error) sums of squares.

(*) Filippelli test: Stata found the variables so collinear that it dropped
two of them—that is, it set two coefficients and standard errors to
zero. The resulting estimates still fit the data well. Most other
statistical software packages have done the same thing, and most authors
have interpreted this result as acceptable for this test. Stata has an
**orthpoly** command that can do this problem, but it would not occur to
most users to use it, and transforming results back to the metric of the
problem requires an extra statement. However, if that command is used, the
LRE for the coefficients is 8.4 and the LRE for the RSS is 8.5.

**Certification results: analysis of variance** (Stata digits of accuracy)

| Test | Difficulty | F |
|---|---|---|
| Si Resistivity | lower | 13.1 |
| Simon-Lesage 1 | lower | 14.9 |
| Simon-Lesage 2 | lower | 13.7 |
| Simon-Lesage 3 | lower | 13.1 |
| Ag Atomic Wt | average | 10.2 |
| Simon-Lesage 4 | average | 10.4 |
| Simon-Lesage 5 | average | 10.2 |
| Simon-Lesage 6 | average | 10.2 |
| Simon-Lesage 7 | higher | 4.4 (*) |
| 7b | higher | 15.0 (*) |
| Simon-Lesage 8 | higher | 4.2 (*) |
| 8b | higher | 15.0 (*) |
| Simon-Lesage 9 | higher | 4.2 (*) |
| 9b | higher | 15.0 (*) |
| Average excluding S-L 7, 8, 9 | | 12.8 |
| Minimum | | 10.2 |
| Maximum | | 15.0 |

(*) **Tests Simon–Lesage 7b through 9b** are a variation developed by
Stata on tests Simon–Lesage 7 through 9. To our knowledge, no package
that stores and processes data in binary double precision has ever done
better than 4.6 on these tests, and that is because it is not possible to do
better; the problem is with the test, not the packages being tested. The
difficulty is that the data become different from what the authors
intended the instant they are stored on a double-precision binary computer.
The test uses y values such as 1,000,000,000,000.4, but that value
immediately becomes 1,000,000,000,000.40002441... because of how computers
store numbers. We strongly suspect that the answer Stata produces, and the
answers produced by other packages, are correct given the data stored.

Tests Simon–Lesage 7b through 9b are modifications of Simon–Lesage 7 through 9, the difference being that the data are multiplied by 10 before being input, so 1,000,000,000,000.4 becomes 10,000,000,000,004, a number that can be stored with perfect accuracy. The test is then carried through, the question being whether the ANOVA routine can deal with data that vary only in the trailing digits.
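The storage effect described above can be checked on any machine that uses IEEE 754 binary doubles. The following Python sketch is an illustration only and has nothing to do with how Stata stores data internally; the variable names are assumptions introduced here. It expands the stored double exactly and compares it with the value as written in the test data.

```python
from decimal import Decimal
from fractions import Fraction

intended = Fraction("1000000000000.4")   # the value as written in the test data
stored = 1000000000000.4                 # the nearest binary double to that value

# Decimal(float) expands the stored double exactly, with no rounding.
print(Decimal(stored))                   # 1000000000000.4000244140625

# Representation error introduced before any ANOVA arithmetic takes place.
storage_error = Fraction(stored) - intended
print(storage_error)                     # 1/40960

# After multiplying by 10 the value is an integer below 2**53,
# so a binary double holds it exactly.
scaled = 10000000000004.0
scaled_is_exact = Fraction(scaled) == Fraction(10000000000004)
print(scaled_is_exact)                   # True
```

The stored value is off by 1/40960, about 0.0000244, which is small relative to 10^12 but not small relative to differences between observations that appear only in the trailing digits.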

**Certification results: nonlinear regression** (Stata digits of accuracy)

| Test | Difficulty | Coef. | S.E. | RSS |
|---|---|---|---|---|
| Misra 1a | lower | 9.4 | 6.4 | 10.5 |
| Chwirut 2 | lower | 8.0 | 6.3 | 11.2 |
| Chwirut 1 | lower | 7.6 | 6.3 | 11.4 |
| Lanczos 3 | lower | 7.2 | 6.0 | 10.6 |
| Gauss 1 | lower | 8.5 | 6.3 | 11.6 |
| Gauss 2 | lower | 8.2 | 5.9 | 10.6 |
| Daniel Wood | lower | 8.6 | 6.2 | 11.7 |
| Misra 1b | lower | 9.9 | 6.5 | 11.3 |
| Kirby 2 | average | 8.0 | 6.3 | 11.6 |
| Hahn 1 | average | 7.1 | 5.1 | 10.6 |
| Nelson | average | 7.1 | 5.2 | 10.9 |
| MGH 17 | average | (7.0) | (6.1) | (11.5) |
| Lanczos 1 | average | 10.6 | 3.3 | 3.0 |
| Lanczos 2 | average | 7.9 | 5.4 | 10.1 |
| Gauss 3 | average | 8.2 | 5.5 | 11.0 |
| Misra 1c | average | 9.7 | 6.5 | 11.1 |
| Misra 1d | average | 9.3 | 6.5 | 11.2 |
| Roszman 1 | average | 7.4 | 6.4 | 12.2 |
| ENSO | average | 4.7 | 5.3 | 11.3 |
| MGH 09 | higher | (7.0) | (6.5) | (11.6) |
| Thurber | higher | 6.5 | 5.4 | 11.3 |
| BoxBOD | higher | 7.3 | 6.7 | 10.4 |
| Ratkowsky 2 | higher | 7.6 | 6.0 | 11.8 |
| MGH 10 | higher | (7.7) | (4.7) | (11.4) |
| Eckerle4 | higher | (8.3) | (6.4) | (10.7) |
| Ratkowsky 3 | higher | (6.0) | (5.0) | (11.4) |
| Bennett 5 | higher | 6.4 | 5.9 | 11.0 |
| Average | | 7.8 | 5.8 | 10.9 |
| Minimum | | 4.7 | 3.3 | 3.0 |
| Maximum | | 10.6 | 6.7 | 12.2 |

Parentheses indicate that convergence could not be achieved with the first set of starting values and that the second set had to be used.

Each test involved multiple independent variables. Reported under Coef. and S.E. is the minimum LRE for all regressors, including the intercept, if any. RSS reports the LRE for the residual (error) sums of squares.