Why do I get different results when running an ml procedure on Stata/SE and
Stata/MP?
Title: Difference in calculations for Stata versions and processors
Author: Theresa Boswell, StataCorp
Date: January 2008
Users may see slightly different results across different versions or
flavors of Stata. The differences can appear when you use an estimation
command that calls the ml command, or when you use the ml command directly
in Stata/MP with different numbers of processors.
These differences are very small and can be ignored because,
statistically, the results do not differ. The possible
reasons for the slight differences are explained below.
Stata/SE vs. Stata/MP
Different versions of an application can produce slightly different results
on the same computer, even when you run the same command in each version.
Depending on factors such as the operating system
version, the processor in the computer, and the compiler used to produce the
application, numerical calculations may use different mathematics libraries
or may begin an algorithm at slightly different initial values. This may
result in very small differences in results from the ml command.
Stata/MP and number of processors
When more than one processor is used in Stata/MP, the computations for the
likelihood are split into pieces (one piece for each processor) and then are
added together at the end of the calculation on each iteration. Because of
round-off error, floating-point addition on a computer is not associative,
as addition is in mathematics. This may cause a slight difference in
results. For example, a1+a2+a3+a4 evaluated from left to right can produce a
different result from (a1+a2)+(a3+a4) in numerical computation. When you
change the number of processors used in Stata/MP, the likelihood is split
into a different number of pieces, and the order in which the pieces are
combined may also vary depending on which processor completes its
calculations first.
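The effect of grouping can be seen in a small sketch outside Stata. The Python code below is only an illustration: the function names running_sum and split_sum and the ten contributions of 0.1 are assumptions made for this example, and it does not reproduce how Stata/MP actually divides or combines its work.

```python
def running_sum(values):
    """Add values left to right in plain double precision.

    An explicit loop is used on purpose: Python's built-in sum() may apply
    compensated summation in recent versions, which would hide the effect.
    """
    total = 0.0
    for v in values:
        total += v
    return total


def split_sum(values, pieces):
    """Sum `values` in `pieces` contiguous chunks, then add the partial sums."""
    size = -(-len(values) // pieces)  # ceiling division
    partials = [running_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return running_sum(partials)


# Hypothetical per-observation contributions; the numbers are illustrative only.
contributions = [0.1] * 10

one_piece = split_sum(contributions, 1)   # stands in for a single processor
two_pieces = split_sum(contributions, 2)  # stands in for two processors

print(repr(one_piece), repr(two_pieces))
print("difference:", abs(one_piece - two_pieces))
```

The two totals are mathematically the same sum, yet they disagree in the last stored digit because the additions were grouped differently.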
To summarize, you should treat SE, MP with one processor, MP with two
processors, etc., as different machines—even if they are on the same
physical machine—when you want to reproduce results exactly.
When trying to reproduce results, you should use the same operating
system, application version, and processor type. This is particularly
important when using any estimation command that uses the ml command. If
this is not possible, you can tighten the convergence tolerance of the
maximization by specifying a value smaller than the default of 1e-5 in the
nrtolerance() option to aid in reproducibility.
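Why a tighter tolerance helps can be sketched with a toy maximizer. The Python code below is not Stata's ml; it is a hand-written Newton-Raphson for an intercept-only Poisson model, with a stopping rule of the scaled-gradient form g*inv(H)*g' < tolerance assumed here for illustration. The data, the function name newton_poisson_mean, and the two starting values (standing in for two setups that begin at slightly different initial values) are all hypothetical.

```python
import math


def newton_poisson_mean(y, start, tol, max_iter=50):
    """Toy Newton-Raphson for an intercept-only Poisson model with log link."""
    n = len(y)
    total = 0.0
    for v in y:
        total += v
    b = start
    for _ in range(max_iter):
        mu = math.exp(b)
        gradient = total - n * mu   # first derivative of the log likelihood
        neg_hessian = n * mu        # minus the second derivative (positive)
        # Scaled-gradient stopping rule (assumed form for this sketch).
        if gradient * gradient / neg_hessian < tol:
            return b
        b += gradient / neg_hessian
    raise RuntimeError("did not converge")


y = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]  # hypothetical counts

# Same data and model, two slightly different starting values.
gap_default = abs(newton_poisson_mean(y, 0.0, 1e-5)
                  - newton_poisson_mean(y, 0.05, 1e-5))
gap_tight = abs(newton_poisson_mean(y, 0.0, 1e-12)
                - newton_poisson_mean(y, 0.05, 1e-12))

print("gap at tolerance 1e-5 :", gap_default)
print("gap at tolerance 1e-12:", gap_tight)
```

With the looser tolerance, the two runs can stop at different distances from the maximum, so their estimates disagree in more digits. With the tighter tolerance, both runs are forced closer to the same point, and the remaining gap shrinks.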