# Re: st: [Mata newbie] optimize() vs -ml-

From: wgould@stata.com (William Gould, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: [Mata newbie] optimize() vs -ml-
Date: Wed, 17 Oct 2007 10:33:10 -0500

```
Antoine Terracol <terracol@univ-paris1.fr> did a timing comparison of Stata's
-ml- and Mata's -optimize()- on two different problems and reported
timings of

                Stata's          Mata's
                 -ml-         -optimize()-
-------------------------------------------
Problem 1         .64              .17       <- Mata faster

Problem 2         .36              .69       <- Stata faster
-------------------------------------------

> [...] if there was any general rule to decide whether a given likelihood
> maximisation problem will be more efficiently handled using Mata's
> optimize() or Stata's -ml- command ?

There is, and I have a lot to say about it.

Antoine helpfully included a log of his results allowing me to reproduce his
timings.  Along the way, Antoine also asked if (his words) his code was
badly written.  Antoine's code was well written.

------------

Yes, there is a way to tell at the outset whether -ml- or -optimize()-
will be faster.  Let's put aside the whys and wherefores for the
moment.  In the case where -optimize()- is not faster than -ml-, we will
be providing another Mata function, -moptimize()-, which will be faster
than -ml-.  Then we will rewrite -ml- in terms of -moptimize()-, so at
that point the performance will be the same and you can use whichever
appeals to you.

These performance enhancements will be made during the Stata 10 release.

--------------

Antoine's Problem 1 was, in -ml- jargon, a two-equation model.
Each equation had one parameter.  When the number of equations equals
the number of parameters, -optimize()- will be faster than -ml-.

Antoine's Problem 2 was, in -ml- jargon, a one-equation model.  The one
equation had 3 parameters.  When the number of parameters exceeds the
number of equations, -ml- will in general be faster than -optimize()-.

In the case where the number of parameters exceeds the number of
equations, one can perform a mathematical trick to reduce the
dimensionality of the problem.  The trick is to apply the chain rule
in the calculation of the derivatives.

Let's imagine we have the single equation x1*b1 + x2*b2 + x3*b3,
which has three parameters.  The log-likelihood function is then

f(x,b) = f(x1*b1 + x2*b2 + x3*b3)

Usually, in calculating the derivatives of a three-parameter function,
one must separately calculate df/db1, df/db2, and df/db3.  In the
above case, however, we can write,

I = x1*b1 + x2*b2 + x3*b3

and calculate one derivative, df/dI.  One can then obtain df/db1 as
(df/dI)*x1, df/db2 as (df/dI)*x2, and df/db3 as (df/dI)*x3.
This is a huge computational savings.  Rather than calculate three
derivatives numerically, we calculate only one, and then use a formula
to map that one derivative into the three we need.  The savings is even
greater when we get to calculating the second derivatives, where we
need to calculate 6 numerical derivatives in one case, and just one,
d2f/dI2, in the second.  From that one, we can calculate the six that
we need: the second derivative with respect to bi and bj is
(d2f/dI2)*xi*xj.

That is why -ml- makes you go to all the trouble of specifying
models rather than just estimating each parameter separately.  -ml-
uses that information to save subsequent calculation.

Mata's -optimize()- has no such features.  You do not need to specify
equations, but you don't get the time savings when there is a time
savings to be had.

Which is why we are writing -moptimize()-.  The -m- stands for model,
and -moptimize()- will make you specify the equations and parameters
just as -ml- does.

Now it turns out that -moptimize()- can be written in terms of
-optimize()-, which is why we wrote -optimize()- first.  -optimize()-
will do all the heavy lifting, and all -moptimize()- has to do is a
little bookkeeping to save in the calculations of derivatives.

By the way, you can already see from Antoine's timings that -moptimize()-
will be faster than -ml-.  In a single-equation, 3-parameter model,
-ml- ran only 0.69/0.36 = 1.92 times faster than -optimize()- even though
-optimize()- had to make over three times as many calculations.

-- Bill
wgould@stata.com
```
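The chain-rule trick described in the message can be sketched in Python with NumPy. This is only an illustration, not how -ml-, -optimize()-, or -moptimize()- are implemented. The logit log likelihood, the simulated data, the step sizes, and all function names are assumptions chosen for the example. The sketch compares a gradient obtained by differencing each of the three parameters separately with one obtained from a single numerical derivative with respect to the index I, and builds the second derivatives from a single numerical d2f/dI2.

```python
import numpy as np


def loglik_obs(index, y):
    """Per-observation log likelihood as a function of the index I.

    Assumed model for illustration: a logit, ln L = y*I - ln(1 + exp(I)).
    """
    return y * index - np.logaddexp(0.0, index)


def grad_per_parameter(b, X, y, h=1e-5):
    """Difference each parameter separately: 2*k evaluations of the likelihood."""
    g = np.zeros(len(b))
    for i in range(len(b)):
        step = np.zeros(len(b))
        step[i] = h
        up = loglik_obs(X @ (b + step), y).sum()
        down = loglik_obs(X @ (b - step), y).sum()
        g[i] = (up - down) / (2 * h)
    return g


def grad_via_index(b, X, y, h=1e-5):
    """Difference once with respect to I, then apply the chain rule.

    df/db_i = sum over observations of (df/dI) * x_i: 2 evaluations in total.
    """
    index = X @ b
    dfdI = (loglik_obs(index + h, y) - loglik_obs(index - h, y)) / (2 * h)
    return X.T @ dfdI


def hessian_via_index(b, X, y, h=1e-4):
    """One numerical second derivative d2f/dI2 yields every d2f/db_i db_j."""
    index = X @ b
    d2fdI2 = (
        loglik_obs(index + h, y)
        - 2 * loglik_obs(index, y)
        + loglik_obs(index - h, y)
    ) / h**2
    # d2f/db_i db_j = sum over observations of (d2f/dI2) * x_i * x_j
    return (X * d2fdI2[:, None]).T @ X


# Simulated data: 500 observations, one equation with three parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
b_true = np.array([0.5, -1.0, 0.25])
p_true = 1.0 / (1.0 + np.exp(-(X @ b_true)))
y = (rng.uniform(size=500) < p_true).astype(float)

b = np.zeros(3)
g_slow = grad_per_parameter(b, X, y)
g_fast = grad_via_index(b, X, y)
H = hessian_via_index(b, X, y)

print("per-parameter gradient:", g_slow)
print("chain-rule gradient:   ", g_fast)
print("chain-rule Hessian:\n", H)
```

With three parameters, the per-parameter gradient needs six likelihood evaluations and the index version needs two. The gap widens for the Hessian, which has six distinct elements here but comes from a single numerical second derivative.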