Statalist



Re: st: [Mata newbie] optimize() vs -ml-


From   [email protected] (William Gould, StataCorp LP)
To   [email protected]
Subject   Re: st: [Mata newbie] optimize() vs -ml-
Date   Wed, 17 Oct 2007 10:33:10 -0500

Antoine Terracol <[email protected]> did a timing comparison of Stata's
-ml- and Mata's -optimize()- on two different problems and reported
timings of

                    Stata's             Mata's
                      -ml-          -optimize()-
      -------------------------------------------
      Problem 1      .64                 .17       <- Mata faster 

      Problem 2      .36                 .69       <- Stata faster
      -------------------------------------------

which led him to ask 

> [...] if there was any general rule to decide wether a given likelihood
> maximisation problem will be more efficiently handled using Mata's
> optimize() or Stata's -ml- command ?

There is, and I have a lot to say about it.

Antoine helpfully included a log of his results allowing me to reproduce his
timings.  Along the way, Antoine also asked if (his words) his code was 
badly written.  Antoine's code was well written.


Short answer
------------

    Yes, there is a way to tell at the outset whether -ml- or -optimize()-
    will be faster.  Let's put aside the whys and wherefores for the
    moment.  In the case where -optimize()- is not faster than -ml-, we will
    be providing another Mata function, -moptimize()-, which will be faster
    than -ml-.  Then we will rewrite -ml- in terms of -moptimize()-, so at
    that point the performance will be the same and you can use whichever 
    appeals to you.

    These performance enhancements will be made during the Stata 10 release.


Lengthy answer
--------------

    Antoine's Problem 1 was, in -ml- jargon, a two-equation model.
    Each equation had one parameter.  When the number of equations equals
    the number of parameters, -optimize()- will be faster than -ml-.

    Antoine's Problem 2 was, in -ml- jargon, a one-equation model.  The one
    equation had 3 parameters.  When the number of parameters exceeds the
    number of equations, -ml- will in general be faster than -optimize()-.

    In the case where the number of parameters exceeds the number of 
    equations, one can perform a mathematical trick to reduce the 
    dimensionality of the problem.  The trick is to apply the chain rule
    in the calculation of the derivatives.

    Let's imagine we have the single equation x1*b1 + x2*b2 + x3*b3, 
    which has three parameters.  The log-likelihood function is then 

        f(x,b) = f(x1*b1 + x2*b2 + x3*b3)

    Usually, in calculating the derivatives of a three-parameter function, 
    one must separately calculate df/db1, df/db2, and df/db3.  In the 
    above case, however, we can write, 

             I = x1*b1 + x2*b2 + x3*b3

    and calculate one derivative, df/dI.  One can then obtain df/db1 as
    (df/dI)*x1, df/db2 as (df/dI)*x2, and df/db3 as (df/dI)*x3.
    This is a huge computational savings.  Rather than calculate three 
    derivatives numerically, we calculate only one, and then use a formula 
    to map that one derivative into the three we need.  The savings is even
    greater when we get to the second derivatives, where we need to 
    calculate six numerical derivatives in the first case and just one in 
    the second.  From that one, we can calculate the six that we need.
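
    The chain-rule trick above can be sketched in a few lines.  This is an
    illustration only, written in Python rather than Mata, and it is not
    how -ml- is actually coded.  The data, the logit log likelihood, and
    every function name below are assumptions made up for the example.

```python
import math

# Hypothetical toy data (not from the thread): rows are (x1, x2, x3), y is 0/1.
X = [(1.0, 0.5, -1.2), (1.0, -0.3, 0.8), (1.0, 1.5, 0.1), (1.0, -1.0, -0.4)]
y = [1, 0, 1, 0]
b = [0.2, -0.5, 0.3]

def index_of(row, beta):
    """The linear index I = x1*b1 + x2*b2 + x3*b3 for one observation."""
    return sum(xk * bk for xk, bk in zip(row, beta))

def f_obs(I, yi):
    """Logit log likelihood of one observation, written as a function of I."""
    s = I if yi == 1 else -I
    return -math.log1p(math.exp(-s))

def loglik(beta):
    return sum(f_obs(index_of(r, beta), yi) for r, yi in zip(X, y))

def grad_direct(beta, h=1e-5):
    """One central difference per parameter: 2 * 3 = 6 passes over the data."""
    g = []
    for k in range(len(beta)):
        up = list(beta)
        dn = list(beta)
        up[k] += h
        dn[k] -= h
        g.append((loglik(up) - loglik(dn)) / (2 * h))
    return g

def grad_chain(beta, h=1e-5):
    """One central difference in I (2 passes), then df/db_k = sum_i (df_i/dI) * x_ik."""
    g = [0.0] * len(beta)
    for row, yi in zip(X, y):
        I = index_of(row, beta)
        dfdI = (f_obs(I + h, yi) - f_obs(I - h, yi)) / (2 * h)
        for k, xk in enumerate(row):
            g[k] += dfdI * xk
    return g

def hess_chain(beta, h=1e-4):
    """One second difference in I, mapped to H_jk = sum_i (d2f_i/dI2) * x_ij * x_ik."""
    n = len(beta)
    H = [[0.0] * n for _ in range(n)]
    for row, yi in zip(X, y):
        I = index_of(row, beta)
        d2 = (f_obs(I + h, yi) - 2 * f_obs(I, yi) + f_obs(I - h, yi)) / (h * h)
        for j in range(n):
            for k in range(n):
                H[j][k] += d2 * row[j] * row[k]
    return H
```

    Calling grad_direct(b) and grad_chain(b) gives the same gradient up to
    numerical error, but the second differences the function only with
    respect to I.  hess_chain(b) fills all six distinct second derivatives
    from the single numerical quantity d2f/dI2.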

    That is why -ml- makes you go to all the trouble of specifying 
    models rather than just estimating each parameter separately.  -ml-
    uses that information to save subsequent calculation.

    Mata's -optimize()- has no such features.  You do not need to specify 
    equations, but you don't get the time savings when there is a time 
    savings to be had.

    Which is why we are writing -moptimize()-.  The -m- stands for model, 
    and -moptimize()- will make you specify the equations and parameters 
    just as -ml- does.  

    Now it turns out that -moptimize()- can be written in terms of 
    -optimize()-, which is why we wrote -optimize()- first.  -optimize()- 
    will do all the heavy lifting, and all -moptimize()- has to do is a 
    little bookkeeping to save in the calculations of derivatives.
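
    The layering described above might look roughly like the following.
    This is a hedged sketch in Python, not Mata, and it says nothing about
    how -moptimize()- is really implemented.  generic_maximize() is a
    stand-in for a general-purpose optimizer (here plain gradient ascent),
    make_model_evaluator() is the hypothetical bookkeeping layer, and the
    data and the unit-variance normal log likelihood are invented for the
    example.

```python
def generic_maximize(evaluator, start, step=0.1, iters=500):
    """Stand-in general optimizer: fixed-step gradient ascent.

    evaluator(params) must return (value, gradient).  It knows nothing
    about equations or covariates.
    """
    params = list(start)
    for _ in range(iters):
        _, grad = evaluator(params)
        params = [p + step * g for p, g in zip(params, grad)]
    return params

def make_model_evaluator(X, obs_loglik):
    """Bookkeeping layer for a single-equation model.

    obs_loglik(i, I) returns (f_i, df_i/dI) for observation i at index I.
    The wrapper forms I = x_i . beta and maps df_i/dI into the parameter
    gradient with the chain rule, so the user never differentiates with
    respect to the individual coefficients.
    """
    def evaluator(beta):
        value = 0.0
        grad = [0.0] * len(beta)
        for i, row in enumerate(X):
            I = sum(xk * bk for xk, bk in zip(row, beta))
            f, dfdI = obs_loglik(i, I)
            value += f
            for k, xk in enumerate(row):
                grad[k] += dfdI * xk
        return value, grad
    return evaluator

# Hypothetical data generated exactly by y = 1 + 2*x, with a constant column.
X = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
y = [-1.0, 1.0, 3.0, 5.0]

def normal_obs(i, I):
    """Unit-variance normal log likelihood (up to a constant) and its derivative in I."""
    r = y[i] - I
    return -0.5 * r * r, r

estimates = generic_maximize(make_model_evaluator(X, normal_obs), [0.0, 0.0])
```

    The optimizer does the iterating; the wrapper only turns one derivative
    per equation into one derivative per parameter.  A multi-equation
    version would keep one index, and one df/dI, per equation.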

    By the way, you can already see from Antoine's timings that -moptimize()-
    will be faster than -ml-.  In a single-equation, 3-parameter model, 
    -ml- ran only 0.69/0.36 = 1.92 times faster than -optimize()- even though 
    -optimize()- had to make over three times as many calculations.
    
-- Bill
[email protected]