Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

RE: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?

From: tshmak <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?
Date: Tue, 11 Jun 2013 12:41:34 +0800

```Thanks Gordon and Nick,

Both of you mentioned that given that Stata intelligently seeks out suitable starting values for ml type optimization, convergence to a non-global maximum should be unlikely. But in what sense is this "unlikely"? It may seem to you unlikely. Indeed, it also seems to me unlikely. But if a mathematician questions me over the use of these procedures, are there any solid results I can present him/her? (Note: My concern is only theoretical. I wasn't even motivated by any problematic data I'm having.) It appears to me that the answer is "no" at the moment - which isn't a problem. I wrote this question specifically to find that out.

Tim

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Gordon Hughes
Sent: 10 June 2013 19:33
To: [email protected]
Subject: Re: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?

No gradient search can guarantee that it has found a global maximum
unless either (a) the objective function is globally concave, or (b)
you have carried out some kind of extensive grid search on starting
values.  Some objective functions are known to be globally concave -
e.g. quadratic functions (least squares) or the logit model - but
many may not be.  The practical problem is that many likelihood
functions are degenerate for some values of the parameters, so that a
grid search over starting values may generate large numbers of failures.

As Nick points out, Stata's maximisation procedures (including -ml-)
contain many safeguards both to avoid pathological results and to
reduce the chances of converging to a local rather than a global
maximum.  In a particular case, it is usually sensible to start from a
restricted version of the model which is known to be globally concave.
That way it is likely, though not certain, that a gradient search which
starts from the global maximum of a restricted model will head in the
right direction when dealing with a less restricted version of the model.

Most Stata -ml- procedures adopt this strategy as it is much better
than, say, starting with a vector of 0's.  But you should always take
account of the specific features of the likelihood function to
improve the chances of finding a global maximum in the most general
case.  Partitioning the set of parameters and using a concentrated
likelihood function - i.e. multi-step estimation where some
parameters are estimated conditional on prior values of other
parameters - is a classic example of that approach.

Gordon Hughes
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
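
Gordon's point about starting values can be illustrated with a small sketch. It is written in Python rather than Stata, and none of it comes from the thread: the objective `f` is a made-up one-dimensional function with two local maxima (not a real likelihood), and `climb` and `multi_start` are hypothetical helpers. A plain fixed-step gradient ascent stands in for the more sophisticated optimizers that -ml- uses.

```python
def f(x):
    """Toy objective with two local maxima (an assumption, not a real likelihood)."""
    return -(x**2 - 1.0) ** 2 + 0.3 * x


def grad(x):
    """Analytic derivative of f."""
    return -4.0 * x * (x**2 - 1.0) + 0.3


def climb(start, step=0.02, tol=1e-10, max_iter=10_000):
    """Fixed-step gradient ascent from a single starting value."""
    x = start
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:  # gradient is effectively zero: a stationary point
            break
        x += step * g
    return x


def multi_start(starts):
    """Run the ascent from every starting value and keep the best end point."""
    candidates = [climb(s) for s in starts]
    return max(candidates, key=f)


# One unlucky starting value ends at the lower of the two peaks.
single = climb(-0.5)

# A coarse grid of starting values on [-2, 2] recovers the higher peak.
grid = [-2.0 + 0.5 * i for i in range(9)]
best = multi_start(grid)

print(f"single start: x = {single:.4f}, f(x) = {f(single):.4f}")
print(f"grid search:  x = {best:.4f}, f(x) = {f(best):.4f}")
```

Started from -0.5, the search settles on the peak near x = -1, while the grid of starting values finds the higher peak near x = 1. Both runs report a zero gradient at convergence, so the convergence check alone cannot tell a local maximum from the global one. The grid is cheap here only because the toy function is defined everywhere; as noted above, a real likelihood may be degenerate for many candidate starting values.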
