


RE: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?

From   tshmak
Subject   RE: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?
Date   Tue, 11 Jun 2013 12:41:34 +0800

Thanks Gordon and Nick, 

Both of you mentioned that, because Stata intelligently seeks out suitable starting values for ml-type optimization, convergence to a non-global maximum should be unlikely. But in what sense is this "unlikely"? It may seem unlikely to you. Indeed, it seems unlikely to me too. But if a mathematician questions my use of these procedures, are there any solid results I can present to him/her? (Note: My concern is only theoretical. It was not prompted by any problematic data of my own.) It appears to me that the answer is "no" at the moment, which isn't a problem. I wrote this question specifically to find that out.

But thanks for your comments.  


-----Original Message-----
From: Gordon Hughes
Sent: 10 June 2013 19:33
Subject: Re: st: Is it possible that Stata converges to a local maximum in maximum likelihood related procedures?

No gradient search can guarantee that it has found a global maximum 
unless either (a) the objective function is globally concave, or (b) 
you have carried out some kind of extensive grid search over starting 
values.  Some objective functions are known to be globally concave - 
e.g. quadratic functions (least squares) or the logit model - but 
many may not be.  The practical problem is that many likelihood 
functions are degenerate for some values of the parameters, so that a 
grid search over starting values may generate large numbers of failures.
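
To make the point about starting values concrete, here is a minimal sketch in Python (standard library only). The objective is a made-up one-parameter function with two local maxima, standing in for a log likelihood; the names objective, gradient, climb and multistart are hypothetical, and the fixed-step gradient ascent is only an illustration, not what Stata's -ml- does internally.

```python
def objective(x):
    """Toy objective with two local maxima (a stand-in for a log likelihood)."""
    return -(x**2 - 1.0)**2 + 0.3 * x


def gradient(x):
    """Analytic first derivative of the toy objective."""
    return -4.0 * x * (x**2 - 1.0) + 0.3


def climb(x0, step=0.01, tol=1e-10, max_iter=100_000):
    """Plain fixed-step gradient ascent from the starting value x0."""
    x = x0
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x += step * g
    return x


def multistart(starts):
    """Run the ascent from every starting value and keep the best endpoint."""
    endpoints = [climb(x0) for x0 in starts]
    return max(endpoints, key=objective)


# A single start on the left ends at the inferior local maximum (near x = -0.96).
single = climb(-2.0)

# A coarse grid of starting values over [-3, 3] recovers the better maximum
# (near x = 1.04), because at least one start lies in its basin of attraction.
starts = [-3.0 + 0.5 * i for i in range(13)]
best = multistart(starts)

print(single, objective(single))
print(best, objective(best))
```

The grid only helps if some starting value falls in the basin of the global maximum, so a finer or wider grid improves the odds without ever turning them into a guarantee.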

As Nick points out, Stata's maximisation procedures (including -ml-) 
contain many safeguards both to avoid pathological results and to 
reduce the chances of converging to a local rather than a global 
maximum, but both can still occur.  If you are worried about this in 
a particular case, it is usually sensible to start from a restricted 
version of the model which is known to be globally concave.  That way 
it is likely, though not certain, that a gradient search which starts 
from the global maximum of a restricted model will head in the right 
direction when dealing with a less restricted version of the model.
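
The mechanics of starting from a restricted model can be sketched as follows, again in Python with only the standard library. The data are invented, and the model is a logit with one covariate, which is itself globally concave, so this shows only how the starting values are passed along, not a case where the strategy rescues the search. The restricted model is the constant-only logit, whose maximum is available in closed form. All function names are hypothetical.

```python
import math

# Invented data: one covariate x and a binary outcome y (not perfectly separated).
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]


def loglik(b0, b1):
    """Logit log likelihood for intercept b0 and slope b1."""
    return sum(yi * (b0 + b1 * xi) - math.log1p(math.exp(b0 + b1 * xi))
               for xi, yi in zip(x, y))


def score_and_info(b0, b1):
    """Gradient and (negative) Hessian of the log likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)


def newton(b0, b1, tol=1e-10, max_iter=50):
    """Newton-Raphson with simple step halving, started from (b0, b1)."""
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = score_and_info(b0, b1)
        if max(abs(g0), abs(g1)) < tol:
            break
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        t = 1.0
        # Halve the step while it would lower the log likelihood.
        while loglik(b0 + t * d0, b1 + t * d1) < loglik(b0, b1) and t > 1e-8:
            t /= 2.0
        b0, b1 = b0 + t * d0, b1 + t * d1
    return b0, b1


# Restricted model (slope fixed at 0): the MLE of the intercept is logit(mean of y).
ybar = sum(y) / len(y)
start = (math.log(ybar / (1.0 - ybar)), 0.0)

# The unrestricted fit starts from the restricted maximum instead of a vector of 0's.
fit = newton(*start)
print(start, loglik(*start))
print(fit, loglik(*fit))
```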

Most Stata -ml- procedures adopt this strategy as it is much better 
than, say, starting with a vector of 0's.  But you should always take 
account of the specific features of the likelihood function to 
improve the chances of finding a global maximum in the most general 
case.  Partitioning the set of parameters and using a concentrated 
likelihood function - i.e. multi-step estimation where some 
parameters are estimated conditional on prior values of other 
parameters - is a classic example of that approach.
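
As an illustration of concentrating a likelihood, consider a hypothetical model y = a*exp(-k*t) + e with Gaussian errors. For any fixed k, the best a is a least-squares coefficient and the best error variance is the mean squared residual, so both can be substituted out, leaving a one-dimensional problem in k that can be searched on a grid. The Python sketch below uses invented data, and the names a_given_k, concentrated_loglik and the grid bounds are assumptions chosen for the example.

```python
import math

# Invented data: roughly 2*exp(-0.5*t) plus small fixed perturbations.
t = [float(i) for i in range(10)]
noise = [0.05, -0.03, 0.02, -0.01, 0.04, -0.02, 0.01, -0.03, 0.02, -0.01]
y = [2.0 * math.exp(-0.5 * ti) + ei for ti, ei in zip(t, noise)]
n = len(y)


def a_given_k(k):
    """Least-squares estimate of a when k is held fixed (closed form)."""
    e = [math.exp(-k * ti) for ti in t]
    return sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)


def concentrated_loglik(k):
    """Gaussian log likelihood with a and sigma^2 replaced by their optima given k."""
    a = a_given_k(k)
    ssr = sum((yi - a * math.exp(-k * ti)) ** 2 for yi, ti in zip(y, t))
    sigma2 = ssr / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


# One-dimensional grid search over k only; a and sigma^2 follow from k.
grid = [0.01 * i for i in range(1, 301)]
k_hat = max(grid, key=concentrated_loglik)
a_hat = a_given_k(k_hat)
print(k_hat, a_hat, concentrated_loglik(k_hat))
```

Because the search is over a single parameter, an exhaustive grid is cheap here, which is much of the appeal of partitioning the parameters in this way.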

Gordon Hughes

