help maximize
-------------------------------------------------------------------------------
Title
[R] maximize -- Details of iterative maximization
Syntax
Maximum likelihood optimization
mle_cmd ... [, options]
Set default maximum iterations
set maxiter # [, permanently]
options                    description
-------------------------------------------------------------------------
difficult                  use a different stepping algorithm in
                           nonconcave regions
technique(algorithm_spec)  maximization technique
iterate(#)                 perform maximum of # iterations; default is
                           iterate(16000)
[no]log                    display an iteration log of the log
                           likelihood; typically, the default
trace                      display current parameter vector in
                           iteration log
gradient                   display current gradient vector in iteration
                           log
showstep                   report steps within an iteration log
hessian                    display current negative Hessian matrix in
                           iteration log
showtolerance              report the calculated result that is
                           compared to the effective convergence
                           criterion
tolerance(#)               tolerance for the coefficient vector; see
                           Maximization options for the defaults
ltolerance(#)              tolerance for the log likelihood; see
                           Maximization options for the defaults
nrtolerance(#)             tolerance for the scaled gradient; see
                           Maximization options for the defaults
nonrtolerance              ignore the nrtolerance() option
from(init_specs)           initial values for the coefficients
-------------------------------------------------------------------------
where algorithm_spec is
    algorithm [ # [ algorithm [#] ] ... ]
algorithm is {nr | bhhh | dfp | bfgs}
and init_specs is one of
    matname [, skip copy ]
    { [eqname:]name = # | /eqname = # } [...]
    # [# ...], copy
Description
All Stata commands that maximize likelihood functions do so using
moptimize() and optimize(); see Methods and formulas in [R] maximize.
Commands use the Newton-Raphson method with step halving and special
fixups when they encounter nonconcave regions of the likelihood. For
details, see [M-5] moptimize and [M-5] optimize. For more information
about programming maximum likelihood estimators in ado-files, see [R] ml
and Gould, Pitblado, and Sribney (2006).
set maxiter specifies the default maximum number of iterations for
estimation commands that iterate. The initial value is 16000, and # can
be 0 to 16000. To change the maximum number of iterations performed by a
particular estimation command, you need not reset maxiter; you can
specify the iterate(#) option. When iterate(#) is not specified, the
maxiter value is used.
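The two ways of capping iterations can be sketched as follows. The auto dataset and the poisson model are assumptions chosen only for illustration; any estimation command that iterates could stand in for them.

```stata
* Illustrative sketch; the dataset and model are assumptions
sysuse auto, clear

* Show the session-wide cap currently in effect
display c(maxiter)

* Lower the cap for the current session only
set maxiter 100

* Override the cap for one command; maxiter itself is left unchanged
poisson mpg weight foreign, iterate(25)
display c(maxiter)

* Restore the documented initial value
set maxiter 16000
```

Using iterate() keeps the change local to one command, which is usually easier to audit in a do-file than a changed session setting.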
Maximization options
difficult specifies that the likelihood function is likely to be
difficult to maximize because of nonconcave regions. When the
message "not concave" appears repeatedly, ml's standard stepping
algorithm may not be working well. difficult specifies that a
different stepping algorithm be used in nonconcave regions. There is
no guarantee that difficult will work better than the default;
sometimes it is better, and sometimes it is worse. You should use
the difficult option only when the default stepper declares
convergence and the last iteration is "not concave" or when the
default stepper is repeatedly issuing "not concave" messages and
producing only tiny improvements in the log likelihood.
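A minimal sketch of the workflow described above, assuming the auto dataset and a poisson model (both chosen only for illustration; this model is well behaved and is not expected to print "not concave" messages):

```stata
* Illustrative sketch; the dataset and model are assumptions
sysuse auto, clear

* First fit with the default stepper and read the iteration log
poisson mpg weight foreign

* Only if the log shows repeated "not concave" messages with tiny gains,
* refit with the alternative stepper and compare the two logs
poisson mpg weight foreign, difficult
```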
technique(algorithm_spec) specifies how the likelihood function is to be
maximized. The following algorithms are allowed. For details, see
Gould, Pitblado, and Sribney (2006).
technique(nr) specifies Stata's modified Newton-Raphson (NR)
algorithm.
technique(bhhh) specifies the Berndt-Hall-Hall-Hausman (BHHH)
algorithm.
technique(dfp) specifies the Davidon-Fletcher-Powell (DFP) algorithm.
technique(bfgs) specifies the Broyden-Fletcher-Goldfarb-Shanno (BFGS)
algorithm.
The default is technique(nr).
You can switch between algorithms by specifying more than one in the
technique() option. By default, an algorithm is used for five
iterations before switching to the next algorithm. To specify a
different number of iterations, include the number after the
technique in the option. For example, specifying technique(bhhh 10
nr 1000) requests that ml perform 10 iterations with the BHHH
algorithm followed by 1000 iterations with the NR algorithm, and then
switch back to BHHH for 10 iterations, and so on. The process
continues until convergence or until the maximum number of iterations
is reached.
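The technique() specifications above might be used as in the following sketch. The auto dataset and the poisson model are assumptions for illustration, and the sketch assumes the command accepts technique(); not every estimation command does.

```stata
* Illustrative sketch; the dataset and model are assumptions
sysuse auto, clear

* Use a single non-default algorithm
poisson mpg weight foreign, technique(bfgs)

* Alternate between 10 BHHH iterations and 1000 NR iterations
poisson mpg weight foreign, technique(bhhh 10 nr 1000)

* No counts given: switch algorithms every five iterations (the default)
poisson mpg weight foreign, technique(bhhh nr)
```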
iterate(#) specifies the maximum number of iterations. When the number
of iterations equals iterate(), the optimizer stops and presents the
current results. If convergence is declared before this threshold is
reached, the optimizer stops at that point. Specifying
iterate(0) is useful for viewing results evaluated at the initial
value of the coefficient vector. Specifying iterate(0) and from()
together allows you to view results evaluated at a specified
coefficient vector; however, not all commands allow the from()
option. The default value of iterate(#) for both estimators
programmed internally and estimators programmed with ml is the
current value of set maxiter, which is iterate(16000) by default.
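As a hedged illustration of combining iterate(0) with from(), the sketch below stores a coefficient vector and then redisplays results evaluated at that vector without iterating. The dataset, the model, and the matrix name b0 are assumptions made for the example.

```stata
* Illustrative sketch; the dataset, model, and matrix name are assumptions
sysuse auto, clear

* Fit once and keep the coefficient vector
poisson mpg weight foreign
matrix b0 = e(b)

* Evaluate and report results at b0 without performing any iterations
poisson mpg weight foreign, from(b0) iterate(0)

* The reported coefficients should be the ones supplied in b0
matrix b1 = e(b)
display mreldif(b1, b0)
```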
log and nolog specify whether an iteration log showing the progress of
the log likelihood is to be displayed. For most commands, the log is
displayed by default, and nolog suppresses it. For a few commands
(such as the svy maximum likelihood estimators), you must specify log
to see the log.
trace adds to the iteration log a display of the current parameter
vector.
gradient adds to the iteration log a display of the current gradient
vector.
showstep adds to the iteration log a report on the steps within an
iteration. This option was added so that developers at StataCorp
could view the stepping when they were improving the ml optimizer
code. At this point, it mainly provides entertainment.
hessian adds to the iteration log a display of the current negative
Hessian matrix.
showtolerance adds to the iteration log the calculated value that is
compared with the effective convergence criterion at the end of each
iteration. Until convergence is achieved, the smallest calculated
value is reported.
shownrtolerance is a synonym of showtolerance.
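The display options can be combined in one call, as in this sketch. The dataset and model are assumptions for illustration, and the output can be long because each iteration prints several vectors and a matrix.

```stata
* Illustrative sketch; the dataset and model are assumptions
sysuse auto, clear

* Add the parameter vector, gradient vector, negative Hessian, and the
* value compared with the convergence criterion to each iteration's log
poisson mpg weight foreign, trace gradient hessian showtolerance
```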
-------------------------------------------------------------------------------
Below we describe the three convergence tolerances. Convergence is
declared when the nrtolerance() criterion is met and either the
tolerance() or the ltolerance() criterion is also met.
tolerance(#) specifies the tolerance for the coefficient vector. When
the relative change in the coefficient vector from one iteration to
the next is less than or equal to tolerance(), the tolerance()
convergence criterion is satisfied.
tolerance(1e-4) is the default for estimators programmed with ml.
tolerance(1e-6) is the default for all other estimators.
ltolerance(#) specifies the tolerance for the log likelihood. When the
relative change in the log likelihood from one iteration to the next
is less than or equal to ltolerance(), the ltolerance() convergence
criterion is satisfied.
ltolerance(0) is the default for estimators programmed with ml.
ltolerance(1e-7) is the default for all other estimators.
nrtolerance(#) specifies the tolerance for the scaled gradient. The
nrtolerance() criterion is satisfied when g*inv(H)*g' < nrtolerance().
The default is nrtolerance(1e-5).
nonrtolerance specifies that the default nrtolerance() criterion be
turned off.
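The tolerance options might be set as in the following sketch. The dataset, the model, and the particular tolerance values are assumptions chosen only to show the syntax; they are not recommendations.

```stata
* Illustrative sketch; the dataset, model, and values are assumptions
sysuse auto, clear

* Tighten all three criteria; showtolerance reports the compared value
poisson mpg weight foreign, tolerance(1e-7) ltolerance(1e-8) ///
    nrtolerance(1e-6) showtolerance

* Turn off the scaled-gradient criterion, leaving tolerance() and
* ltolerance() to decide convergence
poisson mpg weight foreign, nonrtolerance
```

The /// line continuation works in do-files; when typing interactively, enter the command on one line.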
-------------------------------------------------------------------------------
from() specifies initial values for the coefficients. Not all estimators
in Stata support this option. You can specify the initial values in
one of three ways: by specifying the name of a vector containing the
initial values (e.g., from(b0), where b0 is a properly labeled
vector); by specifying coefficient names with the values (e.g.,
from(age=2.1 /sigma=7.4)); or by specifying a list of values (e.g.,
from(2.1 7.4, copy)). from() is intended for use when you are doing
bootstraps (see [R] bootstrap) and in other special situations (e.g.,
with iterate(0)). Even when the values specified in from() are close
to the values that maximize the likelihood, only a few iterations may
be saved. Poor values in from() may lead to convergence problems.
skip specifies that any parameters found in the specified
initialization vector that are not also found in the model be
ignored. The default action is to issue an error message.
copy specifies that the list of values or the initialization vector
be copied into the initial-value vector by position rather than
by name.
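The three forms of init_specs, together with skip and copy, can be sketched as follows. The dataset, the model, the matrix name b0, and the starting values are all assumptions made for illustration, and the sketch assumes the command supports from().

```stata
* Illustrative sketch; the dataset, model, and values are assumptions
sysuse auto, clear
poisson mpg weight foreign
matrix b0 = e(b)

* 1. A labeled vector, matched to the model's parameters by name
poisson mpg weight foreign, from(b0)

* skip: b0 holds a coefficient on foreign that this model lacks;
* without skip, the extra parameter would produce an error
poisson mpg weight, from(b0, skip)

* 2. Coefficient names with values
poisson mpg weight foreign, from(weight=-0.0003 _cons=4)

* 3. A list of values copied by position: weight, foreign, _cons
poisson mpg weight foreign, from(-0.0003 -0.1 4, copy)
```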
Option for set maxiter
permanently specifies that, in addition to making the change right now,
the maxiter setting be remembered and become the default setting when
you invoke Stata.
Remarks
Only in rare circumstances would you ever need to specify any of these
options, except nolog. The nolog option is useful for reducing the
amount of output appearing in log files.
Saved results
Maximum likelihood estimators save the following in e():
Scalars
  e(N)                number of observations; always saved
  e(k)                number of parameters; always saved
  e(k_eq)             number of equations; usually saved
  e(k_eq_model)       number of equations to include in a model Wald
                      test; usually saved
  e(k_dv)             number of dependent variables; usually saved
  e(k_autoCns)        number of base, empty, and omitted constraints;
                      saved if command supports constraints
  e(df_m)             model degrees of freedom; always saved
  e(r2_p)             pseudo-R-squared; sometimes saved
  e(ll)               log likelihood; always saved
  e(ll_0)             log likelihood, constant-only model; saved when
                      constant-only model is fit
  e(N_clust)          number of clusters; saved when vce(cluster
                      clustvar) is specified; see [U] 20.16 Obtaining
                      robust variance estimates
  e(chi2)             chi-squared; usually saved
  e(p)                significance of model test; usually saved
  e(rank)             rank of e(V); always saved
  e(rank0)            rank of e(V) for constant-only model; saved when
                      constant-only model is fit
  e(ic)               number of iterations; usually saved
  e(rc)               return code; usually saved
  e(converged)        1 if converged, 0 otherwise; usually saved
Macros
  e(cmd)              name of command; always saved
  e(cmdline)          command as typed; always saved
  e(depvar)           names of dependent variables; always saved
  e(wtype)            weight type; saved when weights are specified or
                      implied
  e(wexp)             weight expression; saved when weights are specified
                      or implied
  e(title)            title in estimation output; usually saved by
                      commands using ml
  e(clustvar)         name of cluster variable; saved when vce(cluster
                      clustvar) is specified; see [U] 20.16 Obtaining
                      robust variance estimates
  e(chi2type)         Wald or LR; type of model chi-squared test; usually
                      saved
  e(vce)              vcetype specified in vce(); saved when command
                      allows vce()
  e(vcetype)          title used to label Std. Err.; sometimes saved
  e(opt)              type of optimization; always saved
  e(which)            max or min; whether optimizer is to perform
                      maximization or minimization; always saved
  e(ml_method)        type of ml method; always saved by commands using
                      ml
  e(user)             name of likelihood-evaluator program; always saved
  e(technique)        from technique() option; sometimes saved
  e(singularHmethod)  m-marquardt or hybrid; method used when Hessian is
                      singular; sometimes saved
  e(crittype)         optimization criterion; always saved
  e(properties)       estimator properties; always saved
  e(predict)          program used to implement predict; usually saved
Matrices
  e(b)                coefficient vector; always saved
  e(Cns)              constraints matrix; sometimes saved
  e(ilog)             iteration log (up to 20 iterations); usually saved
  e(gradient)         gradient vector; usually saved
  e(V)                variance-covariance matrix of the estimators;
                      always saved
  e(V_modelbased)     model-based variance; only saved when e(V) is
                      neither the OIM nor OPG variance
Functions
  e(sample)           marks estimation sample; always saved
See Saved results in the manual entry for any maximum likelihood
estimator for a list of returned results.
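A short sketch of inspecting some of these results after estimation follows. The dataset and model are assumptions for illustration; results marked "usually saved" or "sometimes saved" above, such as e(ilog), may be absent after other commands.

```stata
* Illustrative sketch; the dataset and model are assumptions
sysuse auto, clear
poisson mpg weight foreign, nolog

* Scalars: convergence flag, iteration count, and log likelihood
display e(converged)
display e(ic)
display e(ll)

* Macros: the command name and the command as typed
display "`e(cmd)'"
display "`e(cmdline)'"

* Matrices: coefficient vector and, when saved, the iteration log
matrix list e(b)
matrix list e(ilog)

* Function: count the observations in the estimation sample
count if e(sample)
```

Checking e(converged) before using e(b) or e(V) is a simple guard in do-files that fit many models unattended.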
Reference
Gould, W. W., J. Pitblado, and W. M. Sribney. 2006. Maximum Likelihood
Estimation with Stata. 3rd ed. College Station, TX: Stata Press.
Also see
Manual: [R] maximize
Help: [R] ml, [M-5] moptimize(), [M-5] optimize()