Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: inaccurate hessian from mata: deriv()

From   Sun Yutao <>
To   <>
Subject   st: inaccurate hessian from mata: deriv()
Date   Thu, 25 Oct 2012 19:24:30 +0200

Hello users,

I¡¯m having a problem on the function derive(), it looks like it¡¯s giving a 
very inaccurate hessian. Here the problem goes:

I wrote a Stata probit log-like function and enveloped it with a Mata 
function, which does no more than compute and sum the log-likelihood 
contributions from the Stata program.
function MleLikeMata(input_param,struct mPML_model scalar model,|LogLik) {
	if (cols(input_param)!=1) {
	} else {

	for(i=1;i<=rows(model.equation);i++) {
		if (model.equation[i,4]=="1") {[.,model.offset+i]=([.,data_range],J(rows(,1,1))*param[param_range,.]
		} else {[.,model.offset+i][.,data_range]*param[param_range,.] 

	stata(model.StataLike+" "+model.XB)
	if (args()==2) {
Where model is just an extra argument I need

Then I use this code to compute the gradients and hessian:
deriv_init_search(D, "off")deriv_init_params(D, 
deriv_init_argument(D, 1, model)
grad=deriv(D, 1)'
hess=deriv(D, 2)

It looks like it's giving a more or less OK gradient but a VERY inaccurate 
hessian (I'm comparing this with stata ml command with the same data and same 
parameter vector, but -ml- works well!) , i.e. I'm 100% sure the Stata probit 
log-likelihood function is concave and hence the hessian should be 
however, it gives some hessian like this.
  1 |   238418.5803
  2 |  -119209.2902    476837.1606
  3 |  -178813.9352   -327825.5479    417232.5155
  4 |  -59604.64508   -119209.2902   -208616.2578    357627.8705
  5 |   149011.6127   -178813.9352   -59604.64508    29802.32254 
The determinant is 2.43788e+27 which means it's certainly not 
negative-definite (and you even have a 0 in the diagonal)

Possible fix I tried:

1. I tried to specify it as a v-type evaluator (one that returns a row vector)
2. I tried to scale sum of the log-likelihood by number of observations
3. change the 2nd argument in -deriv_init_search(D, "off")- to "bracket" or 
"interpolate" (which gives even worse results)

Could anyone kindly explain me what's wrong and what I should do?

And infact I think I lost a lot features (fast convergence, etc.) from 
Newton-Raphson due to the inaccurate hessian.

Oh in case you need, the Stata likelihood function is:
program define like, nclass
	qui {
		args lnf XB
		replace `lnf' = ln( 1 - normprob(`XB'))     if $ML_y1==0
		replace `lnf' = ln(     normprob(`XB'))     if $ML_y1==1
Which is the common one for probit

Best regards,
Sun Yutao

*   For searches and help try:

© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index