# st: Problem using/understanding "mlmatbysum"

From: Mike Brewer
To: statalist@hsphsun2.harvard.edu
Subject: st: Problem using/understanding "mlmatbysum"
Date: Wed, 14 May 2008 17:13:23 +0100

Dear list - or at least, those who have used mlmatbysum, which I imagine is a very small subset of the list,

I am trying to use mlmatbysum to code up a d2 maximum likelihood estimator for a panel data model where my data does not meet the linear-form restriction (the data is 1 observation per time period per person).

The model is not a conventional panel data model, but, as the book "Maximum Likelihood Estimation with Stata" suggests (section 4.5.3), the Hessian matrix is such that the within-equation submatrices involve some observation-level outer-product matrices:

Sum over i=1...N of { sum over t=1..T_i of (d_it x1it'x1it) }

and some group-level outer-product matrices:

Sum over i=1...N of {
(Sum over t=1..T_i of a_it)
x (Sum over t=1..T_i of b_it x1it')
x (Sum over t=1..T_i of b_it x1it) }

and:

Sum over i=1...N of {
(Sum over t=1..T_i of a_it)
x (Sum over t=1..T_i of b_it x1it')
x (Sum over t=1..T_i of c_it x2it) }

If that makes any sense written in plain text!

i=1...N indexes people (the group index) and t indexes observations within a group (time), and there are T_i observations for individual i (see p117 of "Maximum Likelihood Estimation with Stata").
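
To make the plain-text notation concrete, here is a minimal sketch in plain Python (not Stata, and not the code from my model) that evaluates the observation-level sum and the group-level sum on a tiny made-up data set. The field names "id", "a", "b", "c", "d", "x1" and "x2" simply mirror the symbols above, and the numbers are invented for illustration. The version with b_it x1it in both factors is the same calculation with c = b and x2 = x1.

```python
from collections import defaultdict

def outer(u, v):
    """Outer product u'v of two row vectors, returned as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    """Multiply every element of matrix A by the scalar s."""
    return [[s * a for a in row] for row in A]

def zeros(r, c):
    return [[0.0] * c for _ in range(r)]

def obs_level(rows):
    """Sum over all (i, t) of d_it * x1it'x1it."""
    k = len(rows[0]["x1"])
    H = zeros(k, k)
    for r in rows:
        H = add(H, scale(r["d"], outer(r["x1"], r["x1"])))
    return H

def group_level(rows):
    """Sum over i of (sum_t a_it) * (sum_t b_it x1it')(sum_t c_it x2it)."""
    k1, k2 = len(rows[0]["x1"]), len(rows[0]["x2"])
    sum_a = defaultdict(float)
    sum_bx1 = defaultdict(lambda: [0.0] * k1)
    sum_cx2 = defaultdict(lambda: [0.0] * k2)
    # First pass: accumulate the three within-group sums separately.
    for r in rows:
        g = r["id"]
        sum_a[g] += r["a"]
        sum_bx1[g] = [s + r["b"] * x for s, x in zip(sum_bx1[g], r["x1"])]
        sum_cx2[g] = [s + r["c"] * x for s, x in zip(sum_cx2[g], r["x2"])]
    # Second pass: one term per group, not one term per observation.
    H = zeros(k1, k2)
    for g in sum_a:
        H = add(H, scale(sum_a[g], outer(sum_bx1[g], sum_cx2[g])))
    return H

# Invented toy data: person 1 has two periods, person 2 has one.
rows = [
    {"id": 1, "a": 1, "b": 2, "c": 1, "d": 1, "x1": [1, 2], "x2": [3]},
    {"id": 1, "a": 2, "b": 1, "c": 2, "d": 2, "x1": [0, 1], "x2": [1]},
    {"id": 2, "a": 1, "b": 1, "c": 1, "d": 1, "x1": [2, 1], "x2": [2]},
]

print(obs_level(rows))
print(group_level(rows))
```

The point of the second function is that each group contributes exactly one term, built from three sums that are each taken within the group before anything is multiplied together.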

The manual advises using "mlmatbysum" to create matrices which are:

Sum over i=1...N of
{ (Sum over t=1..T_i of a_it)
x (Sum over t=1..T_i of b_it x1it')
x (Sum over t=1..T_i of b_it x1it) }

or:

Sum over i=1...N of
{ (Sum over t=1..T_i of a_it)
x (Sum over t=1..T_i of b_it x1it')
x (Sum over t=1..T_i of c_it x2it) }

But I am failing to implement this correctly, and am hoping for some tips.

I am 99% sure that I have written down the 2nd derivatives in my model correctly, as d2debug confirms that I have the Hessian correct in all of my pairs of equations except where I have to use "mlmatbysum".

Furthermore, if I artificially constrain my model so that each equation contains only a single variable (with no constant), so that all of the sub-matrices are really scalars, then I can correctly code the negative Hessian using terms formed with "egen ... sum(..), by($MY_panel)" and an "mlmatsum" command. I cannot manage to code the equivalent using mlmatbysum.

So all I can think of is that I am using the mlmatbysum command incorrectly.

Does anyone have any general tips or experience to share?

What I am finding is that the values for the negative Hessian that I code using mlmatbysum are many orders of magnitude larger than the numerical ones produced by d2debug. For example, N is 1000, T_i is 20 on average, and the mean of the relevant X variable is 10; d2debug says the correct value might be -1000, but my code using mlmatbysum says -10^9.

many thanks, Mike

--
Mike Brewer
Programme Director, Direct Tax and Welfare
Institute for Fiscal Studies, www.ifs.org.uk, 020 72914800

