Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: moptimize gf evaluator


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: moptimize gf evaluator
Date   Tue, 16 Nov 2010 08:34:31 -0600

Klaus Pforr <kpforr@googlemail.com> is using Mata's -moptimize()- function
for fitting a panel-data model, and asks about the -gf2- evaluator type:

> I posted this already a few days ago, but obviously put my question to 
> complicated to receive any answers. Is it true, that the gf type 
> evaluator for moptimize expects a score matrix of dimension L x K, with 
> L being the the number of independet elements and K being the number of 
> coefficients? I do not understand how I can derive such a matrix with 
> panel data, for which this evaluator type is recommended. Is this a 
> matrix of dL[j]/dbeta[i], with L[j] being the log likelihood 
> contribution of panel group j derived with respect to beta coeffient i?

The short answer is yes.

The following discussion show's how to implement a panel-data model using
-moptimize()-.

This kind of thing is difficult to discuss without a model to implment as an
example, so we'll use the random-effects linear regression model.

This model is described in the current -ml- book [see Gould et al. (2010),
chapter 15, section 6, page 268], and the derivatives are included in the
model description.  The -ml- book exhibits the Stata code that implements the
-d2- evaluator for this model.  We've adapted this code into a Mata function
called -rereg_gf2()- that is suitable as a -gf2- evaluator for -moptimize()-.

Other than translating the Stata code into Mata, we made the following two key
changes:

1. return the log-likelihood values at the panel level

2. return the derivatives of the panel-level log-likelihood values with
   respect to the coefficients (each regression coefficient and the two scale
   parameters)

Here is the Mata code for -rereg_gf2()-:

***** BEGIN: rereg_gf2.mata
version 11

mata:

void rereg_gf2(		scalar	M,
			real	scalar	todo,
			real	vector	b,
			real	vector	lnfj,
			real	matrix	S,
			real	matrix	H)
{
	real	scalar	s2_u
	real	scalar	s2_e

	real	matrix	info
	real	scalar	k
	real	scalar	i

	real	matrix	sub
	real	vector	z
	real	vector	S_z2
	real	vector	Sz_2
	real	vector	T
	real	vector	a
	real	matrix	X
	real	scalar	Xobs
	real	matrix	subX
	real	matrix	part
	real	vector	S_z

	real	vector	d2xb1
	real	vector	d2xb2
	real	vector	d2xb
	real	vector	d2u
	real	vector	d2e
	real	vector	dxbdu
	real	vector	dxbde
	real	vector	dude
	real	vector	lnf

	// random effects scale parameter
	s2_u	= exp(2*moptimize_util_xb(M,b,2))	// scalar
	// residual error scale parameter
	s2_e	= exp(2*moptimize_util_xb(M,b,3))	// scalar

	// use the panel identifier to get the information we need to perform
	// calculations at the panel level -- the data is assumed already
	// sorted
	info	= panelsetup(moptimize_init_userinfo(M,1), 1)
	// k is the number of panels
	k	= rows(info)

	z	= moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
	S_z2	= J(k,1,.)
	Sz_2	= J(k,1,.)
	T	= J(k,1,.)
	a	= J(k,1,.)

	// compute the log-likelihood values for each panel
	for (i=1; i<=k; i++) {
		sub	= panelsubmatrix(z, i, info)
		S_z2[i] = cross(sub,sub)
		Sz_2[i] = quadsum(sub)^2
		T[i]	= info[i,2] - info[i,1] + 1
		a[i]	= s2_u / (T[i]*s2_u + s2_e)
	}
	lnfj	= -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
			 ln(T :* s2_u/s2_e :+ 1) :+
			 T:*ln(2*c("pi")*s2_e))
	if (todo == 0 | missing(lnfj)) return

	// compute the scores -- derivative of the above log-likelihood values
	// with respect to each coefficient in the model
	X	= moptimize_util_indepvars(M, 1)
	Xobs	= (rows(X) > 1)
	S	= J(k,cols(b),.)
	S_z	= J(k,1,.)
	part	= moptimize_util_eq_indices(M,1)'
	subX	= 1
	// derivatives with respect to the regression coefficients
	for (i=1; i<=k; i++) {
		sub	= panelsubmatrix(z, i, info)
		S_z[i]	= quadsum(sub)
		if (Xobs) {
			subX = panelsubmatrix(X,i,info)
		}
		part[1,1] = part[2,1] = i
		S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
	}
	// derivatives with respect to ln(s_u)
	part	= moptimize_util_eq_indices(M,2)[2,2]
	S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
	// derivatives with respect to ln(s_e)
	part++
	S[.,part] = S_z2 :/ s2_e :-
		    a :* Sz_2 :/ s2_e :-
		    (a:^2) :* Sz_2 :/ s2_u :-
		    T :+ 1 :- a :* s2_e/s2_u
	if (todo == 1 | missing(S)) return

	// compute the Hessian
	lnf	= 1
	d2u	= colsum(	2*a:^2:*Sz_2:/s2_u :-
		  		4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
		  		2*T:*a:^2*s2_e/s2_u		)
	d2e	= colsum(	2*(S_z2 :- a :* Sz_2)/s2_e :-
		  		2*a:^2:*Sz_2/s2_u :-
		  		4*a:^3:*Sz_2*s2_e/s2_u^2 :+
		  		2*a*s2_e/s2_u :-
		  		2*a:^2*s2_e^2/s2_u^2		)
	dude	= colsum(	4*a:^3:*Sz_2*s2_e/s2_u^2 :-
		  		2*T:*a:^2*s2_e/s2_u		)

	dxbdu	= J(rows(z),1,.)
	dxbde	= J(rows(z),1,.)
	part	= J(2,2,1)
	for (i=1; i<=k; i++) {
		sub	= panelsubmatrix(z, i, info)
		part[1,1]	= info[i,1]
		part[2,1]	= info[i,2]
		dxbdu[|part|]	= J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
		dxbde[|part|]	= 2*(sub :- a[i]*S_z[i])/s2_e :-
				  2*a[i]^2*S_z[i]/s2_u
	}
	dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
	dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)

	d2xb2	= moptimize_util_matsum(M, 1, 1, 1, lnf)
	sub	= J(rows(z),1,.)
	part	= J(2,2,1)
	for (i=1; i<=k; i++) {
		part[1,1]	= info[i,1]
		part[2,1]	= info[i,2]
		sub[|part|]	= J(T[i],1,0)
		sub[info[i,2]]	= a[i]
	}
	d2xb1	= moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
	d2xb	= (d2xb2 - d2xb1)/s2_e
	H	= -(d2xb,	dxbdu,	dxbde	\
		    dxbdu',	d2u,	dude	\
		    dxbde',	dude,	d2e)
}
end
exit
***** END: rereg_gf2.mata

To show that the above evaluator works properly, here is an example that
fits a model using -moptimize()- with -rereg_gf2()- and compares the results
with -xtreg, mle-.

Notice that we -include- the Mata source for 'rereg_gf2.mata' at the top of
our example, we need to define -rereg_gf2()- for Mata before we can start
making references to it.  The -clear mata- allows us to rerun our example
without having Mata complain that -rereg_gf2()- is already defined.

***** BEGIN: example.do
clear mata
include rereg_gf2.mata

use http://www.stata-press.com/data/ml4/tablef7-1

// starting values for the constant-only model
sum lnC, mean
scalar xb = r(mean)
quietly oneway lnC i
local np	= r(df_m) + 1
local N		= r(N) + 1
local bms	= r(mss)/r(df_m)	// between mean squares
local wms	= r(rss)/r(df_r)	// within mean squares
scalar lns_u	= log( (`bms'-`wms')*`np'/`N' )/2
scalar lns_e	= log(`wms')/2

sort i

mata:

// setup for the constant-only model
M = moptimize_init()
moptimize_init_evaluator(M, &rereg_gf2())
moptimize_init_evaluatortype(M, "gf2")
moptimize_init_valueid(M, "log-likelihood")

// variable that identifies the panels
st_view(panel=., ., "i")
moptimize_init_userinfo(M, 1, panel)
moptimize_init_by(M, panel)	// needed by -moptimize_util_matbysum()-

// setup for the equations, including starting values
moptimize_init_eq_n(M, 3)
moptimize_init_eq_name(M, 1, "xb")
moptimize_init_eq_name(M, 2, "lns_u")
moptimize_init_eq_name(M, 3, "lns_e")
moptimize_init_depvar(M, 1, "lnC")
moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))
moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))
moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))

// we've supplied the starting values, so we turn searching off
moptimize_init_search(M, "off")

// fit the constant-only model
moptimize(M)
moptimize_result_display(M)

// now use the resulting fit as starting values for another model that
// contains independent variables
b0 = moptimize_result_coefs(M)
moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")
moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))
moptimize_init_eq_coefs(M, 2, b0[2])
moptimize_init_eq_coefs(M, 3, b0[3])
moptimize(M)
moptimize_result_display(M)

// store the fitted coefficients in a Stata matrix so we can compare with the
// results from -xtreg, mle-
st_matrix("myb", moptimize_result_coefs(M))

end

xtset i
xtreg lnC lnQ lnPF , mle
matrix b = e(b)
local dim = colsof(b)
// transform the scale parameters to compare with -rereg_gf2()-'s results
matrix b[1,`dim'-1] = ln(b[1,`dim'-1])
matrix b[1,`dim'] = ln(b[1,`dim'])

// the maximum relative difference between the two model fits should be small
di mreldif(b, myb)
***** END: example.do

Here is a log of the output from running the above example:

***** BEGIN: example.log
. include rereg_gf2.mata

. 
. version 11

. 
. mata:
------------------------------------------------- mata (type end to exit) -----
: 
: void rereg_gf2(         scalar  M,
>                         real    scalar  todo,
>                         real    vector  b,
>                         real    vector  lnfj,
>                         real    matrix  S,
>                         real    matrix  H)
> {
>         real    scalar  s2_u
>         real    scalar  s2_e
> 
>         real    matrix  info
>         real    scalar  k
>         real    scalar  i
> 
>         real    matrix  sub
>         real    vector  z
>         real    vector  S_z2
>         real    vector  Sz_2
>         real    vector  T
>         real    vector  a
>         real    matrix  X
>         real    scalar  Xobs
>         real    matrix  subX
>         real    matrix  part
>         real    vector  S_z
> 
>         real    vector  d2xb1
>         real    vector  d2xb2
>         real    vector  d2xb
>         real    vector  d2u
>         real    vector  d2e
>         real    vector  dxbdu
>         real    vector  dxbde
>         real    vector  dude
>         real    vector  lnf
> 
>         // random effects scale parameter
>         s2_u    = exp(2*moptimize_util_xb(M,b,2))       // scalar
>         // residual error scale parameter
>         s2_e    = exp(2*moptimize_util_xb(M,b,3))       // scalar
> 
>         // use the panel identifier to get the information we need to perform
>         // calculations at the panel level -- the data is assumed already
>         // sorted
>         info    = panelsetup(moptimize_init_userinfo(M,1), 1)
>         // k is the number of panels
>         k       = rows(info)
> 
>         z       = moptimize_init_depvar(M,1) :- moptimize_util_xb(M,b,1)
>         S_z2    = J(k,1,.)
>         Sz_2    = J(k,1,.)
>         T       = J(k,1,.)
>         a       = J(k,1,.)
> 
>         // compute the log-likelihood values for each panel
>         for (i=1; i<=k; i++) {
>                 sub     = panelsubmatrix(z, i, info)
>                 S_z2[i] = cross(sub,sub)
>                 Sz_2[i] = quadsum(sub)^2
>                 T[i]    = info[i,2] - info[i,1] + 1
>                 a[i]    = s2_u / (T[i]*s2_u + s2_e)
>         }
>         lnfj    = -.5 * ((S_z2 :- a :* Sz_2)/s2_e :+
>                          ln(T :* s2_u/s2_e :+ 1) :+
>                          T:*ln(2*c("pi")*s2_e))
>         if (todo == 0 | missing(lnfj)) return
> 
>         // compute the scores -- derivative of the above log-likelihood values
>         // with respect to each coefficient in the model
>         X       = moptimize_util_indepvars(M, 1)
>         Xobs    = (rows(X) > 1)
>         S       = J(k,cols(b),.)
>         S_z     = J(k,1,.)
>         part    = moptimize_util_eq_indices(M,1)'
>         subX    = 1
>         // derivatives with respect to the regression coefficients
>         for (i=1; i<=k; i++) {
>                 sub     = panelsubmatrix(z, i, info)
>                 S_z[i]  = quadsum(sub)
>                 if (Xobs) {
>                         subX = panelsubmatrix(X,i,info)
>                 }
>                 part[1,1] = part[2,1] = i
>                 S[|part|] = quadcolsum( ((sub :- a[i]*S_z[i])/s2_e) :* subX )
>         }
>         // derivatives with respect to ln(s_u)
>         part    = moptimize_util_eq_indices(M,2)[2,2]
>         S[.,part] = a:^2 :* Sz_2/s2_u :- T :* a
>         // derivatives with respect to ln(s_e)
>         part++
>         S[.,part] = S_z2 :/ s2_e :-
>                     a :* Sz_2 :/ s2_e :-
>                     (a:^2) :* Sz_2 :/ s2_u :-
>                     T :+ 1 :- a :* s2_e/s2_u
>         if (todo == 1 | missing(S)) return
> 
>         // compute the Hessian
>         lnf     = 1
>         d2u     = colsum(       2*a:^2:*Sz_2:/s2_u :-
>                                 4*s2_e*a:^3:*Sz_2:/s2_u^2 :+
>                                 2*T:*a:^2*s2_e/s2_u             )
>         d2e     = colsum(       2*(S_z2 :- a :* Sz_2)/s2_e :-
>                                 2*a:^2:*Sz_2/s2_u :-
>                                 4*a:^3:*Sz_2*s2_e/s2_u^2 :+
>                                 2*a*s2_e/s2_u :-
>                                 2*a:^2*s2_e^2/s2_u^2            )
>         dude    = colsum(       4*a:^3:*Sz_2*s2_e/s2_u^2 :-
>                                 2*T:*a:^2*s2_e/s2_u             )
> 
>         dxbdu   = J(rows(z),1,.)
>         dxbde   = J(rows(z),1,.)
>         part    = J(2,2,1)
>         for (i=1; i<=k; i++) {
>                 sub     = panelsubmatrix(z, i, info)
>                 part[1,1]       = info[i,1]
>                 part[2,1]       = info[i,2]
>                 dxbdu[|part|]   = J(T[i],1,2*a[i]^2*S_z[i]/s2_u)
>                 dxbde[|part|]   = 2*(sub :- a[i]*S_z[i])/s2_e :-
>                                   2*a[i]^2*S_z[i]/s2_u
>         }
>         dxbdu = moptimize_util_matsum(M, 1, 2, dxbdu, lnf)
>         dxbde = moptimize_util_matsum(M, 1, 3, dxbde, lnf)
> 
>         d2xb2   = moptimize_util_matsum(M, 1, 1, 1, lnf)
>         sub     = J(rows(z),1,.)
>         part    = J(2,2,1)
>         for (i=1; i<=k; i++) {
>                 part[1,1]       = info[i,1]
>                 part[2,1]       = info[i,2]
>                 sub[|part|]     = J(T[i],1,0)
>                 sub[info[i,2]]  = a[i]
>         }
>         d2xb1   = moptimize_util_matbysum(M, 1, sub, J(rows(z),1,1), lnf)
>         d2xb    = (d2xb2 - d2xb1)/s2_e
>         H       = -(d2xb,       dxbdu,  dxbde   \
>                     dxbdu',     d2u,    dude    \
>                     dxbde',     dude,   d2e)
> }

: 
: mata mosave rereg_gf2(), replace
(file rereg_gf2.mo created)

: end
-------------------------------------------------------------------------------

. exit

. 
. use http://www.stata-press.com/data/ml4/tablef7-1

. 
. // starting values for the constant-only model
. sum lnC, mean

. scalar xb = r(mean)

. quietly oneway lnC i

. local np        = r(df_m) + 1

. local N         = r(N) + 1

. local bms       = r(mss)/r(df_m)        // between mean squares

. local wms       = r(rss)/r(df_r)        // within mean squares

. scalar lns_u    = log( (`bms'-`wms')*`np'/`N' )/2

. scalar lns_e    = log(`wms')/2

. 
. sort i

. 
. mata:
------------------------------------------------- mata (type end to exit) -----
: 
: // setup for the constant-only model
: M = moptimize_init()

: moptimize_init_evaluator(M, &rereg_gf2())

: moptimize_init_evaluatortype(M, "gf2")

: moptimize_init_valueid(M, "log-likelihood")

: 
: // variable that identifies the panels
: st_view(panel=., ., "i")

: moptimize_init_userinfo(M, 1, panel)

: moptimize_init_by(M, panel)     // needed by -moptimize_util_matbysum()-

: 
: // setup for the equations, including starting values
: moptimize_init_eq_n(M, 3)

: moptimize_init_eq_name(M, 1, "xb")

: moptimize_init_eq_name(M, 2, "lns_u")

: moptimize_init_eq_name(M, 3, "lns_e")

: moptimize_init_depvar(M, 1, "lnC")

: moptimize_init_eq_coefs(M, 1, st_numscalar("xb"))

: moptimize_init_eq_coefs(M, 2, st_numscalar("lns_u"))

: moptimize_init_eq_coefs(M, 3, st_numscalar("lns_e"))

: 
: // we've supplied the starting values, so we turn searching off
: moptimize_init_search(M, "off")

: 
: // fit the constant-only model
: moptimize(M)
Iteration 0:   log-likelihood = -103.47286  
Iteration 1:   log-likelihood = -103.43139  
Iteration 2:   log-likelihood =  -103.4311  
Iteration 3:   log-likelihood =  -103.4311  

: moptimize_result_display(M)

                                                  Number of obs   =         90

------------------------------------------------------------------------------
         lnC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
       _cons |   13.36561   .3718818    35.94   0.000     12.63673    14.09448
-------------+----------------------------------------------------------------
lns_u        |
       _cons |  -.1124866   .2999833    -0.37   0.708    -.7004432    .4754699
-------------+----------------------------------------------------------------
lns_e        |
       _cons |  -.3790205   .0771517    -4.91   0.000     -.530235    -.227806
------------------------------------------------------------------------------

: 
: // now use the resulting fit as starting values for another model that
: // contains independent variables
: b0 = moptimize_result_coefs(M)

: moptimize_init_eq_indepvars(M, 1, "lnQ lnPF")

: moptimize_init_eq_coefs(M, 1, (0,0,b0[1]))

: moptimize_init_eq_coefs(M, 2, b0[2])

: moptimize_init_eq_coefs(M, 3, b0[3])

: moptimize(M)
Iteration 0:   log-likelihood =  -103.4311  (not concave)
Iteration 1:   log-likelihood = -36.516249  (not concave)
Iteration 2:   log-likelihood =  25.743727  (not concave)
Iteration 3:   log-likelihood =    38.3557  (not concave)
Iteration 4:   log-likelihood =  52.292856  (not concave)
Iteration 5:   log-likelihood =   62.34649  (not concave)
Iteration 6:   log-likelihood =   70.57777  (not concave)
Iteration 7:   log-likelihood =  80.016528  
Iteration 8:   log-likelihood =  98.680215  
Iteration 9:   log-likelihood =  102.04372  
Iteration 10:  log-likelihood =  102.05915  
Iteration 11:  log-likelihood =  102.05915  

: moptimize_result_display(M)

                                                  Number of obs   =         90

------------------------------------------------------------------------------
         lnC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
         lnQ |   .8649719   .0264724    32.67   0.000     .8130869    .9168569
        lnPF |   .3993113   .0146384    27.28   0.000     .3706205    .4280021
       _cons |   9.282005   .2179675    42.58   0.000     8.854796    9.709213
-------------+----------------------------------------------------------------
lns_u        |
       _cons |  -2.131847    .295808    -7.21   0.000     -2.71162   -1.552074
-------------+----------------------------------------------------------------
lns_e        |
       _cons |  -2.680509   .0771648   -34.74   0.000    -2.831749   -2.529268
------------------------------------------------------------------------------

: 
: // store the fitted coefficients in a Stata matrix so we can compare with the
: // results from -xtreg, mle-
: st_matrix("myb", moptimize_result_coefs(M))

: 
: end
-------------------------------------------------------------------------------

. 
. xtset i
       panel variable:  i (balanced)

. xtreg lnC lnQ lnPF , mle

Fitting constant-only model:
Iteration 0:   log likelihood = -3679.9546
Iteration 1:   log likelihood = -2042.3896
Iteration 2:   log likelihood = -1132.4148
Iteration 3:   log likelihood = -631.57621
Iteration 4:   log likelihood = -360.64085
Iteration 5:   log likelihood = -218.61016
Iteration 6:   log likelihood = -148.36245
Iteration 7:   log likelihood = -117.25507
Iteration 8:   log likelihood = -106.17927
Iteration 9:   log likelihood = -103.65887
Iteration 10:  log likelihood = -103.43395
Iteration 11:  log likelihood =  -103.4311

Fitting full model:
Iteration 0:   log likelihood =  101.61076
Iteration 1:   log likelihood =  102.03853
Iteration 2:   log likelihood =  102.05913
Iteration 3:   log likelihood =  102.05915

Random-effects ML regression                    Number of obs      =        90
Group variable: i                               Number of groups   =         6

Random effects u_i ~ Gaussian                   Obs per group: min =        15
                                                               avg =      15.0
                                                               max =        15

                                                LR chi2(2)         =    410.98
Log likelihood  =  102.05915                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         lnC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lnQ |   .8649719   .0264724    32.67   0.000      .813087    .9168569
        lnPF |   .3993113   .0146384    27.28   0.000     .3706205    .4280021
       _cons |   9.282005   .2179674    42.58   0.000     8.854796    9.709213
-------------+----------------------------------------------------------------
    /sigma_u |   .1186181   .0350882                      .0664291    .2118083
    /sigma_e |   .0685283    .005288                      .0589098    .0797173
         rho |   .7497583    .114912                      .4861774    .9165317
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)=  101.26 Prob>=chibar2 = 0.000

. matrix b = e(b)

. local dim = colsof(b)

. // transform the scale parameters to compare with -rereg_gf2()-'s results
. matrix b[1,`dim'-1] = ln(b[1,`dim'-1])

. matrix b[1,`dim'] = ln(b[1,`dim'])

. 
. // the maximum relative difference between the two model fits should be small
. di mreldif(b, myb)
5.354e-08
***** END: example.log

Reference:

Gould, W., J. Pitblado, and B. Poi. (2010) Maximum likelihood estimation with
	Stata.  StataPress: CS, TX.

--Jeff
jpitblado@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index