# st: mle programming problem

 From "Jitian Sheu" To Subject st: mle programming problem Date Sun, 18 Nov 2007 12:30:50 +0800

```Dear all:
I have encountered some stata programming difficulties which I have
been struggling for over two weeks. I would be very appreciated if someone
can help me out. I am working on a project that concerns the patient
decision at each doctor's appointment. At each appointment, based on his
personal assessment, the patient can decide whether to leave the program. My
model is similar to the simple logit model except that the patient may see
doctors many visits in one treatment episode. Because I hope to incorporate
more features in the future, my plan is to write a program that accounts the
panel feature of my data, that is, the panel structure of logit model.
I start to write the program following the manual's description. In
order to account for sampling weight, I modify the program by summing up the
likelihood value (lnf) and the first derivative (g1) on the last observation
of a patient.

+++++++++++++++program begins from here+++++++++++++++++++++++++++++
capture program drop mylogit
program define mylogit
version 6.0
args todo b lnf g
local m  "\$ML_y1"
local id "\$ML_y2"
local t  "\$ML_y3"
tempvar   xb lnfjt lnfj last
mleval    `xb' = `b', eq(1)
sort    `id' `t'
quietly gen double `lnfjt' = invlogit(`xb') if \$ML_y1==1
quietly replace `lnfjt' = invlogit(-`xb') if \$ML_y1==0
quietly replace `lnfjt' = -1000000 if `lnfjt'==.
by `id': gen double `lnfj' = cond(_n==_N, sum(`lnfjt'), 0)
by `id': gen byte `last' = (_n==_N)
mlsum   `lnf'=`lnfj'
if `todo'==0 | `lnf'==. {exit}
tempvar  gj g1
quietly gen double `gj'= invlogit(-`xb') if \$ML_y1==1
quietly   replace    `gj'= -invlogit(`xb') if \$ML_y1==0
quietly   replace    `gj'= -1000000 if `gj'==.
by `id': gen double `g1' = cond(_n==_N, sum(`gj'), 0)
tempname d
mlvecsum `lnf' `d' = `g1', eq(1)
matrix  `g' = (`d')
end

local   mix_exp = `"pgsabs_n1 lapse_n1 male married"'
ml model d1debug mylogit (visit_drop id visit_n=`mix_exp')
ml check
+++++++++++++++program ends here+++++++++++++++++++++++++++++

It works fine for d0 setting, but I cannot get any good results under d1
setting. The ml routine continues to show that the message that encountering
the flat zone. To verify if anything went wrong in the programming, I have
used d1debug, The following is the output from the log file

++++++++++++++++++++++++++ output from here+++++++++++++++++++++++++
initial:       log likelihood = -5262.5172
rescale:       log likelihood = -1062.2892
----------------------------------------------------------------------------
--
Iteration 0:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -141.7159   .6956784  -89.34516  -31.67742  -229.5808
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -91.89873   7.046755  -89.34647  -31.68104  -229.5811
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   .5362523   .7892718   .0000145   .0001107   9.77e-07
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood = -1062.2892
----------------------------------------------------------------------------
--

Iteration 1:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -47.60062   8.599492   3.409675   .9904678  -8.790937
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   -7.47316  -4.719453   3.409675   .9904678  -8.790937
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   4.735832    2.32871   7.55e-10   1.14e-08   2.76e-12
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-963.75566
----------------------------------------------------------------------------
--
Iteration 2:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -42.00725   9.942545  -.9793901   -.388497  -.5770592
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   .2052727   .2083194  -.9793901   -.388497  -.5770595
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   35.02322   8.056004   6.07e-09   2.48e-08   1.86e-07
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.0815
----------------------------------------------------------------------------
--
Iteration 3:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -41.95821   9.948897  -.0044665  -.0020055  -.0063814
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -.0010865  -.0009244  -.0044729  -.0020138   -.006392
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   41.91158   9.940632   6.41e-06   8.25e-06   .0000105
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.07389
----------------------------------------------------------------------------
--
Iteration 4:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -41.95604   9.949573   6.39e-06   8.24e-06   .0000105
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -1.17e-08  -9.21e-09  -5.09e-08  -2.79e-08  -7.27e-08
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   41.95603   9.949573   6.44e-06   8.27e-06   .0000106
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.07389
----------------------------------------------------------------------------
--
Number of obs   =       4198
Wald chi2(4)    =      59.49
og likelihood = -962.07389                       Prob > chi2     =
0.0000
----------------------------------------------------------------------------
-
|      Coef.   Std. Err.      z    P>|z|     [95% Conf.
Interval]
-------------+--------------------------------------------------------------
--
pgsabs_n1 |  -.5002439   .1640796    -3.05   0.002     -.821834   -.1786537
lapse_n1 |   .9033341   .1925922     4.69   0.000     .5258603    1.280808
male |    .622521    .138544     4.49   0.000     .3509797    .8940623
married |    .173932   .1530517     1.14   0.256    -.1260438    .4739078
_cons |  -3.079393   .1224509   -25.15   0.000    -3.319393   -2.839394
----------------------------------------------------------------------------
--

++++++++++++++++++++++++++++++output end here++++++++++++++++++++++
++++++++++++++++++++++++++++++output end here++++++++++++++++++++++
++++++++++++++++++++++++++++++output end here++++++++++++++++++++++

There are two things I do not quite understand. The first is that the five
variables are entered as one vector in the stata program. Surprisingly, the
first two seems to be off quite a lot, while the rest looks fine. Although
the first two variables are time-variant (the assessments from each
appointment) and the rest variables are time-invariant (sex and marital
status), I still do not know why the first two variables behave so
differently. In order to verify the possible error, I have changed one line
of command from

mlvecsum `lnf' `d' = `g1', eq(1)
to
mlvecsum `lnf' `d' = `gj', eq(1)

The result suggests that the result from d1 is now consistent with that from
d0 setting.

===========================testing results here======================
===========================testing results here======================

initial:       log likelihood = -5273.5355
rescale:       log likelihood = -1060.5017
----------------------------------------------------------------------------
--
Iteration 0:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   -90.9193   7.283813  -87.46658  -31.02436  -226.1624
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -90.91995   7.261631  -87.46793   -31.0281  -226.1626
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   7.11e-06    .002685   .0000153   .0001169   9.96e-07

d1debug:  End derivative-comparison report
-----------------------------------
log likelihood = -1060.5017
----------------------------------------------------------------------------
--
Iteration 1:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -7.323452  -4.912105   3.320771   .9620532  -8.681316
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -7.323452  -4.912105   3.320772   .9620532  -8.681316
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   5.35e-10   5.51e-10   8.44e-10   1.17e-08   6.52e-10
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-963.73673
----------------------------------------------------------------------------
--
Iteration 2:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   -.564821  -.3171562  -.4371455  -.2073282  -1.310501
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -.5648211  -.3171563  -.4371458  -.2073283  -1.310501
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   4.10e-08   7.86e-08   2.35e-07   2.81e-08   2.52e-08
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.08129
----------------------------------------------------------------------------
--
Iteration 3:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -.0033874  -.0013826  -.0016199  -.0009095  -.0064454
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -.0033923  -.0013879  -.0016236  -.0009182  -.0064491
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   4.86e-06   5.20e-06   3.73e-06   8.65e-06   3.61e-06
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.07389
----------------------------------------------------------------------------
--
Iteration 4:
d1debug:  Begin derivative-comparison report
---------------------------------
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   4.75e-06   5.17e-06   3.68e-06   8.63e-06   3.43e-06
d1debug:  numerically calculated gradient (used for stepping):
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1  -1.25e-07  -3.14e-08  -5.49e-08  -3.00e-08  -2.10e-07
d1debug:  relative difference:
eq1:       eq1:       eq1:       eq1:       eq1:
pgsabs_n1   lapse_n1       male    married      _cons
r1   4.87e-06   5.20e-06   3.73e-06   8.66e-06   3.64e-06
d1debug:  End derivative-comparison report
-----------------------------------
log likelihood =
-962.07389
----------------------------------------------------------------------------
--
Number of obs   =       4198
Wald chi2(4)    =      59.49
Log likelihood = -962.07389                       Prob > chi2     =
0.0000

----------------------------------------------------------------------------
-
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+--------------------------------------------------------------
--
pgsabs_n1 |  -.5002439   .1640796    -3.05   0.002     -.821834   -.1786538
lapse_n1 |    .903334   .1925922     4.69   0.000     .5258601    1.280808
male |    .622521    .138544     4.49   0.000     .3509797    .8940623
married |   .1739319   .1530517     1.14   0.256    -.1260438    .4739077
_cons |  -3.079393   .1224509   -25.15   0.000    -3.319393   -2.839394
----------------------------------------------------------------------------
-

===========================testing results end here=============
==========================================================
==========================================================

I have checked many times but still cannot find the error in my program (It
is a simple the addition of the derivatives over all observations or the
addition of the derivatives over one observation per patient). Yet the
result is completely different. Can anyone tell me how I should program in
face of such the problem?

Many many thanks.

JT

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```