Home  /  Resources & support  /  FAQs  /  Use of ml for nonlinear model
Note:This FAQ is for users of Stata 8.

How do I estimate a nonlinear model using ml?

Title   Use of ml for nonlinear model
Authors Weihua Guan, Gustavo Sanchez, StataCorp

Consider the model

	y = f(x) + e

where y is the outcome, f(x) is a nonlinear form of covariate x, and e is the random error. We can change the equation into

	e = y - f(x)

Assuming that the errors are distributed as N(0,sigma), we are allowed to write the density function, which is also the likelihood function.

			1		   e^2
	f(e) = ------------------- exp(- ---------)
		sqrt(2*_pi)*sigma	2*sigma^2

Then, the log-likelihood function will be

	lnL = -0.5*ln(2*_pi) - ln(sigma) - 0.5*e^2/sigma^2
	    = -0.5*ln(2*_pi) - ln(sigma) - 0.5*(y-f(x))^2/sigma^2

We now need to parse f(x) into several linear combinations of x covariates and other parameters.

Here we use the first example in [R] nl to show how to write the ml program for the same model. The nl program is written in the manual as

 program define nlnexpgr
         if "`1'" == "?" {                  /* if query call ...             */
                 global S_1 "B0 B1"         /*    declare parameters         */
                 global B0=1                /*    and initialize them        */
                 global B1=.1
                 exit
         }
         replace `1'=$B0*(1-exp(-$B1*`2'))   /* otherwise, calculate function */
 end

Using the auto dataset we obtain the following results:

 . sysuse auto, clear

 (1978 Automobile Data)

 . nl nexpgr rep78 headroom
 (output omitted)

 (nexpgr)
 ------------------------------------------------------------------------------
        rep78 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
           B0 |   3.435331   .1604479    21.41   0.000     3.115075    3.755586
           B1 |   1.947203    1.76742     1.10   0.275    -1.580583    5.474988
 ------------------------------------------------------------------------------
  (SE's, P values, CI's, and correlations are asymptotic approximations)

Now let’s try our ml program. Note that

    f(x) = B0*(1-exp(-B1*x)

We need to estimate B0 and B1. Considering the formula, we may treat B0 as a parameter and B1*x as the linear combination xb, which has no constant term. We also need to estimate an additional parameter “sigma”, the standard deviation of the normal distribution.

 program define mlnexpgr
  version 7
  args lnf B1x B0 sigma
  tempvar res
  qui gen double `res' = $ML_y1 - `B0'*(1-exp(-`B1x'))
  qui replace `lnf' = -0.5*ln(2*_pi)-ln(`sigma')-0.5*`res'^2/`sigma'^2
 end
 . ml model lf mlnexpgr (B1: rep78 = headroom, nocons) (B0:) (sigma:)
 
 . set seed 1
 
 . ml search
 (output omitted)
 
 . ml max
 (output omitted)


                                                   Number of obs   =         69
                                                   Wald chi2(1)    =       2.85
 Log likelihood =  -96.543274                       Prob > chi2     =     0.0913
 ------------------------------------------------------------------------------
        rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
 B1           |
     headroom |   1.947196   1.153046     1.69   0.091    -.3127326    4.207125
 -------------+----------------------------------------------------------------
 B0           |
        _cons |   3.435331    .137532    24.98   0.000     3.165773    3.704889
 -------------+----------------------------------------------------------------
 sigma        |
        _cons |   .9804333     .08346    11.75   0.000     .8168547    1.144012
 ------------------------------------------------------------------------------

The estimates we obtained from ml are very close to the those from the nl program. The estimated standard errors are a little different, but we know that they are from different algorithms.

We could extend the ml program to estimate robust variances with the robust option or with the cluster() option.

 . ml model lf mlnexpgr (B1:rep78=headroom,nocons) (B0:) (sigma:), robust 
 
 . set seed 1
 
 . ml max
  (output omitted)

                                                   Number of obs   =         69
                                                   Wald chi2(1)    =       3.97
 Log pseudolikelihood =  -96.543274                Prob > chi2     =     0.0462
                                   (Std. Err. adjusted for 18 clusters in turn)
 ------------------------------------------------------------------------------
              |               Robust
        rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
 B1           |
     headroom |   1.947196   .9767398     1.99   0.046     .0328213    3.861571
 -------------+----------------------------------------------------------------
 B0           |
        _cons |   3.435331   .1145516    29.99   0.000     3.210814    3.659848
 -------------+----------------------------------------------------------------
 sigma        |
        _cons |   .9804333   .0743649    13.18   0.000     .8346808    1.126186
 ------------------------------------------------------------------------------

The robust estimation gives the same estimated coefficients, but it gives adjusted standard errors for the three parameters in this model. We may also apply this program for clustered data:

 . ml model lf mlnexpgr (B1:rep78=headroom,nocons) (B0:) (sigma:), cluster(turn)