**[R] nlsur** -- Estimation of nonlinear systems of equations

__Syntax__

Interactive version

**nlsur** **(***depvar_1***=**<*sexp_1*>**)** **(***depvar_2***=**<*sexp_2*>**)** ... [*if*] [*in*] [*weight*]
[**,** *options*]

Programmed substitutable expression version

**nlsur** *sexp_prog* **:** *depvar_1* *depvar_2* ... [*varlist*] [*if*] [*in*] [*weight*]
[**,** *options*]

Function evaluator program version

**nlsur** *func_prog* **@** *depvar_1* *depvar_2* ... [*varlist*] [*if*] [*in*] [*weight*]
**,** __neq__**uations(***#***)** {__param__**eters(***namelist***)**|__nparam__**eters(***#***)**}
[*options*]

where

*depvar_j* is the dependent variable for equation *j*;
*<sexp>_j* is the substitutable expression for equation *j*;
*sexp_prog* is a substitutable expression program; and
*func_prog* is a function evaluator program.

*options* Description
-------------------------------------------------------------------------
Model
**fgnls** use two-step FGNLS estimator; the default
**ifgnls** use iterative FGNLS estimator
**nls** use NLS estimator
__va__**riables(***varlist***)** variables in model
__in__**itial(***initial_values***)** initial values for parameters
__neq__**uations(***#***)** number of equations in model (function
evaluator program version only)
* __param__**eters(***namelist***)** parameters in model (function evaluator
program version only)
* __nparam__**eters(***#***)** number of parameters in model (function
evaluator program version only)
*sexp_options* options for substitutable expression program
*func_options* options for function evaluator program

SE/Robust
**vce(***vcetype***)** *vcetype* may be **gnr**, __r__**obust**, __cl__**uster**
*clustvar*, __boot__**strap**, or __jack__**knife**

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
**title(***string***)** display *string* as title above the table of
parameter estimates
**title2(***string***)** display *string* as subtitle
*display_options* control columns and column formats and line
width

Optimization
*optimization_options* control the optimization process; seldom
used
**eps(***#***)** specify *#* for convergence criteria; default
is **eps(1e-5)**
__ifgnlsi__**terate(***#***)** set maximum number of FGNLS iterations
**ifgnlseps(***#***)** specify *#* for FGNLS convergence criterion;
default is **ifgnlseps(1e-10)**
__del__**ta(***#***)** specify stepsize *#* for computing
derivatives; default is **delta(4e-7)**
__nocons__**tants** no equations have constant terms
__h__**asconstants(***namelist***)** use *namelist* as constant terms

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------
* You must specify **parameters(***namelist***)**, **nparameters(***#***)**, or both.
**bootstrap**, **by**, **jackknife**, **rolling**, and **statsby** are allowed; see prefix.
Weights are not allowed with the **bootstrap** prefix.
**aweight**s are not allowed with the **jackknife** prefix.
**aweight**s, **fweight**s, **iweight**s, and **pweight**s are allowed; see weight.
**coeflegend** does not appear in the dialog box.
See **[R] nlsur postestimation** for features available after estimation.

__Menu__

**Statistics > Linear models and related > Multiple-equation models >**
**Nonlinear seemingly unrelated regression**

__Description__

**nlsur** fits a system of nonlinear equations by feasible generalized
nonlinear least squares (FGNLS). With the interactive version of the
command, you enter the system of equations directly on the command line
or in the dialog box by using substitutable expressions. If you have a
system that you use regularly, you can write a substitutable expression
program and use the second syntax to avoid having to reenter the system
every time. The function evaluator program version gives you the most
flexibility in exchange for increased complexity; with this version, your
program is given a vector of parameters and a variable list, and your
program computes the system of equations.

When you write a substitutable expression program or a function evaluator
program, the first five letters of the name must be **nlsur**. *sexp_prog* and
*func_prog* refer to the name of the program without the first five
letters. For example, if you wrote a function evaluator program named
**nlsurregss**, you would type **nlsur regss @ ...** to estimate the parameters.

__Options__

+-------+
----+ Model +------------------------------------------------------------

**fgnls** requests the two-step FGNLS estimator; this is the default.

**ifgnls** requests the iterative FGNLS estimator. For the nonlinear systems
estimator, this is equivalent to maximum likelihood estimation.

**nls** requests the nonlinear least-squares (NLS) estimator.

**variables(***varlist***)** specifies the variables in the system. **nlsur** ignores
observations for which any of these variables has missing values. If
you do not specify **variables()**, **nlsur** issues an error message if the
estimation sample contains any missing values.

**initial(***initial_values***)** specifies the initial values to begin the
estimation. You can specify a 1 x k matrix, where k is the total
number of parameters in the system, or you can specify a parameter
name, its initial value, another parameter name, its initial value,
and so on. For example, to initialize **alpha** to 1.23 and **delta** to
4.57, you would type

**nlsur ... , initial(alpha 1.23 delta 4.57) ...**

Initial values declared using this option override any that are
declared within substitutable expressions. If you specify a matrix,
the values must be in the same order in which the parameters are
declared in your model. **nlsur** ignores the row and column names of
the matrix.

**nequations(***#***)** specifies the number of equations in the system.

**parameters(***namelist***)** specifies the names of the parameters in the system.
The names of the parameters must adhere to the naming conventions of
Stata's variables; see **[U] 11.3 Naming conventions**. If you specify
both **parameters()** and **nparameters()**, the number of names in the
former must match the number specified in the latter.

**nparameters(***#***)** specifies the number of parameters in the system. If you
do not specify names with the **parameters()** option, **nlsur** names them
**b1**, **b2**, ..., **b***#*. If you specify both **parameters()** and **nparameters()**,
the number of names in the former must match the number specified in
the latter.

*sexp_options* refer to any options allowed by your *sexp_prog*.

*func_options* refer to any options allowed by your *func_prog*.

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are derived from asymptotic theory (**gnr**), that
are robust to some kinds of misspecification (**robust**), that allow for
intragroup correlation (**cluster** *clustvar*), and that use bootstrap or
jackknife methods (**bootstrap**, **jackknife**); see **[R] ***vce_option*.

**vce(gnr)**, the default, uses the conventionally derived variance
estimator for nonlinear models fit using Gauss-Newton regression.

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**title(***string***)** specifies an optional title that will be displayed just
above the table of parameter estimates.

**title2(***string***)** specifies an optional subtitle that will be displayed
between the title specified in **title()** and the table of parameter
estimates. If **title2()** is specified but **title()** is not, **title2()** has
the same effect as **title()**.

*display_options*: **noci**, __nopv__**alues**, **cformat(***%fmt***)**, **pformat(%***fmt***)**,
**sformat(%***fmt***)**, and **nolstretch**; see **[R] estimation options**.

+--------------+
----+ Optimization +-----------------------------------------------------

*optimization_options*: __it__**erate(***#***)**, [__no__]__lo__**g**, __tr__**ace**. **iterate()** specifies
the maximum number of iterations to use for NLS at each round of
FGNLS estimation. This option is different from **ifgnlsiterate()**,
which controls the maximum rounds of FGNLS estimation to use when the
**ifgnls** option is specified. **log**/**nolog** specifies whether to show the
iteration log, and **trace** specifies that the iteration log should
include the current parameter vector.

**eps(***#***)** specifies the convergence criterion for successive parameter
estimates and for the residual sum of squares. The default is
**eps(1e-5)** (.00001). **eps()** also specifies the convergence criterion
for successive parameter estimates between rounds of iterative FGNLS
estimation when **ifgnls** is specified.

**ifgnlsiterate(***#***)** specifies the maximum number of FGNLS iterations to
perform. The default is the number set using **set maxiter**, which is
16,000 by default. To use this option, you must also specify the
**ifgnls** option.

**ifgnlseps(***#***)** specifies the convergence criterion for successive estimates
of the error covariance matrix during iterative FGNLS estimation.
The default is **ifgnlseps(1e-10)**. To use this option, you must also
specify the **ifgnls** option.

**delta(***#***)** specifies the relative change in a parameter to be used in
computing the numeric derivatives. The derivative for parameter b_i
is computed as {f_i(x_i,b_1,b_2,...,b_i + d, b_[i+1],...) - f_i(x_i,
b_1,b_2,...,b_i, b_[i+1],...)}/d, where d is delta*(b_i + delta).
The default is **delta(4e-7)**.

**noconstants** indicates that none of the equations in the system includes
constant terms. This option is generally not needed, even if there
are no constant terms in the system; though in rare cases without
this option, **nlsur** may claim that there is one or more constant terms
even if there are none.

**hasconstants(***namelist***)** indicates the parameters that are to be treated as
constant terms in the system of equations. The number of elements of
*namelist* must equal the number of equations in the system. The *i*th
entry of *namelist* specifies the constant term in the *i*th equation.
If an equation does not include a constant term, specify a period (**.**)
instead of a parameter name. This option is seldom needed with the
interactive and programmed substitutable expression versions, because
in those cases **nlsur** can almost always find the constant terms
automatically.

The following option is available with **nlsur** but is not shown in the
dialog box:

**coeflegend**; see **[R] estimation options**.

__Remarks__

Remarks are presented under the following headings:

Substitutable expressions
Examples
Substitutable expression programs
Example
Function evaluator programs

__Substitutable expressions__

You use substitutable expressions with the interactive and programmed
substitutable expression versions of **nlsur** to define your system of
equations. Substitutable expressions are just like any other
mathematical expression in Stata, except that the parameters of your
model are bound in braces.

You specify a substitutable expression for each equation in your system,
and you must follow three rules:

1. Parameters of the model are bound in braces: **{b0}**, **{param}**, etc.

2. Initial values are given by including an equal sign and the
initial value inside the braces: **{b1=1.267}**,** {gamma=3}**, etc. If
you do not specify an initial value, that parameter is
initialized to zero. The **initial()** option overrides initial
values provided in substitutable expressions.

3. Linear combinations can be included using the notation
**{***eqname***:***varlist***}**:

**{xb:mpg price weight}** is equivalent to
**{xb_mpg}*mpg +** **{xb_price}*price +** **{xb_weight}*weight**

__Examples__

1. To fit the system of equations

y1 = a1 + b1*x^g1

y2 = a2 + b2*x^g2

by iterative FGNLS, where a1, a2, b1, b2, g1, and g2 are
parameters and 1 is a reasonable starting value for both g1 and
g2, you would type

**. nlsur (y1 = {a1} + {b1}*x^{g1=1}) (y2 = {a2} +**
**{b2}*x^{g2=1}), ifgnls**

2. **nlsur** makes imposing cross-equation parameter restrictions easy.
Say that you want to fit a pair of exponential growth equations
with the restriction that the constant terms in the two equations
are equal,

y1 = a + b1*b2^x

y2 = a + c1*c2^x

where a, b1, b2, c1, and c2 are the parameters. To fit this
model, you would type

**. nlsur (y1 = {a} + {b1}*{b2}^x) (y2 = {a} + {c1}*{c2}^x)**

__Substitutable expression programs -- ____sexp_prog____s__

If you intend to fit the same nonlinear system multiple times, then you
can write a substitutable expression program and avoid having to reenter
the equations every time. The first five letters of the program name
must be **nlsur**, and the program is to accept a *varlist*, an **if** *exp*, and,
optionally, weights. The program must return in **r(n_eq)** the number of
equations in the system and in **r(eq_1)** through **r(eq_***m***)** the *m* equations in
the system. You can add a title to the output by storing a string in
**r(title)**.

The outline of an **nlsur***sexp_prog* program is

**program nlsur***sexp_prog***, rclass**
**version 15.1**
**syntax ***varlist* **[aw fw iw]** **[if]**
*(obtain initial parameters if desired)*
**return scalar n_eq = ***<neqn>*
**return local eq_1 "***<sexp>_1***"**
**...**
**return local eq_***m*** "***<sexp>_m***"**
**return local title "***title***"**
**end**

__Example__

Returning to the model

y1 = a1 + b1*x^g1

y2 = a2 + b2*x^g2

one way to obtain initial values is to let g1 = g2 = 1 and then fit a
regression of y1 on x to obtain initial values for a1 and b1 and,
similarly, fit a regression of y2 on x to obtain initial values for
a2 and b2. The substitutable expression program is

**program nlsurmyreg, rclass**

**version 15.1**
**syntax varlist(min=3 max=3) [aw fw iw] [if]**
**local y1: word 1 of `varlist'**
**local y2: word 2 of `varlist'**
**local x : word 3 of `varlist'**

**// Obtain starting values assuming g1=g2=1**
**regress `y1' `x' [`weight'`exp'] `if'**
**local a1 = _b[_cons]**
**local b1 = _b[`x']**
**regress `y2' `x' [`weight'`exp'] `if'**
**local a2 = _b[_cons]**
**local b2 = _b[`x']**

**return scalar n_eq = 2**
**return local eq_1 "`y1' = {a1=`a1'} + {b1=`b1'}*`x'^{g1=1}"**
**return local eq_2 "`y2' = {a2=`a2'} + {b2=`b2'}*`x'^{g2=1}"**

**end**

To fit your model, you type

**. nlsur myreg: y1 y2 x**

(There is a space between **nlsur** and **myreg**, even though the program is
named **nlsurmyreg**.)

The substitutable expression does not need to account for weights or the
**if** *exp*, though we did need to use them in obtaining initial values.

__Function evaluator programs -- ____func_prog____s__

If your system of equations is particularly complex, then you may find
that writing a substitutable expression for each equation is impractical.
In those cases, you can write a function evaluator program instead.
Whenever **nlsur** needs to evaluate the system of equations, it calls your
program with a vector of parameters. Your program then fills in the
dependent variables with the corresponding function values.

Function evaluator programs must accept a *varlist*, an **if** *exp*, and an
option named **at()** that accepts the name of a matrix. They may optionally
accept weights as well.

To illustrate the mechanics of a function evaluator program, we focus on
a straightforward example. We want to fit the model

y1 = b0 + b1*x1 + b2*x2

y2 = c0 + c1*x2 + c2*x3

(Because this system is linear, we could in fact just use **sureg**.) Our
function evaluator program is

**program nlsursur**

**version 15.1**
**syntax varlist(min=5 max=5) [if], at(name)**
**local y1: word 1 of `varlist'**
**local y2: word 2 of `varlist'**
**local x1: word 3 of `varlist'**
**local x2: word 4 of `varlist'**
**local x3: word 5 of `varlist'**

**// Retrieve parameters out of `at'**
**tempname b0 b1 b2 c0 c1 c2**
**scalar `b0' = `at'[1,1]**
**scalar `b1' = `at'[1,2]**
**scalar `b2' = `at'[1,3]**
**scalar `c0' = `at'[1,4]**
**scalar `c1' = `at'[1,5]**
**scalar `c2' = `at'[1,6]**

**// Fill in dependent variables**
**quietly replace `y1' = `b0' + `b1'*`x1' + `b2'*`x2' `if'**
**quietly replace `y2' = `c0' + `c1'*`x2' + `c2'*`x3' `if'**

**end**

Our model has a total of five variables, so we made the **syntax** statement
accept five variables. **nlsur** requires that our program accept an **if**
clause and an option named **at()** by which **nlsur** passes a matrix holding
the parameter values. The order in which you store the elements of the
*varlist* will determine the order in which you specify the variables when
calling **nlsur**.

Because we do not plan to use weighted estimation with this model, we did
not make our **syntax** statement accept weights. If you do intend to
perform weighted estimation, the **syntax** statement must accept them. When
you replace the dependent variables with the values of the functions, you
do not need to do anything with the weights, though if you use estimation
or descriptive statistical commands when evaluating your functions, the
weight expression must be passed onto those commands.

Our model has six parameters, so our program will receive a 1 x 6 row
vector named `at'. We extract the six parameters from that vector and
store them in temporary scalars. We could have referred directly to the
elements of the `at' vector when evaluating the functions of our model,
but storing the parameters in appropriately named scalars makes the
process more transparent. The final part of our program computes the two
equations.

There are several different ways we can call **nlsur** to fit our model.

1. This method uses the shortest syntax and initializes all the
parameters to zero, which is the default:

**. nlsur sur @ y1 y2 x1 x2 x3, nparameters(6) nequations(2)**

Because we did not specify names for the parameters, they will be
labeled b1 through b6 in the output.

2. Here we give names to the parameters and initialize the two
constant terms to be 10:

**. nlsur sur @ y1 y2 x1 x2 x3, parameters(b0 b1 b2 c0 c1 c2)**
**initial(b0 10 c0 10) nequations(2)**

3. When you use a function evaluator program, **nlsur** does not attempt
to identify a constant term in each equation, so the R-squared
statistics reported in the header of the output are uncentered.
You use the **hasconstants()** option to indicate which parameter in
each equation is a constant:

**. nlsur sur @ y1 y2 x1 x2 x3, parameters(b0 b1 b2 c0 c2 c2)**
**initial(b0 10 c0 10) nequations(2) hasconstant(b0 c0)**

Now **nlsur** takes b0 to be the constant in the first equation and
c0 in the second, and centered R-squared statistics will be
reported.

__Example__

Setup
**. webuse petridish**

Model growth of two bacteria populations over time; allow for correlation
between error terms
**. nlsur (p1 = {b1}*{b2}^t) (p2 = {g1}*{g2}^t)**

__Stored results__

**nlsur** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(k)** number of parameters
**e(k_***#***)** number of parameters for equation *#*
**e(k_eq)** number of equation names in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(n_eq)** number of equations
**e(mss_***#***)** model sum of squares for equation *#*
**e(rss_***#***)** RSS for equation *#*
**e(rmse_***#***)** root mean squared error for equation *#*
**e(r2_***#***)** R squared for equation *#*
**e(ll)** Gaussian log likelihood (**iflgs** version only)
**e(N_clust)** number of clusters
**e(rank)** rank of **e(V)**
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **nlsur**
**e(cmdline)** command as typed
**e(method)** **fgnls**, **ifgnls**, or **nls**
**e(depvar)** names of dependent variables
**e(depvar_***#***)** dependent variable for equation *#*
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(title_2)** secondary title in estimation output
**e(clustvar)** name of cluster variable
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(type)** **1** = interactively entered expression
**2** = substitutable expression program
**3** = function evaluator program
**e(sexpprog)** substitutable expression program
**e(sexp_***#***)** substitutable expression for equation *#*
**e(params)** names of all parameters
**e(params_***#***)** parameters in equation *#*
**e(funcprog)** function evaluator program
**e(rhs)** contents of **variables()**
**e(constants)** identifies constant terms
**e(properties)** **b V**
**e(predict)** program used to implement **predict**

Matrices
**e(b)** coefficient vector
**e(init)** initial values vector
**e(Sigma)** error covariance matrix (Sigma hat matrix)
**e(V)** variance-covariance matrix of the estimators

Functions
**e(sample)** marks estimation sample