**[R] gmm** -- Generalized method of moments estimation

__Syntax__

Interactive version

**gmm** **(**[*reqname1***:**]*rexp_1***)** **(**[*reqname2***:**]*rexp_2***)** ... [*if*] [*in*] [*weight*]
[**,** *options*]

Moment-evaluator program version

**gmm** *moment_prog* [*if*] [*in*] [*weight*]**,**
{__eq__**uations(***namelist***)**|__neq__**uations(***#***)**}
{__param__**eters(***namelist***)**|__nparam__**eters(***#***)**} [*options*]
[*program_options*]

*reqname_j* is the *j*th residual equation name,
*rexp_j* is the substitutable expression for the *j*th residual equation, and
*moment_prog* is a moment-evaluator program.

*options* Description
-------------------------------------------------------------------------
Model
__deriv__**ative(**[*reqname*|*#*]**/***name* **=** *dexp_jk***)**
specify derivative of *reqname* (or *#*) with
respect to parameter *name*; can be
specified more than once (interactive
version only)
* __two__**step** use two-step GMM estimator; the default
* __one__**step** use one-step GMM estimator
* __i__**gmm** use iterative GMM estimator
__va__**riables(***varlist***)** specify variables in model
**nocommonesample** do not restrict estimation sample to be the
same for all equations

Instruments
__inst__**ruments(**[*reqlist***:**]*varlist*[**,** __nocons__**tant**]**)**
specify instruments; can be specified more
than once
__xtinst__**ruments(**[*reqlist***:**]*varlist***, lags(***#_1***/***#_2***))**
specify panel-style instruments; can be
specified more than once

Weight matrix
__wmat__**rix(***wmtype*[**, **__indep__**endent**]**)**
specify weight matrix; *wmtype* may be
__r__**obust**, __cl__**uster** *clustvar*, **hac** *kernel*
[*lags*], or __un__**adjusted**
__c__**enter** center moments in weight-matrix computation
__winit__**ial(***iwtype*[**, **__indep__**endent**]**)**
specify initial weight matrix; *iwtype* may
be __un__**adjusted**, __i__**dentity**, **xt** *xtspec*, or
the name of a Stata matrix

SE/Robust
**vce(***vcetype*[**, **__indep__**endent**]**)** *vcetype* may be __r__**obust**, __cl__**uster** *clustvar*,
__boot__**strap**, __jack__**knife**, **hac** *kernel* *lags*, or
__un__**adjusted**
__quickd__**erivatives** use alternative method of computing
numerical derivatives for VCE

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
**title(***string***)** display *string* as title above the table of
parameter estimates
**title2(***string***)** display *string* as subtitle
*display_options* control columns and column formats, row
spacing, line width, display of omitted
variables and base and empty cells, and
factor-variable labeling

Optimization
**from(***initial_values***)** specify initial values for parameters
# __igmmit__**erate(***#***)** specify maximum number of iterations for
iterated GMM estimator
# **igmmeps(***#***)** specify # for iterated GMM parameter
convergence criterion; default is
**igmmeps(1e-6)**
# **igmmweps(***#***)** specify # for iterated GMM weight-matrix
convergence criterion; default is
**igmmweps(1e-6)**
*optimization_options* control the optimization process; seldom
used

__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------
* You can specify at most one of these options.
# These options may be specified only when **igmm** is specified.

*program_options* Description
-------------------------------------------------------------------------
Model
*evaluator_options* additional options to be passed to the
moment-evaluator program
+ __hasd__**erivatives** moment-evaluator program can calculate
parameter-level derivatives
+ __haslfd__**erivatives** moment-evaluator program can calculate
linear-form derivatives
* __eq__**uations(***namelist***)** specify residual equation names
* __neq__**uations(***#***)** specify number of residual equations
# __param__**eters(***namelist***)** specify parameter names
# __nparam__**eters(***#***)** specify number of parameters
-------------------------------------------------------------------------
+ You may not specify both **hasderivatives** and **haslfderivatives**.
* You must specify **equations(***namelist***)** or **nequations(***#***)**; you may specify
both.
# You must specify **parameters(***namelist***)** or **nparameters(***#***)**; you may
specify both.

*rexp_j* and *dexp_jk* may contain factor variables and time-series
operators; see fvvarlist and tsvarlist.
**bootstrap**, **by**, **jackknife**, **rolling**, and **statsby** are allowed; see prefix.
Weights are not allowed with the **bootstrap** prefix.
**aweight**s are not allowed with the **jackknife** prefix.
**aweight**s, **fweight**s, **iweight**s, and **pweight**s are allowed; see weight.
**coeflegend** does not appear in the dialog box.
See **[R] gmm postestimation** for features available after estimation.

*rexp_j* and *dexp_jk* are substitutable expressions, that is, Stata
expressions that also contain parameters to be estimated. The parameters
are enclosed in curly braces and must satisfy the naming requirements for
variables; **{beta}** is an example of a parameter. The notation
**{***lcname*:*varlist***}** is allowed for linear combinations of multiple
covariates and their parameters. For example, **{xb:** **mpg** **price** **turn** **_cons}**
defines a linear combination of the variables **mpg**, **price**, **turn**, and **_cons**
(the constant term). See *Substitutable expressions* under *Remarks and*
*examples* of **[R] gmm**.
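
As a minimal sketch of the two parameter notations, the commands below write the same residual equation twice. The first uses one named parameter per term, and the second uses a linear combination. The auto dataset, the variables, and the instrument list are assumptions chosen for illustration only.

```stata
sysuse auto, clear

* One named parameter per covariate, plus an explicit constant {b0}
gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}), ///
    instruments(gear_ratio length headroom)

* Same residual written as a linear combination named xb;
* _cons in the varlist adds the constant term
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom)
```

Both commands should fit the same model. Only the parameter labels in the output differ.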

__Menu__

**Statistics > Endogenous covariates > Generalized method of moments**
**estimation**

__Description__

**gmm** performs generalized method of moments (GMM) estimation. With the
interactive version of the command, you enter the residual equation for
each moment condition directly into the dialog box or on the command line
by using substitutable expressions. The moment-evaluator program version
gives you greater flexibility in exchange for increased complexity; with
this version, you write a program in an ado-file that calculates the
moments based on a vector of parameters passed to it.

**gmm** can fit both single- and multiple-equation models. It allows moment
conditions of the form E{**z**_i u_i(**b**)} = **0**, where **z**_i is a vector of
instruments, and u_i(**b**) is an error term, as well as more general moment
conditions of the form E{**h**_i(**z**_i;**b**)} = **0**. **gmm** works with
cross-sectional, time-series, and longitudinal (panel) data.

__Options__

+-------+
----+ Model +------------------------------------------------------------

**derivative(**[*reqname*|*#*]**/***name*** =** *dexp_jk***)** specifies the derivative of
residual equation *reqname* or *#* with respect to parameter *name*. If
*reqname* or *#* is not specified, **gmm** assumes that the derivative
applies to the first residual equation.

For a moment condition of the form E{**z**_ji u_ji(**b**)} = **0**,
**derivative(***j***/***b_k* **=** *dexp_jk***)** is to contain a substitutable expression
for *du_ji / db_k*. If you specified **m** as the *reqname*, then for a
moment condition of the form E{**z**_mi u_mi(**b**)} = 0, you can specify
**derivative(m/***b_k* **=** *dexp_mk***)**, where *m* is the index of **m**.

*dexp_jk* uses the same substitutable expression syntax as is used to
specify residual equations. If you declare a linear combination in a
residual equation, you provide the derivative for the linear
combination; **gmm** then applies the chain rule for you. See example 4
below.

If you do not specify the **derivative()** option, **gmm** calculates
derivatives numerically. You must either specify no derivatives or
specify a derivative for each of the *k* parameters that appears in
each of the *j* residual equations unless the derivative is identically
zero. You cannot specify some analytic derivatives and have **gmm**
compute the rest numerically.
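
The sketch below shows **derivative()** with a linear combination, assuming an exponential-mean model fit to the auto dataset (the model and variables are illustrative assumptions). The residual is u = mpg - exp(xb), so du/d(xb) = -exp(xb). Only that derivative is supplied, and **gmm** applies the chain rule to reach the individual coefficients.

```stata
sysuse auto, clear

* No equation name before the slash, so the derivative applies to
* the first (and only) residual equation.
* {xb:} refers to the whole linear combination declared in the residual.
gmm (mpg - exp({xb: gear_ratio turn _cons})), ///
    instruments(gear_ratio length headroom)  ///
    derivative(/xb = -1*exp({xb:}))
```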

**twostep**, **onestep**, and **igmm** specify which estimator is to be used. You
can specify at most one of these options. **twostep** is the default.

**twostep** requests the two-step GMM estimator. **gmm** obtains parameter
estimates based on the initial weight matrix, computes a new weight
matrix based on those estimates, and then reestimates the parameters
based on that weight matrix.

**onestep** requests the one-step GMM estimator. The parameters are
estimated based on an initial weight matrix, and no updating of the
weight matrix is performed except when calculating the appropriate
variance-covariance (VCE) matrix.

**igmm** requests the iterative GMM estimator. **gmm** obtains parameter
estimates based on the initial weight matrix, computes a new weight
matrix based on those estimates, reestimates the parameters based on
that weight matrix, computes a new weight matrix, and so on, to
convergence. Convergence is declared when the relative change in the
parameter vector is less than **igmmeps()** and the relative change in
the weight matrix is less than **igmmweps()**; iteration also stops
once **igmmiterate()** iterations have been completed. Hall (2005,
sec. 2.4 and 3.6) mentions that there may be gains to finite-sample
efficiency from using the iterative estimator.
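
As a rough illustration of the three estimators, the sketch below fits one model three ways. The auto dataset, the model, and the tolerance values passed to the **igmm** options are assumptions chosen for illustration.

```stata
sysuse auto, clear

* Two-step estimator (the default, spelled out here)
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) twostep

* One-step estimator: the initial weight matrix is used throughout
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) onestep

* Iterative estimator with tighter tolerances and an iteration cap
gmm (mpg - {xb: gear_ratio turn _cons}),    ///
    instruments(gear_ratio length headroom) ///
    igmm igmmiterate(50) igmmeps(1e-8) igmmweps(1e-8)
```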

**variables(***varlist***)** specifies the variables in the model. **gmm** ignores
observations for which any of these variables has a missing value. If
you do not specify **variables()**, then **gmm** assumes all the observations
are valid and issues an error message if any residual equations
evaluate to missing for any observations at the initial value of the
parameter vector.

**nocommonesample** requests that **gmm** not restrict the estimation sample to
be the same for all equations. By default, **gmm** will restrict the
estimation sample to observations that are available for all
equations in the model, mirroring the behavior of other
multiple-equation estimators such as **nlsur**, **sureg**, or **reg3**. For
certain models, however, different equations can have different
numbers of observations. For these models, you should specify
**nocommonesample**. See the dynamic panel-data examples for one type of
model where this option is needed. You cannot specify weights if you
specify **nocommonesample**.

+-------------+
----+ Instruments +------------------------------------------------------

**instruments(**[*reqlist***:**] *varlist*[**, noconstant**]**)** specifies a list of
instrumental variables to be used. If you specify a single residual
equation, then you do not need to specify the equations to which the
instruments apply; you can omit the *reqlist* and simply specify
**instruments(***varlist***)**. By default, a constant term is included in
*varlist*; to omit the constant term, use the **noconstant** suboption:
**instruments(***varlist***, noconstant)**.

If your model has multiple moment conditions of the form

         { **z**_1i u_1i(**b**) }
        E{   ............   } = **0**
         { **z**_qi u_qi(**b**) }

then you can specify multiple corresponding residual equations. Then
specify the *reqname* or an *reqlist* to indicate the residual equations
for which the list of variables is to be used as instruments if you
do not want that list applied to all the residual equations. For
example, you might type

**gmm (main:** *rexp_1***) (***rexp_2***)** **(***rexp_3***), instruments(z1 z2)**
**instruments(2: z3) instruments(main 3: z4)**

Variables **z1** and **z2** will be used as instruments for all three
equations, **z3** will be used as an instrument for the second equation,
and **z4** will be used as an instrument for the first and third
equations. Notice that we chose to supply a name for the first
residual equation but not the second two, identifying each by its
equation number.

*varlist* may contain factor variables and time-series operators; see
fvvarlist and tsvarlist, respectively.
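
The following sketch targets instruments to named residual equations. The two-equation model, the equation names **eq1** and **eq2**, and the instrument choices are hypothetical and serve only to show the syntax on the auto dataset.

```stata
sysuse auto, clear

* gear_ratio instruments both equations;
* length and headroom apply only to eq1; weight and length only to eq2
gmm (eq1: mpg   - {xb: gear_ratio turn _cons}) ///
    (eq2: price - {zb: weight length _cons}),  ///
    instruments(gear_ratio)                    ///
    instruments(eq1: length headroom)          ///
    instruments(eq2: weight length)            ///
    winitial(unadjusted, independent)
```

Each **instruments()** option also adds a constant by default, so each equation here has four instruments for three parameters.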

**xtinstruments(**[*reqlist***:**] *varlist***, lags(***#_1***/***#_2***))** is for use with
panel-data models in which the set of available instruments depends
on the time period. As with **instruments()**, you can prefix the list
of variables with residual equation names or numbers to target
instruments to specific equations. Unlike with **instruments()**, a
constant term is not included in *varlist*. You must **xtset** your data
before using this option; see **xtset**.

If you specify

**gmm** ...**, xtinstruments(x, lags(1/.))** ...

then for panel *i* and period *t*, **gmm** uses *x_(i,t-1)*, *x_(i,t-2)*, ...,
*x_i1* as instruments. More generally, specifying **xtinstruments(x,**
**lags(***#_1***/***#_2***))** uses *x_(i,t-#_1)*, ..., *x_(i,t-#_2)* as instruments;
setting *#_2* = **.** requests all available lags. *#_1* and *#_2* must be
zero or positive integers.

**gmm** automatically excludes observations for which no valid
instruments are available. It does, however, include observations for
which only a subset of the lags is available. For example, if you
request that lags one through three be used, then **gmm** will include
the observations for the second and third time periods even though
fewer than three lags are available as instruments.
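
As a hedged sketch of panel-style instruments, the commands below fit a first-differenced dynamic model. They assume the abdata dataset obtained with **webuse** (which needs internet access), with panel variable **id** and time variable **year**. The parameter names **rho**, **bw**, and **bk** are made up for illustration.

```stata
webuse abdata, clear
xtset id year

* Residual in first differences; lags 2 and deeper of n serve as
* panel-style instruments, and the differenced covariates instrument themselves
gmm (D.n - {rho}*LD.n - {bw}*D.w - {bk}*D.k), ///
    xtinstruments(n, lags(2/.))               ///
    instruments(D.w D.k, noconstant)          ///
    winitial(xt D) onestep
```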

+---------------+
----+ Weight matrix +----------------------------------------------------

**wmatrix(***wmtype*[**,** **independent**]**)** specifies the type of weight matrix to be
used in conjunction with the two-step and iterated GMM estimators.

Specifying **wmatrix(robust)** requests a weight matrix that is
appropriate when the errors are independent but not necessarily
identically distributed. **wmatrix(robust)** is the default.

Specifying **wmatrix(cluster** *clustvar***)** requests a weight matrix that
accounts for arbitrary correlation among observations within clusters
identified by *clustvar*.

Specifying **wmatrix(hac** *kernel* *#***)** requests a heteroskedasticity- and
autocorrelation-consistent (HAC) weight matrix using the specified
kernel (see below) with *#* lags. The bandwidth of a kernel is equal
to the number of lags plus one.

Specifying **wmatrix(hac** *kernel* **opt** [*#*]**)** requests an HAC weight matrix
using the specified kernel, and the lag order is selected using Newey
and West's (1994) optimal lag-selection algorithm. *#* is an optional
tuning parameter that affects the lag order selected; see the
discussion in **[R] gmm**.

Specifying **wmatrix(hac** *kernel***)** requests an HAC weight matrix using
the specified kernel and *N*-2 lags, where *N* is the sample size.

There are three kernels available for HAC weight matrices, and you
can request each one by using the name used by statisticians or the
name perhaps more familiar to economists:

__ba__**rtlett** or __nw__**est** requests the Bartlett (Newey-West) kernel;

__pa__**rzen** or __ga__**llant** requests the Parzen (Gallant) kernel; and

__qu__**adraticspectral** or __an__**drews** requests the quadratic spectral
(Andrews) kernel.

Specifying **wmatrix(unadjusted)** requests a weight matrix that is
suitable when the errors are homoskedastic. In some applications,
the GMM estimator so constructed is known as the (nonlinear)
two-stage least-squares (2SLS) estimator.

Including the **independent** suboption creates a weight matrix that
assumes moment conditions are independent. This suboption is often
used to replicate other models that can be motivated outside the GMM
framework, such as the estimation of a system of equations by
system-wide 2SLS. This suboption has no effect if only one residual
equation is specified.

**wmatrix()** has no effect if **onestep** is also specified.
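
A brief sketch of two *wmtype* choices follows, with the auto dataset and the model as illustrative assumptions. HAC weight matrices additionally require time-series data declared with **tsset** and are not shown here.

```stata
sysuse auto, clear

* Heteroskedasticity-robust weight matrix (the default, spelled out)
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) wmatrix(robust)

* Weight matrix suited to homoskedastic errors
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) wmatrix(unadjusted)
```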

**center** requests that the sample moments be centered (demeaned) when
computing GMM weight matrices. By default, centering is not done.

**winitial(***iwtype*[**,** **independent**]**)** specifies the weight matrix to use to
obtain the first-step parameter estimates.

Specifying **winitial(unadjusted)** requests a weight matrix that assumes
the moment conditions are independent and identically distributed.
This matrix is of the form (**Z**'**Z**)^-1, where **Z** represents all the
instruments specified in the **instruments()** option. To avoid a
singular weight matrix, you should specify at least q-1 moment
conditions of the form E{**z**_hi u_hi(**b**)} = **0**, where q is the number of
moment conditions, or you should specify the **independent** suboption.

Including the **independent** suboption creates a weight matrix that
assumes moment conditions are independent. Elements of the weight
matrix corresponding to covariances between two moment conditions are
set equal to zero. This suboption has no effect if only one residual
equation is specified.

**winitial(unadjusted)** is the default.

**winitial(identity)** requests that the identity matrix be used.

**winitial(xt** *xtspec***)** is for use with dynamic panel-data models in
which one of the residual equations is specified in first-differences
form. *xtspec* is a string consisting of the letters "L" and "D", the
length of which is equal to the number of residual equations in the
model. You specify "L" for a residual equation if that residual
equation is written in levels, and you specify "D" for a residual
equation if it is written in first differences; *xtspec* is not case
sensitive. When you specify this option, you can specify at most one
residual equation in levels and one residual equation in first
differences. See the dynamic panel-data examples below.

**winitial(***matname***)** requests that Stata matrix *matname* be used. You
cannot specify the **independent** suboption if you specify
**winitial(***matname***)**.
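
The sketch below supplies the initial weight matrix in two ways. The matrix name **W0** is hypothetical, and the dataset and model are illustrative assumptions. With three instruments plus the default constant there are four moment conditions, so a user-supplied matrix must be 4 x 4.

```stata
sysuse auto, clear

* Identity matrix for the first-step estimates
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) winitial(identity)

* The same request made through a user-defined Stata matrix
matrix W0 = I(4)
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) winitial(W0)
```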

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype* [**, independent**]**)** specifies the type of standard error
reported, which includes types that are robust to some kinds of
misspecification (**robust**), that allow for intragroup correlation
(**cluster** *clustvar*), and that use bootstrap or jackknife methods
(**bootstrap**, **jackknife**); see **[R] ***vce_option*.

**vce(unadjusted)** specifies that an unadjusted (nonrobust) VCE matrix
be used; this, along with the **twostep** option, results in the "optimal
two-step GMM" estimates often discussed in textbooks.

The default *vcetype* is based on the *wmtype* specified in the **wmatrix()**
option. If **wmatrix()** is specified but **vce()** is not, then *vcetype* is
set equal to *wmtype*. To override this behavior and obtain an
unadjusted (nonrobust) VCE matrix, specify **vce(unadjusted)**.

Specifying **vce(bootstrap)** or **vce(jackknife)** results in standard
errors based on the bootstrap or jackknife, respectively. See **[R]**
*vce_option*, **[R] bootstrap**, and **[R] jackknife** for more information on
these VCEs.

The syntax for *vcetype*s other than **bootstrap** and **jackknife** is
identical to that for **wmatrix()**.
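
As an illustration of how **vce()** interacts with **wmatrix()**, the sketch below first lets the VCE follow the weight-matrix type and then overrides it. The dataset and model are assumptions for illustration.

```stata
sysuse auto, clear

* vce() omitted: the VCE type follows wmatrix(), so it is robust here
gmm (mpg - {xb: gear_ratio turn _cons}), ///
    instruments(gear_ratio length headroom) twostep wmatrix(robust)

* Override: robust weight matrix with an unadjusted (nonrobust) VCE
gmm (mpg - {xb: gear_ratio turn _cons}),    ///
    instruments(gear_ratio length headroom) ///
    twostep wmatrix(robust) vce(unadjusted)
```

The point estimates should agree between the two runs. Only the reported standard errors change.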

**quickderivatives** requests that an alternative method be used to compute
the numerical derivatives for the VCE. This option has no effect if
you specify the **derivative()**, **hasderivatives**, or **haslfderivatives**
option.

The VCE depends on a matrix of partial derivatives that **gmm** must
compute numerically unless you supply analytic derivatives. This
Jacobian matrix will be especially large if your model has many
instruments, residual equations, or parameters.

By default, **gmm** computes each element of the Jacobian matrix
individually, searching for an optimal step size each time. Although
this procedure results in accurate derivatives, it is computationally
taxing: **gmm** may have to evaluate the moments of your model five or
more times for each element of the Jacobian matrix.

When you specify the **quickderivatives** option, **gmm** computes all
derivatives corresponding to a parameter at once, using a fixed step
size proportional to the parameter's value. This method requires
just two evaluations of the model's moments to compute an entire
column of the Jacobian matrix and therefore has the most impact when
you specify many instruments or residual equations.

Most of the time, the two methods produce virtually identical
results, but the **quickderivatives** method may fail if a residual
equation is highly nonlinear or if instruments differ by orders of
magnitude. In the rare case where you specify **quickderivatives** and
obtain suspiciously large or small standard errors, try refitting
your model without this option.
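
One way to check whether **quickderivatives** is safe for a given model is to fit the model both ways and compare standard errors. The sketch below does so, assuming the auto dataset and a nonlinear model. The stored-estimate names **full** and **quick** are arbitrary.

```stata
sysuse auto, clear

* Default numerical derivatives for the VCE
gmm (mpg - exp({xb: gear_ratio turn _cons})), ///
    instruments(gear_ratio length headroom)
estimates store full

* Faster fixed-step derivatives
gmm (mpg - exp({xb: gear_ratio turn _cons})), ///
    instruments(gear_ratio length headroom) quickderivatives
estimates store quick

* Coefficients and standard errors side by side
estimates table full quick, se
```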

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**title(***string***)** specifies an optional title that will be displayed just
above the table of parameter estimates.

**title2(***string***)** specifies an optional subtitle that will be displayed
between the title specified in **title()** and the table of parameter
estimates. If **title2()** is specified but **title()** is not, **title2()** has
the same effect as **title()**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

+--------------+
----+ Optimization +-----------------------------------------------------

**from(***initial_values***)** specifies the initial values to begin the
estimation. You can specify a parameter name, its initial value,
another parameter name, its initial value, and so on, or you can
specify a 1 x k matrix, where k is the number of parameters in the
model. For example, to initialize **alpha** to 1.23 and **delta** to 4.57,
you would type

**gmm** ...**,** **from(alpha 1.23 delta 4.57)** ...

or equivalently

**matrix** **define** **initval** **=** **(1.23, 4.57)**
**gmm** ...**,** **from(initval)** ...

Initial values declared in the **from()** option override any that are
declared within substitutable expressions. If you specify a
parameter that does not appear in your model, **gmm** exits with an error
message. If you specify a matrix, the values must be in the same
order in which the parameters are declared in your model.
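
The sketch below sets the same starting values in three ways. The auto dataset, the model, and the values are arbitrary assumptions, and the matrix name **initval** follows the usage above.

```stata
sysuse auto, clear

* Initial values set inside the substitutable expression
gmm (mpg - {b1=1}*gear_ratio - {b2=0}*turn - {b0=20}), ///
    instruments(gear_ratio length headroom)

* Name/value pairs in from()
gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}), ///
    instruments(gear_ratio length headroom)     ///
    from(b1 1 b2 0 b0 20)

* A 1 x 3 matrix, ordered as the parameters are declared: b1, b2, b0
matrix define initval = (1, 0, 20)
gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}), ///
    instruments(gear_ratio length headroom)     ///
    from(initval)
```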

**igmmiterate(***#***)**, **igmmeps(***#***)**, and **igmmweps(***#***)** control the iterative process
for the iterative GMM estimator. These options can be specified only
if you also specify **igmm**.

**igmmiterate(***#***)** specifies the maximum number of iterations to perform
with the iterative GMM estimator. The default is the number set
using **set maxiter**, which is 16,000 by default.

**igmmeps(***#***)** specifies the convergence criterion used for successive
parameter estimates when the iterative GMM estimator is used.
The default is **igmmeps(1e-6)**. Convergence is declared when the
relative difference between successive parameter estimates is
less than **igmmeps()** and the relative difference between
successive estimates of the weight matrix is less than
**igmmweps()**.

**igmmweps(***#***)** specifies the convergence criterion used for successive
estimates of the weight matrix when the iterative GMM estimator
is used. The default is **igmmweps(1e-6)**. Convergence is declared
when the relative difference between successive parameter
estimates is less than **igmmeps()** and the relative difference
between successive estimates of the weight matrix is less than
**igmmweps()**.

*optimization_options*: __tech__**nique()**, **conv_maxiter()**, **conv_ptol()**,
**conv_vtol()**, **conv_nrtol()**, **tracelevel()**. **technique()** specifies the
optimization technique to use; **gn** (the default), **nr**, **dfp**, and **bfgs**
are allowed. **conv_maxiter()** specifies the maximum number of
iterations; **conv_ptol()**, **conv_vtol()**, and **conv_nrtol()** specify the
convergence criteria for the parameters, gradient, and scaled
Hessian, respectively. **tracelevel()** allows you to obtain additional
details during the iterative process. See **[M-5] optimize()**.
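
As a small sketch of the optimization options, the command below switches from the default Gauss-Newton technique to Newton-Raphson and caps the number of iterations. The dataset, the model, and the cap of 50 are illustrative assumptions.

```stata
sysuse auto, clear

gmm (mpg - exp({xb: gear_ratio turn _cons})), ///
    instruments(gear_ratio length headroom)  ///
    technique(nr) conv_maxiter(50)
```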

The following options pertain only to the moment-evaluator program
version of **gmm**:

+-------+
----+ Model +------------------------------------------------------------

*evaluator_options* refer to any options allowed by your *moment_prog*.

**hasderivatives** and **haslfderivatives** indicate that you have written your
moment-evaluator program to compute derivatives. You may specify one
or the other but not both. If you do not specify either of these
options, **gmm** computes the derivatives numerically.

**hasderivatives** indicates that your moment-evaluator program computes
parameter-level derivatives.

**haslfderivatives** indicates that your moment-evaluator program
computes equation-level derivatives and is useful only when you
specify the parameters of your model using the **{***lcname***:***varlist***}**
syntax of the **parameters()** option.

See *Details of moment-evaluator programs* in **[R] gmm** for more
information.

**equations(***namelist***)** specifies the names of the residual equations in the
model. If you specify both **equations()** and **nequations()**, the number
of names in the former must match the number specified in the latter.

**nequations(***#***)** specifies the number of residual equations in the model.
If you do not specify names with the **equations()** option, **gmm** numbers
the residual equations 1, 2, 3, .... If you specify both **equations()**
and **nequations()**, the number of names in the former must match the
number specified in the latter.

**parameters(***namelist***)** specifies the names of the parameters in the model.
The names of the parameters must comply with the naming conventions
of Stata's variables; see **[U] 11.3 Naming conventions**.

Alternatively, you can use parameter equation notation to specify
linear combinations of parameters. Each linear combination is of the
form **{***lcname***:***varlist***}**, where *varlist* is one or more variable names.
Specify the system variable **_cons** in *varlist* to include a constant
term. Distinguish between **{***lcname***:***varlist***}**, in which *lcname*
identifies the linear combination, and **(***reqname***:***rexp***)**, in which
*reqname* identifies the residual equation. When you use
linear-combination syntax, **gmm** prepends each element of the parameter
vector passed to your evaluator program with *lcname***:** to generate
unique names.

If you specify both **parameters()** and **nparameters()**, the number of
names in the former must match the number specified in the latter.

**nparameters(***#***)** specifies the number of parameters in the model. If you
do not specify names with the **parameters()** option, **gmm** names them **b1**,
**b2**, ..., **b***#*. If you specify both **parameters()** and **nparameters()**, the
number of names in the former must match the number specified in the
latter.

The following option is available with **gmm** but is not shown in the dialog
box:

**coeflegend**; see **[R] estimation options**.

__Remarks__

Remarks are presented under the following headings:

Interactive version
Moment-evaluator program version
Substitutable expressions

__Interactive version__

In many applications, the moment conditions can be written in the form

E{**z**_i u_i(**b**)} = **0**

where *i* indexes observations, **b** is a *p* x 1 vector of parameters, u(**b**) is
a residual term, and **z** represents a vector of one or more instrumental
variables, **z1**, **z2**, ..., **z***q*. Here you would type

**. gmm (***<expression for u_i(***b***)>***), instruments(z1 z2 **...** z***q***)**

In other applications, we cannot write the moment conditions as the
product of a residual and a list of instruments but instead have the more
general moment conditions

E{**h**_i(**b**)} = **0**

where **h**(**b**) is a *q* x 1 vector-valued function. Here you would type

**. gmm (***<expression for h_1i(***b***)>***)** **(***<expression for h_2i(***b***)>***)** ...
**(***<expression for h_qi(***b***)>***)**

where h_1i(**b**) is the first element of **h**(**b**), and so on.

In yet other applications, your moment conditions might be of the form

         { **z**_1i u_1i(**b**) }
        E{   ............   } = **0**
         { **z**_qi u_qi(**b**) }

where **z**_1i is a vector of instrumental variables **z11**, **z12**, ..., **z1**q1,
associated with the first residual term, u_1i(**b**), and so on. Here you
would type

**. gmm (***<expression for u_1i(***b***)>***)**
**(***<expression for u_2i(***b***)>***)** ...
**(***<expression for u_qi(***b***)>***),**
**instruments(1: z11 z12** ... **z1***q1***)**
**instruments(2: z21 z22** ... **z2***q2***)** ...
**instruments(***q***: z***q***1 z***q***2** ... **z***qqq***)**

Of course, you can also combine moment conditions of the forms E{**h**_i(**b**)}
= **0** and E{**z**_ki u_ki(**b**)} = **0**.
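
As a concrete sketch of the general form E{**h**_i(**b**)} = **0**, the command below estimates a mean and a variance from two moment conditions with no instrument list. The auto dataset and the parameter names **mu** and **sigma2** are assumptions for illustration.

```stata
sysuse auto, clear

* h_1i = mpg - mu              identifies the mean
* h_2i = (mpg - mu)^2 - sigma2 identifies the variance
gmm (mpg - {mu}) ((mpg - {mu})^2 - {sigma2}), ///
    winitial(identity) onestep
```

The system is just-identified (two moments, two parameters), so the choice of weight matrix should not affect the point estimates.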

__Moment-evaluator program version__

Instead of defining the moment equations in the dialog box or on the
command line, you can write a program that evaluates them similarly to
how **ml** and the function-evaluator program version of **nl** work. We
illustrate the mechanics of a moment-evaluator program through a simple
example. Suppose we wish to fit the model

y_i = **x**_i'**b** + u_i

where we suspect that some elements of **x** are endogenous. We have as
instruments the vector **z**, consisting of the elements of **x** that are
exogenous and additional variables not correlated with u_i. In a GMM
framework, we can write our moment conditions as

E{**z**_i u_i(**b**)} = E{**z**_i(y_i - **x**_i'**b**)} = 0

Our first attempt at a moment-evaluator program is

**program gmm_ivreg**

**version 15.1**

**syntax varlist [if] , at(name) rhs(varlist) depvar(varlist)**

**tempvar m**
**quietly gen double `m' = 0 `if'**
**local i 1**
**foreach var of varlist `rhs' {**
**quietly replace `m' = `m' + `var'*`at'[1,`i'] `if'**
**local ++i**
**}**
**quietly replace `m' = `m' + `at'[1,`i'] `if' // constant**

**quietly replace `varlist' = `depvar' - `m' `if'**

**end**

Say that our dependent variable, y_i, is **mpg**; **x** consists of **gear_ratio**,
**turn**, and a constant; and **z** consists of **gear_ratio**, **length**, **headroom**, and
a constant. Then, to fit our model, we would type

**. gmm gmm_ivreg, nequations(1) nparameters(3)**
**instruments(gear_ratio length headroom)** **depvar(mpg)**
**rhs(gear_ratio turn)**

First, notice that **depvar()** and **rhs()** are not options that the **gmm**
command recognizes. Therefore, **gmm** will pass those options to our
moment-evaluator program.

Our moment-evaluator program accepts a *varlist*. **gmm** will pass to our
program *q* variables in this *varlist*, where *q* is the number of moment
equations specified in the **nequations()** or **equations()** option. Because,
in our command, we specified **nequations(1)**, the *varlist* will contain one
variable, which we are to fill in with our single moment equation u_i(**b**)
= y_i - **x**_i'**b**.

The parameter vector at which we are to evaluate our moments is passed in
the required **at()** option; all moment-evaluator programs must accept this
option. In our calling command, we specified **nparameters(3)**, so the **`at'**
vector passed to our program will be 1 x 3.

We wrote our moment-evaluator program to also accept the **depvar()** and
**rhs()** options. That way, we can fit other regression models with
endogenous regressors simply by changing the variables we specify in
those options and the **instruments()** option. Unlike commands such as
**ivregress** designed specifically for linear regression with endogenous
regressors, with **gmm** we must specify the complete instrument list,
including exogenous regressors, in the **instruments()** option.

Our program also accepts an **if** condition because that is how **gmm**
communicates the estimation sample. For all the commands that operate on
variables, we include the expression **`if'** to restrict their operations to
the estimation sample.

The method we just explained can be used to fit an arbitrary GMM model.
When some of the moments are linear in the parameters, we can instead
specify full equation names in the **parameters()** option and use **matrix**
**score** to compute linear combinations of variables rather than having to
loop through each variable as in our previous program. Thus we can write
the moment-evaluator program for our example as follows:

**program gmm_ivreg_2**

**version 15.1**
**syntax varlist [if] , at(name) depvar(varlist)**
**tempvar xb**
**matrix score double `xb' = `at' `if', eq(#1)**
**quietly replace `varlist' = `depvar' - `xb' `if'**
**end**

Now, to fit our model, we type

**. gmm gmm_ivreg_2, nequations(1) depvar(mpg)**
**parameters({mpg:gear_ratio turn _cons})**
**instruments(gear_ratio length headroom)**

Because we specify full equation and variable names for each parameter,
the columns of the **`at'** vector passed to our program will be labeled so
that **matrix score** can compute the linear combination, and we no longer
need to include an option to pass the variable names into our program.
For simplicity, we included an option to specify the dependent variable,
but we could have used Stata's extended macro functions to obtain it from
the **`at'** vector as well.
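
As a sketch of that alternative, the hypothetical program **gmm_ivreg_3** below drops the **depvar()** option and reads the dependent variable from the equation names on the columns of **`at'**. It assumes, as in the call above, that every parameter is labeled with the dependent variable as its equation name.

```stata
program gmm_ivreg_3
    version 15.1
    syntax varlist [if] , at(name)
    // Column equation names of `at', for example "mpg mpg mpg"
    local eqnames : coleq `at'
    // The first equation name is taken to be the dependent variable
    local depvar : word 1 of `eqnames'
    tempvar xb
    matrix score double `xb' = `at' `if', eq(#1)
    quietly replace `varlist' = `depvar' - `xb' `if'
end
```

It would be called like **gmm_ivreg_2**, but without **depvar(mpg)**.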

__Substitutable expressions__

You use substitutable expressions with the interactive version of **gmm** to
define your residual equations. Substitutable expressions are just like
any other mathematical expression in Stata, except that the parameters of
your model are bound in braces.

You specify a substitutable expression for each equation in your system,
and you must follow three rules:

1. Parameters of the model are bound in curly braces: **{b0}**,
**{param}**, etc. Parameter names must follow the same conventions
as variable names; see **[U] 11.3 Naming conventions**.

2. Initial values for parameters are given by including an equal
sign and the initial value inside the curly braces: **{b0=1}**,
**{param=3.571}**, etc.

You can also specify initial values by using the **from()** option.
Initial values specified in **from()** override whatever initial
values are given within the substitutable expression. If you do
not specify an initial value for a parameter, it is initialized
to 0.

3. Linear combinations of variables can be included using the
notation **{***lc***:***varlist***}**: **{xb: mpg price weight _cons}**, **{score: w x**
**z}**, etc. Parameters of linear combinations are initialized to 0.

Substitutable expressions can include any mathematical expression
involving scalars and variables. See operator and exp for more
information on expressions.
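
The three rules can be illustrated with a short sketch on the auto dataset. The model and starting values here are arbitrary choices for illustration only; the `///` continuation assumes a do-file.

```stata
sysuse auto, clear

* Rules 1 and 2: parameters in braces; {b0=20} sets a starting value for b0,
* while b1 has no starting value and so begins at 0
gmm (mpg - {b0=20} - {b1}*weight), instruments(weight)

* from() takes precedence over the starting value given inside the braces
gmm (mpg - {b0=20} - {b1}*weight), instruments(weight) ///
    from(b0 40 b1 0)

* Rule 3: a linear combination named xb; its coefficients start at 0
gmm (mpg - {xb:weight length} - {b0}), instruments(weight length)
```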

__Examples__

Simple linear regression
**. sysuse auto**
**. regress mpg gear_ratio turn**
**. gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}),**
**instruments(gear_ratio turn)**

Same as above, with analytic derivatives
**. gmm (mpg - {b1}*gear_ratio - {b2}*turn - {b0}),**
**instruments(gear_ratio turn)** **derivative(/b1 = -1*gear_ratio)**
**derivative(/b2 = -1*turn)** **derivative(/b0 = -1)**

Simple linear regression, using a linear combination
**. gmm (mpg - {xb:gear_ratio turn} - {b0}), instruments(gear_ratio**
**turn)**

Same as above, with analytic derivatives
**. gmm (mpg - {xb:gear_ratio turn} - {b0}), instruments(gear_ratio**
**turn)** **derivative(/xb = -1) derivative(/b0 = -1)**

Two-stage least squares (same as **ivregress 2sls**)
**. ivregress 2sls mpg gear_ratio (turn = weight length headroom)**
**. gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}),**
**instruments(gear_ratio weight length headroom) onestep**

Two-step GMM estimation (same as **ivregress gmm**)
**. ivregress gmm mpg gear_ratio (turn = weight length headroom)**
**. gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}),**
**instruments(gear_ratio weight length headroom)** **wmatrix(robust)**

Estimation of the parameters of the gamma distribution (Greene 2018, 493)
**. webuse greenegamma**
**. gmm (y - {P}/{lambda})**
**(y^2 - {P}*({P}+1)/{lambda}^2)**
**(ln(y) - digamma({P}) + ln({lambda}))**
**(1/y - {lambda}/({P}-1)),**
**from(P 2.41 lambda 0.08) winitial(identity)**

Same as above, with analytic derivatives
**. gmm (y - {P}/{lambda})**
**(y^2 - {P}*({P}+1)/{lambda}^2)**
**(ln(y) - digamma({P}) + ln({lambda}))**
**(1/y - {lambda}/({P}-1)),**
**from(P 2.41 lambda 0.08)**
**winitial(identity)**
**deriv(1/P = -1/{lambda})**
**deriv(2/P = -(2*{P}+1)/{lambda}^2)**
**deriv(3/P = -1*trigamma({P}))**
**deriv(4/P = {lambda}/({P}-1)^2)**
**deriv(1/lambda = {P}/{lambda}^2)**
**deriv(2/lambda = 2*{P}*({P}+1)/{lambda}^3)**
**deriv(3/lambda = 1/{lambda})**
**deriv(4/lambda = -1/({P}-1))**
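
The gamma example uses four moment conditions to estimate two parameters, so the model is overidentified with two degrees of freedom. As a minimal sketch, assuming one of the two-step **gmm** commands above has just been run, the overidentifying restrictions can be checked as follows:

```stata
* Run immediately after either gamma-distribution gmm command above
estat overid

* The same statistic and its degrees of freedom are also stored in e()
display "J = " e(J) ", df = " e(J_df) ", p = " chi2tail(e(J_df), e(J))
```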

Estimation of a consumption CAPM model with one financial asset, using
first and second lags of consumption growth and two lags of returns as
instruments (Hamilton 1994, sec. 14.2)
**. webuse cr**
**. generate clc = c / L.c**
**. generate lcllc = L.c / L2.c**
**. gmm (1 - {b=1}*(1+F.r)*(F.c/c)^(-1*{g})),** **inst(clc lcllc r L.r**
**L2.r)**

Exponential (Poisson) regression with endogenous regressor **income**
**. webuse docvisits, clear**
**. gmm (docvis - exp({xb:private chronic female income} + {b0})),**
**instruments(private chronic female age black hispanic) onestep**

Same as above, specifying analytic derivatives and using the two-step
estimator
**. gmm (docvis - exp({xb:private chronic female income} + {b0})),**
**instruments(private chronic female age black hispanic)** **deriv(/xb**
**= -1*exp({xb:} + {b0}))** **deriv(/b0 = -1*exp({xb:} + {b0})) twostep**

Using **gmm** to fit a maximum likelihood model (probit)
**. webuse probitgmm**
**. global Phi "normal({b0}+{b1}*x)"**
**. global phi "normalden({b0}+{b1}*x)"**
**. gmm (y*$phi/$Phi - (1-y)*$phi/(1-$Phi))** **( (y*$phi/$Phi -**
**(1-y)*$phi/(1-$Phi))*x),** **winitial(identity) onestep**
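
The two residual equations above are the probit score equations for the constant and for **x**, so the model is exactly identified. As a cross-check, assuming the probitgmm dataset is still in memory, the point estimates should agree with those from **probit**. Because **gmm** reports robust standard errors by default, **vce(robust)** is added here so that the standard errors are comparable; they should be close but need not be identical.

```stata
* Assumes the data loaded by -webuse probitgmm- are still in memory
probit y x, vce(robust)
```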

Using **gmm** to fit a nonlinear least-squares model (probit)
**. global Phi "normal({b0}+{b1}*x)"**
**. global phi "normalden({b0}+{b1}*x)"**
**. gmm ( (y - $Phi)*(-x*$phi) )** **( (y - $Phi)*(-1*$phi) ),**
**winitial(identity) onestep**
**. nl (y = $Phi)**

Using **gmm** to fit a dynamic panel-data model
**. webuse abdata**
**. xtdpdsys n L(0/1).w, lags(1) twostep**
**. gmm (n - {rho}*L.n - {w}*w - {lagw}*L.w - {c})**
**(D.n - {rho}*LD.n - {w}*D.w - {lagw}*LD.w),**
**xtinstruments(1:D.n, lags(1/1))**
**xtinstruments(2:n, lags(2/.))**
**instruments(2:D.w LD.w, noconstant)**
**deriv(1/rho = -1*L.n)**
**deriv(1/w = -1*w)**
**deriv(1/lagw = -1*L.w)**
**deriv(1/c = -1)**
**deriv(2/rho = -1*LD.n)**
**deriv(2/w = -1*D.w)**
**deriv(2/lagw = -1*LD.w)**
**winitial(xt LD) wmatrix(robust) vce(unadjusted)**
**variables(L.n w L.w)**
**twostep nocommonesample**

Using **gmm** to fit a dynamic panel-data model with predetermined
coterminous regressor **k**
**. xtdpdsys n L(0/1).w, pre(k) lags(1) twostep**
**. gmm (n - {rho}*L.n - {k}*k - {w}*w - {lagw}*L.w - {c})**
**(D.n - {rho}*LD.n - {k}*D.k - {w}*D.w - {lagw}*LD.w),**
**xtinstruments(1:D.n, lags(1/1))**
**xtinstruments(1:D.k, lags(0/0))**
**xtinstruments(2:n, lags(2/.))**
**xtinstruments(2:k, lags(1/.))**
**instruments(2:D.w LD.w, noconstant)**
**deriv(1/rho = -1*L.n)**
**deriv(1/k = -1*k)**
**deriv(1/w = -1*w)**
**deriv(1/lagw = -1*L.w)**
**deriv(1/c = -1)**
**deriv(2/rho = -1*LD.n)**
**deriv(2/k = -1*D.k)**
**deriv(2/w = -1*D.w)**
**deriv(2/lagw = -1*LD.w)**
**winitial(xt LD) wmatrix(robust) vce(unadjusted)**
**variables(L.n k w L.w)**
**twostep nocommonesample**

__Stored results__

**gmm** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(k)** number of parameters
**e(k_eq)** number of equations in **e(b)**
**e(k_eq_model)** number of equations in overall model test
**e(k_aux)** number of auxiliary parameters
**e(n_moments)** number of moments
**e(n_eq)** number of equations in moment-evaluator program
**e(Q)** criterion function
**e(J)** Hansen *J* chi-squared statistic
**e(J_df)** *J* statistic degrees of freedom
**e(k_***i***)** number of parameters in equation *i*
**e(has_xtinst)** **1** if panel-style instruments specified, **0** otherwise
**e(N_clust)** number of clusters
**e(type)** **1** if interactive version, **2** if moment-evaluator
program version
**e(rank)** rank of **e(V)**
**e(ic)** number of iterations used by iterative GMM
estimator
**e(converged)** **1** if converged, **0** otherwise

Macros
**e(cmd)** **gmm**
**e(cmdline)** command as typed
**e(title)** title specified in **title()**
**e(title_2)** title specified in **title2()**
**e(clustvar)** name of cluster variable
**e(inst_***i***)** equation *i* instruments
**e(eqnames)** equation names
**e(winit)** initial weight matrix used
**e(winitname)** name of user-supplied initial weight matrix
**e(estimator)** **onestep**, **twostep**, or **igmm**
**e(rhs)** variables specified in **variables()**
**e(params_***i***)** equation *i* parameters
**e(wmatrix)** *wmtype* specified in **wmatrix()**
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(params)** parameter names
**e(sexp_***i***)** substitutable expression for equation *i*
**e(evalprog)** moment-evaluator program
**e(evalopts)** options passed to moment-evaluator program
**e(nocommonesample)** **nocommonesample**, if specified
**e(technique)** optimization technique
**e(properties)** **b V**
**e(estat_cmd)** program used to implement **estat**
**e(predict)** program used to implement **predict**
**e(marginsok)** predictions allowed by **margins**
**e(marginsnotok)** predictions disallowed by **margins**
**e(marginsprop)** signals to the **margins** command
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(init)** initial values of the estimators
**e(Wuser)** user-supplied initial weight matrix
**e(W)** weight matrix used for final round of estimation
**e(S)** moment covariance matrix used in robust VCE
computations
**e(G)** averages of derivatives of moment conditions
**e(N_byequation)** number of observations per equation, if
**nocommonesample** specified
**e(V)** variance-covariance matrix
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
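
As a brief sketch of how these results can be used, the commands below refit the two-step example from above and then read a few of the stored results. The `///` continuation assumes a do-file.

```stata
sysuse auto, clear
gmm (mpg - {b1}*turn - {b2}*gear_ratio - {b0}), ///
    instruments(gear_ratio weight length headroom) wmatrix(robust)

* Hansen J statistic, its degrees of freedom, and the implied p-value
display "J = " e(J) ", df = " e(J_df) ", p = " chi2tail(e(J_df), e(J))

* Estimator used and the instruments for equation 1
display "`e(estimator)'"
display "`e(inst_1)'"

* Coefficient vector and final weight matrix
matrix list e(b)
matrix list e(W)

* Full list of stored results
ereturn list
```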

__References__

Greene, W. H. 2018. *Econometric Analysis*. 8th ed. New York: Pearson.

Hall, A. R. 2005. *Generalized Method of Moments*. Oxford: Oxford
University Press.

Hamilton, J. D. 1994. *Time Series Analysis*. Princeton: Princeton
University Press.

Newey, W. K., and K. D. West. 1994. Automatic lag selection in covariance
matrix estimation. *Review of Economic Studies* 61: 631-653.