**[R] ivregress** -- Single-equation instrumental-variables regression

__Syntax__

**ivregress** *estimator* *depvar* [*varlist1*] **(***varlist2* **=** *varlist_iv***)** [*if*] [*in*] [*weight*] [**,** *options*]

*varlist1* is the list of exogenous variables.

*varlist2* is the list of endogenous variables.

*varlist_iv* is the list of exogenous variables used with *varlist1* as
instruments for *varlist2*.

*estimator* Description
-------------------------------------------------------------------------
**2sls** two-stage least squares (2SLS)
**liml** limited-information maximum likelihood (LIML)
**gmm** generalized method of moments (GMM)
-------------------------------------------------------------------------

*options* Description
-------------------------------------------------------------------------
Model
__nocons__**tant** suppress constant term
__h__**ascons** has user-supplied constant

# GMM
__wmat__**rix(***wmtype***)** *wmtype* may be __r__**obust**, __cl__**uster** *clustvar*, **hac**
*kernel*, or __un__**adjusted**
__c__**enter** center moments in weight matrix computation
__i__**gmm** use iterative instead of two-step GMM estimator
* **eps(***#***)** specify # for parameter convergence criterion;
default is **eps(1e-6)**
* **weps(***#***)** specify # for weight matrix convergence
criterion; default is **weps(1e-6)**
* *optimization_options* control the optimization process; seldom used

SE/Robust
**vce(***vcetype***)** *vcetype* may be __un__**adjusted**, __r__**obust**, __cl__**uster**
*clustvar*, __boot__**strap**, __jack__**knife**, or **hac** *kernel*

Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
**first** report first-stage regression
**small** make degrees-of-freedom adjustments and report
small-sample statistics
__nohe__**ader** display only the coefficient table
__dep__**name(***depname***)** substitute dependent variable name
__ef__**orm(***string***)** report exponentiated coefficients and use *string*
to label them
*display_options* control columns and column formats, row spacing,
line width, display of omitted variables and
base and empty cells, and factor-variable
labeling

__per__**fect** do not check for collinearity between endogenous
regressors and excluded instruments
__coefl__**egend** display legend instead of statistics
-------------------------------------------------------------------------
# These options may be specified only when **gmm** is specified.
* These options may be specified only when **igmm** is specified.
*varlist1*, *varlist2*, and *varlist_iv* may contain factor variables; see
fvvarlist.
*depvar*, *varlist1*, *varlist2*, and *varlist_iv* may contain time-series
operators; see tsvarlist.
**bootstrap**, **by**, **fmm**, **jackknife**, **rolling**, **statsby**, and **svy** are allowed; see
prefix. For more details, see **[FMM] fmm ivregress**.
Weights are not allowed with the **bootstrap** prefix.
**aweight**s are not allowed with the **jackknife** prefix.
**hascons**, **vce()**, **noheader**, **depname()**, and weights are not allowed with the
**svy** prefix.
**aweight**s, **fweight**s, **iweight**s, and **pweight**s are allowed; see weight.
**perfect** and **coeflegend** do not appear in the dialog box.
See **[R] ivregress postestimation** for features available after estimation.

__Menu__

**Statistics > Endogenous covariates >** **Linear regression with endogenous**
**covariates**

__Description__

**ivregress** fits linear models where one or more of the regressors are
endogenously determined. **ivregress** supports estimation via two-stage
least squares (2SLS), limited-information maximum likelihood (LIML), and
generalized method of moments (GMM).

__Options__

+-------+
----+ Model +------------------------------------------------------------

**noconstant**; see **[R] estimation options**.

**hascons** indicates that a user-defined constant or its equivalent is
specified among the independent variables.

+-----+
----+ GMM +--------------------------------------------------------------

**wmatrix(***wmtype***)** specifies the type of weighting matrix to be used in
conjunction with the GMM estimator.

Specifying **wmatrix(robust)** requests a weighting matrix that is
optimal when the error term is heteroskedastic. **wmatrix(robust)** is
the default.

Specifying **wmatrix(cluster** *clustvar***)** requests a weighting matrix that
accounts for arbitrary correlation among observations within clusters
identified by *clustvar*.

Specifying **wmatrix(hac** *kernel* *#***)** requests a heteroskedasticity- and
autocorrelation-consistent (HAC) weighting matrix using the specified
kernel (see below) with *#* lags. The bandwidth of a kernel is equal
to *#* + 1.

Specifying **wmatrix(hac** *kernel* **opt** [*#*]**)** requests an HAC weighting
matrix using the specified kernel, and the lag order is selected
using Newey and West's (1994) optimal lag-selection algorithm. *#* is
an optional tuning parameter that affects the lag order selected; see
the discussion in **[R] ivregress**.

Specifying **wmatrix(hac** *kernel***)** requests an HAC weighting matrix using
the specified kernel and *N*-2 lags, where *N* is the sample size.

There are three kernels available for HAC weighting matrices, and you
may request each one by using the name used by statisticians or the
name perhaps more familiar to economists:

__ba__**rtlett** or __nw__**est** requests the Bartlett (Newey-West) kernel;

__pa__**rzen** or __ga__**llant** requests the Parzen (Gallant 1987) kernel;
and

__qu__**adraticspectral** or __an__**drews** requests the quadratic spectral
(Andrews 1991) kernel.

Specifying **wmatrix(unadjusted)** requests a weighting matrix that is
suitable when the errors are homoskedastic. The GMM estimator with
this weighting matrix is equivalent to the 2SLS estimator.
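
The equivalence noted for **wmatrix(unadjusted)** can be checked numerically. The following is a minimal sketch, assuming the hsng2 dataset used in the Examples section is reachable through **webuse**; the matrix names b_2sls and b_gmm are arbitrary choices made for this illustration. It compares the two coefficient vectors with **mreldif()**, which should return a value near zero.

```stata
* Sketch: compare 2SLS with GMM under an unadjusted weight matrix
webuse hsng2, clear

ivregress 2sls rent pcturban (hsngval = faminc i.region)
matrix b_2sls = e(b)

ivregress gmm rent pcturban (hsngval = faminc i.region), wmatrix(unadjusted)
matrix b_gmm = e(b)

* Relative difference between the two coefficient vectors
display mreldif(b_2sls, b_gmm)
```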

**center** requests that the sample moments be centered (demeaned) when
computing GMM weight matrices. By default, centering is not done.

**igmm** requests that the iterative GMM estimator be used instead of the
default two-step GMM estimator. Convergence is declared when the
relative change in the parameter vector from one iteration to the
next is less than **eps()** or the relative change in the weight matrix
is less than **weps()**.

**eps(***#***)** specifies the convergence criterion for successive parameter
estimates when the iterative GMM estimator is used. The default is
**eps(1e-6)**. Convergence is declared when the relative difference
between successive parameter estimates is less than **eps()** and the
relative difference between successive estimates of the weighting
matrix is less than **weps()**.

**weps(***#***)** specifies the convergence criterion for successive estimates of
the weighting matrix when the iterative GMM estimator is used. The
default is **weps(1e-6)**. Convergence is declared when the relative
difference between successive parameter estimates is less than **eps()**
and the relative difference between successive estimates of the
weighting matrix is less than **weps()**.

*optimization_options*: __iter__**ate()**, [__no__]__lo__**g**. **iterate()** specifies the
maximum number of iterations to perform in conjunction with the
iterative GMM estimator. The default is 16,000 or the number set
using **set maxiter**. **log**/**nolog** specifies whether to show the iteration
log. These options are seldom used.
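
A hedged sketch of the iterative estimator follows, assuming the hsng2 data from the Examples section and that the commands are run from a do-file (the `///` line continuation is not available at the interactive prompt). The tolerance and iteration values are arbitrary illustrations, not recommendations.

```stata
* Sketch: iterative GMM with tolerances tighter than the 1e-6 defaults
webuse hsng2, clear

ivregress gmm rent pcturban (hsngval = faminc i.region), ///
    igmm eps(1e-8) weps(1e-8) iterate(100)

* Number of GMM iterations actually performed
display e(iterations)
```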

+-----------+
----+ SE/Robust +--------------------------------------------------------

**vce(***vcetype***)** specifies the type of standard error reported, which
includes types that are robust to some kinds of misspecification
(**robust**), that allow for intragroup correlation (**cluster** *clustvar*),
and that use bootstrap or jackknife methods (**bootstrap**, **jackknife**);
see **[R] ***vce_option*.

**vce(unadjusted)**, the default for **2sls** and **liml**, specifies that an
unadjusted (nonrobust) VCE matrix be used. The default for **gmm** is
based on the *wmtype* specified in the **wmatrix()** option; see
wmatrix(*wmtype*) above. If **wmatrix()** is specified with **gmm** but **vce()**
is not, then *vcetype* is set equal to *wmtype*. To override this
behavior and obtain an unadjusted (nonrobust) VCE matrix, specify
**vce(unadjusted)**.
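
The interaction between **wmatrix()** and **vce()** under **gmm** can be inspected through the stored macros **e(wmatrix)** and **e(vce)**. This is a minimal sketch, assuming the hsng2 data from the Examples section; it only prints what each fit recorded and makes no claim about the numerical results.

```stata
webuse hsng2, clear

* wmatrix() given, vce() omitted: the VCE type follows the weight matrix type
ivregress gmm rent pcturban (hsngval = faminc i.region), wmatrix(robust)
display "wmatrix: `e(wmatrix)'   vce: `e(vce)'"

* Same weight matrix, but an unadjusted VCE requested explicitly
ivregress gmm rent pcturban (hsngval = faminc i.region), wmatrix(robust) vce(unadjusted)
display "wmatrix: `e(wmatrix)'   vce: `e(vce)'"
```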

**ivregress** also allows the following:

**vce(hac** *kernel* [*#* | **opt** [*#*]]**)** specifies that an HAC covariance matrix
be used. The syntax used with **vce(hac ***kernel ...***)** is identical
to that used with **wmatrix(hac ***kernel ...***)**; see wmatrix(*wmtype*)
above.
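
HAC estimation needs time-series data declared with **tsset**. The sketch below simulates a short series so that it is self-contained; every variable name (t, z1, z2, u, x, y), the seed, the sample size, and the coefficient values are invented for illustration only. It requests the Bartlett kernel with 4 lags for both the weighting matrix and the VCE.

```stata
* Sketch: HAC weight matrix and HAC VCE on simulated time-series data
clear
set seed 12345
set obs 200
generate t = _n
tsset t

generate z1 = rnormal()
generate z2 = rnormal()
generate u  = rnormal()
generate x  = 0.5*z1 + 0.5*z2 + 0.5*u + rnormal()   // endogenous: correlated with u
generate y  = 1 + 2*x + u

ivregress gmm y (x = z1 z2), wmatrix(hac bartlett 4) vce(hac bartlett 4)

* Kernel and lag order recorded by the fit
display "kernel: `e(hac_kernel)'   lags: " e(hac_lag)
```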

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**first** requests that the first-stage regression results be displayed.

**small** requests that the degrees-of-freedom adjustment *N*/(*N*-*k*) be made to
the variance-covariance matrix of parameters and that small-sample *F*
and *t* statistics be reported, where *N* is the sample size and *k* is the
number of parameters estimated. By default, no degrees-of-freedom
adjustment is made, and Wald and *z* statistics are reported. Even with
this option, no degrees-of-freedom adjustment is made to the
weighting matrix when the GMM estimator is used.
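
The *N*/(*N*-*k*) adjustment can be verified by fitting the same 2SLS model with and without **small**. This is a sketch under the assumption that *k* equals the number of columns of **e(b)** for this model; the names V_large, V_small, V_check, and adj are arbitrary. The displayed relative difference should be near zero.

```stata
webuse hsng2, clear

* Default: no degrees-of-freedom adjustment
ivregress 2sls rent pcturban (hsngval = faminc i.region)
matrix V_large = e(V)
matrix b = e(b)
local adj = e(N) / (e(N) - colsof(b))   // N/(N-k)

* Same model with the small-sample adjustment
ivregress 2sls rent pcturban (hsngval = faminc i.region), small
matrix V_small = e(V)

* Rescale the unadjusted VCE and compare
matrix V_check = `adj' * V_large
display mreldif(V_small, V_check)
```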

**noheader** suppresses the display of the summary statistics at the top of
the output, displaying only the coefficient table.

**depname(***depname***)** is used only in programs and ado-files that use
**ivregress** to fit models other than instrumental-variables regression.
**depname()** may be specified only at estimation time. *depname* is
recorded as the identity of the dependent variable, even though the
estimates are calculated using *depvar*. This method affects the
labeling of the output -- not the results calculated -- but could
affect later calculations made by **predict**, where the residual would
be calculated as deviations from *depname* rather than *depvar*.
**depname()** is most typically used when *depvar* is a temporary variable
(see **[P] macro**) used as a proxy for *depname*.

**eform(***string***)** is used only in programs and ado-files that use **ivregress**
to fit models other than instrumental-variables regression. **eform()**
specifies that the coefficient table be displayed in "exponentiated
form", as defined in **[R] maximize**, and that *string* be used to label
the exponentiated coefficients in the table.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(***%fmt***)**, **pformat(***%fmt***)**, **sformat(***%fmt***)**, and **nolstretch**; see **[R]**
**estimation options**.

The following options are available with **ivregress** but are not shown in
the dialog box:

**perfect** requests that **ivregress** not check for collinearity between the
endogenous regressors and excluded instruments, allowing one to
specify "perfect" instruments. This option cannot be used with the
LIML estimator. This option may be required when using **ivregress** to
implement other estimators.

**coeflegend**; see **[R] estimation options**.

__Examples__

Setup
**. webuse hsng2**

Fit a regression via 2SLS, requesting small-sample statistics
**. ivregress 2sls rent pcturban (hsngval = faminc i.region), small**

Fit a regression using the LIML estimator
**. ivregress liml rent pcturban (hsngval = faminc i.region)**

Fit a regression via GMM using the default heteroskedasticity-robust
weight matrix
**. ivregress gmm rent pcturban (hsngval = faminc i.region)**

Fit a regression via GMM using a heteroskedasticity-robust weight matrix,
requesting nonrobust standard errors
**. ivregress gmm rent pcturban (hsngval = faminc i.region), vce(unadjusted)**

Fit a regression via 2SLS, with an endogenous factorial interaction
**. ivregress 2sls rent pcturban (c.popgrow##c.popgrow = c.faminc##c.faminc i.region)**

__Video example__

Instrumental variables regression using Stata

__Stored results__

**ivregress** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(mss)** model sum of squares
**e(df_m)** model degrees of freedom
**e(rss)** residual sum of squares
**e(df_r)** residual degrees of freedom
**e(r2)** R-squared
**e(r2_a)** adjusted R-squared
**e(F)** F statistic
**e(rmse)** root mean squared error
**e(N_clust)** number of clusters
**e(chi2)** chi-squared
**e(kappa)** kappa used in LIML estimator
**e(J)** value of GMM objective function
**e(wlagopt)** lags used in HAC weight matrix (if Newey-West
algorithm used)
**e(vcelagopt)** lags used in HAC VCE matrix (if Newey-West
algorithm used)
**e(hac_lag)** HAC lag
**e(rank)** rank of **e(V)**
**e(iterations)** number of GMM iterations (**0** if not applicable)

Macros
**e(cmd)** **ivregress**
**e(cmdline)** command as typed
**e(depvar)** name of dependent variable
**e(instd)** instrumented variable
**e(insts)** instruments
**e(constant)** **noconstant** or **hasconstant** if specified
**e(wtype)** weight type
**e(wexp)** weight expression
**e(title)** title in estimation output
**e(clustvar)** name of cluster variable
**e(hac_kernel)** HAC kernel
**e(vce)** *vcetype* specified in **vce()**
**e(vcetype)** title used to label Std. Err.
**e(estimator)** **2sls**, **liml**, or **gmm**
**e(exogr)** exogenous regressors
**e(wmatrix)** *wmtype* specified in **wmatrix()**
**e(moments)** **centered** if **center** specified
**e(small)** **small** if small-sample statistics
**e(properties)** **b V**
**e(estat_cmd)** program used to implement **estat**
**e(predict)** program used to implement **predict**
**e(footnote)** program used to implement footnote display
**e(marginsok)** predictions allowed by **margins**
**e(marginsnotok)** predictions disallowed by **margins**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(W)** weight matrix used to compute GMM estimates
**e(S)** moment covariance matrix used to compute GMM
variance-covariance matrix
**e(V)** variance-covariance matrix of the estimators
**e(V_modelbased)** model-based variance

Functions
**e(sample)** marks estimation sample
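
As a brief illustration of how the stored results can be retrieved, the sketch below fits the LIML example from the Examples section and then reads one item of each kind. It assumes the hsng2 data are reachable through **webuse**; which results are populated depends on the estimator and options used.

```stata
webuse hsng2, clear
ivregress liml rent pcturban (hsngval = faminc i.region)

ereturn list                      // everything stored in e()
display e(kappa)                  // scalar: kappa used by LIML
display "`e(instd)'"              // macro: instrumented variable
display "`e(insts)'"              // macro: instruments
matrix list e(b)                  // matrix: coefficient vector
count if e(sample)                // function: observations in the estimation sample
```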

__References__

Andrews, D. W. K. 1991. Heteroskedasticity and autocorrelation consistent
covariance matrix estimation. *Econometrica* 59: 817-858.

Gallant, A. R. 1987. *Nonlinear Statistical Models*. New York: Wiley.

Newey, W. K., and K. D. West. 1994. Automatic lag selection in covariance
matrix estimation. *Review of Economic Studies* 61: 631-653.