**[R] reg3** -- Three-stage estimation for systems of simultaneous equations

__Syntax__

Basic syntax

**reg3** **(***depvar1* *varlist1***)** **(***depvar2* *varlist2***)** *...***(***depvarN* *varlistN***)** [*if*]
[*in*] [*weight*]

Full syntax

**reg3** **(**[*eqname1***:**]*depvar1a* [*depvar1b* *...***=**]*varlist1* [**,** __nocons__**tant**]**)**
**(**[*eqname2***:**]*depvar2a* [*depvar2b* *...***=**]*varlist2* [**,** __nocons__**tant**]**)**
*...*
**(**[*eqnameN***:**]*depvarNa* [*depvarNb* *...***=**]*varlistN* [**,** __nocons__**tant**]**)**
[*if*] [*in*] [*weight*] [**,** *options*]

  *options*                               Description
-------------------------------------------------------------------------
Model
  __ir__**eg3**                           iterate until estimates converge
  __c__**onstraints(***constraints***)**  apply specified linear constraints

Model 2
  __ex__**og(***varlist***)**             exogenous variables not specified in system
                                          equations
  __en__**dog(***varlist***)**            additional right-hand-side endogenous
                                          variables
  __in__**st(***varlist***)**             full list of exogenous variables
  __a__**llexog**                         all right-hand-side variables are exogenous
  __nocons__**tant**                      suppress constant from instrument list

Est. method
  **3sls**                                three-stage least squares; the default
  **2sls**                                two-stage least squares
  __o__**ls**                             ordinary least squares (OLS)
  __su__**re**                            seemingly unrelated regression estimation
                                          (SURE)
  __m__**vreg**                           **sure** with OLS degrees-of-freedom adjustment
  __cor__**r(***correlation***)**         __u__**nstructured** or __i__**ndependent** correlation
                                          structure; default is **unstructured**

df adj.
  __sm__**all**                           report small-sample statistics
  **dfk**                                 use small-sample adjustment
  **dfk2**                                use alternate adjustment

Reporting
  __l__**evel(***#***)**                  set confidence level; default is **level(95)**
  __f__**irst**                           report first-stage regression
  __nocnsr__**eport**                     do not display constraints
  *display_options*                       control columns and column formats, row
                                          spacing, line width, display of omitted
                                          variables and base and empty cells, and
                                          factor-variable labeling

Optimization
  *optimization_options*                  control the optimization process; seldom used

  __noh__**eader**                        suppress display of header
  __not__**able**                         suppress display of coefficient table
  __nofo__**oter**                        suppress display of footer
  __coefl__**egend**                      display legend instead of statistics
-------------------------------------------------------------------------
*varlist1*, ..., *varlistN* and the **exog()** and the **inst()** varlist may contain
factor variables; see fvvarlist. You must have the same levels of
factor variables in all equations that have factor variables.
*depvar* and *varlist* may contain time-series operators; see tsvarlist.
**bootstrap**, **by**, **fp**, **jackknife**, **rolling**, and **statsby** are allowed; see
prefix.
Weights are not allowed with the **bootstrap** prefix.
**aweight**s are not allowed with the **jackknife** prefix.
**aweight**s and **fweight**s are allowed; see weight.
**noheader**, **notable**, **nofooter**, and **coeflegend** do not appear in the dialog
box.
See **[R] reg3 postestimation** for features available after estimation.

Explicit equation naming (*eqname***:**) cannot be combined with multiple
dependent variables in an equation specification.

__Menu__

**Statistics > Endogenous covariates > Three-stage least squares**

__Description__

**reg3** estimates a system of structural equations, where some equations
contain endogenous variables among the explanatory variables. Estimation
is via three-stage least squares (3SLS); see Zellner and Theil (1962).
Typically, the endogenous explanatory variables are dependent variables
from other equations in the system. **reg3** supports iterated GLS
estimation and linear constraints.

**reg3** can also estimate systems of equations by seemingly unrelated
regression estimation (SURE), multivariate regression (MVREG), and
equation-by-equation ordinary least squares (OLS) or two-stage least
squares (2SLS).

__Nomenclature__

Under 3SLS or 2SLS estimation, a structural equation is defined as one of
the equations specified in the system. A dependent variable will have
its usual interpretation as the left-hand-side variable in an equation
with an associated disturbance term. All dependent variables are
explicitly taken to be endogenous to the system and are treated as
correlated with the disturbances in the system's equations. Unless
specified in an **endog()** option, all other variables in the system are
treated as exogenous to the system and uncorrelated with the
disturbances. The exogenous variables are taken to be instruments for
the endogenous variables.

__Options__

+-------+
----+ Model +------------------------------------------------------------

**ireg3** causes **reg3** to iterate over the estimated disturbance covariance
matrix and parameter estimates until the parameter estimates
converge. Although the iteration is usually successful, there is no
guarantee that it will converge to a stable point. Under SURE, this
iteration converges to the maximum likelihood estimates.
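
As a minimal sketch, assuming the **klein** example dataset used in the Examples section and the two-equation system shown there, iterated 3SLS is requested by adding **ireg3** to an otherwise unchanged specification. The lines are written for a do-file:

```stata
* Load the Klein dataset used in the Examples section
webuse klein, clear

* One-pass 3SLS (the default)
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)

* Same system, now iterating over the disturbance covariance matrix
* and the coefficients until the coefficients converge
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), ireg3
```

Comparing the two coefficient tables shows how much the iteration moves the estimates; as noted above, convergence is not guaranteed.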

**constraints(***constraints***)**; see **[R] estimation options**.

+---------+
----+ Model 2 +----------------------------------------------------------

**exog(***varlist***)** specifies additional exogenous variables that are included
in none of the system equations. This can occur when the system
contains identities that are not estimated. If implicitly exogenous
variables from the equations are listed here, **reg3** will just ignore
the additional information. Specified variables will be added to the
exogenous variables in the system and used in the first stage as
instruments for the endogenous variables. By specifying dependent
variables from the structural equations, you can use **exog()** to
override their endogeneity.

**endog(***varlist***)** identifies variables in the system that are not dependent
variables but are endogenous to the system. These variables must
appear in the variable list of at least one equation in the system.
Again, the need for this identification often occurs when the system
contains identities. For example, a variable that is the sum of an
exogenous variable and a dependent variable may appear as an
explanatory variable in some equations.

**inst(***varlist***)** specifies a full list of all exogenous variables and may
not be used with the **endog()** or **exog()** options. It must contain a
full list of variables to be used as instruments for the endogenous
regressors. Like **exog()**, the list may contain variables not
specified in the system of equations. This option can be used to
achieve the same results as the **endog()** and **exog()** options, and the
choice is a matter of convenience. Any variable not specified in the
*varlist* of the **inst()** option is assumed to be endogenous to the
system. As with **exog()**, including the dependent variables from the
structural equations will override their endogeneity.
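
The equivalence between **inst()** and the **endog()**/**exog()** pair can be sketched with the Klein system from the Examples section. The **inst()** list below is an assumption: it is one reading of which variables are exogenous in that system (the predetermined regressors plus the three variables passed to **exog()**), and the stored-estimate names are arbitrary.

```stata
webuse klein, clear

* Route 1: name the extra endogenous and exogenous variables
reg3 (consump profits profits1 wagetot)  ///
     (invest profits profits1 capital1)  ///
     (wagepriv totinc totinc1 year),     ///
     endog(wagetot profits totinc) exog(taxnetx wagegovt govt)
estimates store via_endog_exog

* Route 2: list every exogenous variable; anything omitted from
* inst() is treated as endogenous (list assumed, see lead-in)
reg3 (consump profits profits1 wagetot)  ///
     (invest profits profits1 capital1)  ///
     (wagepriv totinc totinc1 year),     ///
     inst(profits1 capital1 totinc1 year taxnetx wagegovt govt)
estimates store via_inst

* Put the two sets of coefficients and standard errors side by side
estimates table via_endog_exog via_inst, b se
```

If the instrument list matches the implied one, the two columns should agree; which form to use is a matter of convenience.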

**allexog** indicates that all right-hand-side variables are to be treated as
exogenous -- even if they appear as the dependent variable of another
equation in the system. This option can be used to enforce a SURE or
MVREG estimation even when some dependent variables appear as
regressors.

**noconstant**; see **[R] estimation options**.

+-------------+
----+ Est. method +------------------------------------------------------

**3sls** specifies the full 3SLS estimation of the system and is the default
for **reg3**.

**2sls** causes **reg3** to perform equation-by-equation 2SLS on the full system
of equations. This option implies **dfk**, **small**, and **corr(independent)**.

Cross-equation testing should not be performed after estimation with
this option. With **2sls**, no covariance is estimated between the
parameters of the equations. For cross-equation testing, use **3sls**.

**ols** causes **reg3** to perform equation-by-equation OLS on the system -- even
if dependent variables appear as regressors or the regressors differ
for each equation; see **[MV] mvreg**. **ols** implies **allexog**, **dfk**, **small**,
and **corr(independent)**; **nodfk** and **nosmall** may be specified to override
**dfk** and **small**.

The covariance of the coefficients between equations is not estimated
under this option, and cross-equation tests should not be performed
after estimation with **ols**. For cross-equation testing, use **sure** or
**3sls** (the default).

**sure** causes **reg3** to perform a SURE of the system -- even if dependent
variables from some equations appear as regressors in other
equations; see **[R] sureg**. **sure** is a synonym for **allexog**.

**mvreg** is identical to **sure**, except that the disturbance covariance matrix
is estimated with an OLS degrees-of-freedom adjustment -- the **dfk**
option. If the regressors are identical for all equations, the
parameter point estimates will be the standard MVREG results. If any
of the regressors differ, the point estimates are those for SURE with
an OLS degrees-of-freedom adjustment in computing the covariance
matrix. **nodfk** and **nosmall** may be specified to override **dfk** and
**small**.
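
The estimation-method options can be compared on a single system. The following do-file sketch assumes the **klein** dataset and the two-equation system from the Examples section; only the method option changes between calls.

```stata
webuse klein, clear

* Equation-by-equation 2SLS (implies dfk, small, and corr(independent))
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), 2sls

* Equation-by-equation OLS; all regressors are treated as exogenous
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), ols

* SURE; dependent variables used as regressors are treated as exogenous
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), sure

* SURE with the OLS degrees-of-freedom adjustment
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), mvreg
```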

**corr(***correlation***)** specifies the assumed form of the correlation structure
of the equation disturbances and is rarely requested explicitly. For
the family of models fit by **reg3**, the only two allowable correlation
structures are __u__**nstructured** and __i__**ndependent**. The default is
**unstructured**.

This option is used almost exclusively to estimate a system of
equations by 2SLS or to perform OLS regression with **reg3** on multiple
equations. In these cases, the correlation is set to **independent**,
forcing **reg3** to treat the covariance matrix of equation disturbances
as diagonal in estimating model parameters. Thus a set of two-stage
coefficient estimates can be obtained if the system contains
endogenous right-hand-side variables, or OLS regression can be
imposed, even if the regressors differ across equations. Without
imposing independent disturbances, **reg3** would estimate the former by
3SLS and the latter by SURE.

Any tests performed after estimation with the **independent** option will
treat coefficients in different equations as having no covariance;
cross-equation tests should not be used after specifying
**corr(independent)**.
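
A cross-equation Wald test is therefore meaningful only after a fit that estimates the covariance between equations, such as the default 3SLS. The sketch below assumes the **klein** dataset; the restriction tested is hypothetical and chosen only to show the **[***eqname***]***varname* syntax, where equation names default to the dependent-variable names.

```stata
webuse klein, clear

* Default 3SLS estimates the cross-equation covariance of the coefficients
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)

* Hypothetical restriction linking coefficients in two different equations
test [consump]wagegovt = [wagepriv]govt
```

Running the same **test** after **2sls**, **ols**, or **corr(independent)** would treat the two coefficients as uncorrelated, which is why the text advises against it.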

+---------+
----+ df adj. +----------------------------------------------------------

**small** specifies that small-sample statistics be computed. It shifts the
test statistics from chi-squared and z statistics to F statistics and
t statistics. This option is intended primarily to support MVREG.
Although the standard errors from each equation are computed using
the degrees of freedom for the equation, the degrees of freedom for
the t statistics are all taken to be those for the first equation.
This approach poses no problem under MVREG because the regressors are
the same across equations.

**dfk** specifies the use of an alternative divisor in computing the
covariance matrix for the equation residuals. By default, **reg3** uses
the number of sample observations n as the divisor, which is
justified asymptotically. When the **dfk** option is set, a
small-sample adjustment is made, and the divisor is taken to be
sqrt((n - k_i) * (n - k_j)), where k_i and k_j are the numbers of
parameters in equations i and j, respectively.

**dfk2** specifies the use of an alternative divisor in computing the
covariance matrix for the equation errors. When the **dfk2** option is
set, the divisor is taken to be the mean of the residual degrees of
freedom from the individual equations.
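
To make the two divisors concrete, the arithmetic can be checked with **display**. The values n = 22, k_i = 3, and k_j = 4 are illustrative assumptions, not results from any dataset.

```stata
* Hypothetical two-equation system: n = 22 observations,
* k_i = 3 and k_j = 4 parameters (illustrative values only)

* dfk divisor for the (i,j) element: sqrt((n - k_i) * (n - k_j))
display sqrt((22 - 3) * (22 - 4))

* dfk2 divisor: mean of the residual degrees of freedom
display ((22 - 3) + (22 - 4)) / 2
```

With these values the **dfk** divisor is sqrt(342), about 18.49, and the **dfk2** divisor is 18.5, compared with the default divisor n = 22.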

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)**; see **[R] estimation options**.

**first** requests that the first-stage regression results be displayed
during estimation.

**nocnsreport**; see **[R] estimation options**.

*display_options*: **noci**, __nopv__**alues**, __noomit__**ted**, **vsquish**, __noempty__**cells**,
__base__**levels**, __allbase__**levels**, __nofvlab__**el**, **fvwrap(***#***)**, **fvwrapon(***style***)**,
**cformat(%***fmt***)**, **pformat(%***fmt***)**, **sformat(%***fmt***)**, and **nolstretch**; see
**[R] estimation options**.
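
A brief sketch of the reporting options, assuming the **klein** dataset and the system from the Examples section:

```stata
webuse klein, clear

* Display the first-stage regressions and report 90% confidence intervals
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), first level(90)
```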

+--------------+
----+ Optimization +-----------------------------------------------------

*optimization_options* control the iterative process that minimizes the sum
of squared errors when **ireg3** is specified. These options are seldom
used.

__iter__**ate(***#***)** specifies the maximum number of iterations. When the
number of iterations equals *#*, the optimizer stops and presents
the current results, even if the convergence tolerance has not
been reached. The default value of **iterate()** is the current
value of **set maxiter**, which is **iterate(16000)** if **maxiter** has not
been changed.

__tr__**ace** adds to the iteration log a display of the current parameter
vector.

**nolog** suppresses the display of the iteration log.

__tol__**erance(***#***)** specifies the tolerance for the coefficient vector.
When the relative change in the coefficient vector from one
iteration to the next is less than or equal to *#*, the
optimization process is stopped. **tolerance(1e-6)** is the default.
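
These options matter only together with **ireg3**. In the sketch below the iteration cap of 50 and the tolerance of 1e-8 are arbitrary illustrative values, and the **klein** dataset is assumed.

```stata
webuse klein, clear

* Iterated 3SLS with a tighter tolerance, a cap on the number of
* iterations, and the iteration log suppressed
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), ///
     ireg3 iterate(50) tolerance(1e-8) nolog
```

If the cap is reached first, the results shown are those of the last iteration, so it is worth checking **e(ic)** against the value given in **iterate()**.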

The following options are available with **reg3** but are not shown in the
dialog box:

**noheader** suppresses display of the header reporting the estimation method
and the table of equation summary statistics.

**notable** suppresses display of the coefficient table.

**nofooter** suppresses display of the footer reporting the list of
endogenous and exogenous variables in the model.

**coeflegend**; see **[R] estimation options**.

__Examples__

---------------------------------------------------------------------------
Setup
**. webuse klein**

Estimate system by three-stage least squares
**. reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)**

---------------------------------------------------------------------------
Setup
**. webuse supDem**

Store equations in global macros
**. global demand "(qDemand: quantity price pcompete income)"**
**. global supply "(qSupply: quantity price praw)"**

Estimate system, specifying **price** as endogenous
**. reg3 $demand $supply, endog(price)**

---------------------------------------------------------------------------
Setup
**. webuse klein**

Store equations and variable lists in global macros
**. global conseqn "(consump profits profits1 wagetot)"**
**. global inveqn "(invest profits profits1 capital1)"**
**. global wageqn "(wagepriv totinc totinc1 year)"**
**. global enlist "wagetot profits totinc"**
**. global exlist "taxnetx wagegovt govt"**

Estimate system, specifying lists of endogenous and exogenous variables;
iterate until estimates converge
**. reg3 $conseqn $inveqn $wageqn, endog($enlist) exog($exlist) ireg3**

Modify consumption equation
**. global conseqn "(consump profits profits1 wagepriv wagegovt)"**

Constrain coefficients of **wagepriv** and **wagegovt** in consumption equation
to be equal
**. constraint 1 [consump]wagepriv = [consump]wagegovt**

Estimate system under constraint
**. reg3 $conseqn $inveqn $wageqn, endog($enlist) exog($exlist) constr(1) ireg3**

---------------------------------------------------------------------------

__Stored results__

**reg3** stores the following in **e()**:

Scalars
**e(N)** number of observations
**e(k)** number of parameters
**e(k_eq)** number of equations in **e(b)**
**e(mss_***#***)** model sum of squares for equation *#*
**e(df_m***#***)** model degrees of freedom for equation *#*
**e(rss_***#***)** residual sum of squares for equation *#*
**e(df_r)** residual degrees of freedom (**small**)
**e(r2_***#***)** R-squared for equation *#*
**e(F_***#***)** F statistic for equation *#* (**small**)
**e(rmse_***#***)** root mean squared error for equation *#*
**e(dfk2_adj)** divisor used with VCE when **dfk2** specified
**e(ll)** log likelihood
**e(chi2_***#***)** chi-squared for equation *#*
**e(p_***#***)** p-value for model test for equation *#*
**e(cons_***#***)** **1** when equation *#* has a constant, **0** otherwise
**e(rank)** rank of **e(V)**
**e(ic)** number of iterations

Macros
**e(cmd)** **reg3**
**e(cmdline)** command as typed
**e(depvar)** names of dependent variables
**e(exog)** names of exogenous variables
**e(endog)** names of endogenous variables
**e(eqnames)** names of equations
**e(corr)** correlation structure
**e(wtype)** weight type
**e(wexp)** weight expression
**e(method)** **3sls**, **2sls**, **ols**, **sure**, or **mvreg**
**e(small)** **small**, if specified
**e(dfk)** **dfk**, if specified
**e(properties)** **b V**
**e(predict)** program used to implement **predict**
**e(marginsok)** predictions allowed by **margins**
**e(marginsnotok)** predictions disallowed by **margins**
**e(marginsdefault)** default **predict()** specification for **margins**
**e(asbalanced)** factor variables **fvset** as **asbalanced**
**e(asobserved)** factor variables **fvset** as **asobserved**

Matrices
**e(b)** coefficient vector
**e(Cns)** constraints matrix
**e(Sigma)** Sigma hat matrix
**e(V)** variance-covariance matrix of the estimators

Functions
**e(sample)** marks estimation sample
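
The stored results can be inspected after any fit. The following sketch assumes the **klein** dataset and uses only result names listed above.

```stata
webuse klein, clear
quietly reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)

* Everything reg3 left behind in e()
ereturn list

* Individual scalars and macros
display e(N)
display "`e(method)'"
display e(rmse_1)

* Estimated disturbance covariance matrix and coefficient VCE
matrix list e(Sigma)
matrix list e(V)
```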

__Reference__

Zellner, A., and H. Theil. 1962. Three stage least squares: Simultaneous
estimate of simultaneous equations. *Econometrica* 30: 54-78.