Stata 15 help for reg3

[R] reg3 -- Three-stage estimation for systems of simultaneous equations


Basic syntax

reg3 (depvar1 varlist1) (depvar2 varlist2) ...(depvarN varlistN) [if] [in] [weight]

Full syntax

reg3 ([eqname1:]depvar1a [depvar1b ...=]varlist1 [, noconstant]) ([eqname2:]depvar2a [depvar2b ...=]varlist2 [, noconstant]) ... ([eqnameN:]depvarNa [depvarNb ...=]varlistN [, noconstant]) [if] [in] [weight] [, options]

    options                     Description
    -------------------------------------------------------------------------
    Model
      ireg3                     iterate until estimates converge
      constraints(constraints)  apply specified linear constraints

    Model 2
      exog(varlist)             exogenous variables not specified in system
                                  equations
      endog(varlist)            additional right-hand-side endogenous
                                  variables
      inst(varlist)             full list of exogenous variables
      allexog                   all right-hand-side variables are exogenous
      noconstant                suppress constant from instrument list

    Est. method
      3sls                      three-stage least squares; the default
      2sls                      two-stage least squares
      ols                       ordinary least squares (OLS)
      sure                      seemingly unrelated regression estimation
                                  (SURE)
      mvreg                     sure with OLS degrees-of-freedom adjustment
      corr(correlation)         unstructured or independent correlation
                                  structure; default is unstructured

    df adj.
      small                     report small-sample statistics
      dfk                       use small-sample adjustment
      dfk2                      use alternate adjustment

    Reporting
      level(#)                  set confidence level; default is level(95)
      first                     report first-stage regression
      nocnsreport               do not display constraints
      display_options           control columns and column formats, row
                                  spacing, line width, display of omitted
                                  variables and base and empty cells, and
                                  factor-variable labeling

    Optimization
      optimization_options      control the optimization process; seldom
                                  used

      noheader                  suppress display of header
      notable                   suppress display of coefficient table
      nofooter                  suppress display of footer
      coeflegend                display legend instead of statistics
    -------------------------------------------------------------------------
    varlist1, ..., varlistN and the exog() and the inst() varlist may contain
      factor variables; see fvvarlist. You must have the same levels of
      factor variables in all equations that have factor variables.
    depvar and varlist may contain time-series operators; see tsvarlist.
    bootstrap, by, fp, jackknife, rolling, and statsby are allowed; see
      prefix.
    Weights are not allowed with the bootstrap prefix.
    aweights are not allowed with the jackknife prefix.
    aweights and fweights are allowed; see weight.
    noheader, notable, nofooter, and coeflegend do not appear in the dialog
      box.
    See [R] reg3 postestimation for features available after estimation.

Explicit equation naming (eqname:) cannot be combined with multiple dependent variables in an equation specification.


Menu

Statistics > Endogenous covariates > Three-stage least squares


Description

reg3 estimates a system of structural equations, where some equations contain endogenous variables among the explanatory variables. Estimation is via three-stage least squares (3SLS); see Zellner and Theil (1962). Typically, the endogenous explanatory variables are dependent variables from other equations in the system. reg3 supports iterated GLS estimation and linear constraints.

reg3 can also estimate systems of equations by seemingly unrelated regression estimation (SURE), multivariate regression (MVREG), and equation-by-equation ordinary least squares (OLS) or two-stage least squares (2SLS).


Under 3SLS or 2SLS estimation, a structural equation is defined as one of the equations specified in the system. A dependent variable will have its usual interpretation as the left-hand-side variable in an equation with an associated disturbance term. All dependent variables are explicitly taken to be endogenous to the system and are treated as correlated with the disturbances in the system's equations. Unless specified in an endog() option, all other variables in the system are treated as exogenous to the system and uncorrelated with the disturbances. The exogenous variables are taken to be instruments for the endogenous variables.


Options

    +-------+
----+ Model +------------------------------------------------------------

ireg3 causes reg3 to iterate over the estimated disturbance covariance matrix and parameter estimates until the parameter estimates converge. Although the iteration is usually successful, there is no guarantee that it will converge to a stable point. Under SURE, this iteration converges to the maximum likelihood estimates.
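As a minimal sketch of iterated estimation, the following reuses the Klein dataset and the two-equation system from the examples in this help file. The choice of equations is illustrative only; convergence is not guaranteed for other systems.

```stata
* Sketch: iterated 3SLS on the Klein data (equations are illustrative)
webuse klein, clear
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), ireg3

* e(ic) stores the number of iterations that were performed
display e(ic)
```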

constraints(constraints); see [R] estimation options.

    +---------+
----+ Model 2 +----------------------------------------------------------

exog(varlist) specifies additional exogenous variables that are included in none of the system equations. This can occur when the system contains identities that are not estimated. If implicitly exogenous variables from the equations are listed here, reg3 will just ignore the additional information. Specified variables will be added to the exogenous variables in the system and used in the first stage as instruments for the endogenous variables. By specifying dependent variables from the structural equations, you can use exog() to override their endogeneity.

endog(varlist) identifies variables in the system that are not dependent variables but are endogenous to the system. These variables must appear in the variable list of at least one equation in the system. Again the need for this identification often occurs when the system contains identities. For example, a variable that is the sum of an exogenous variable and a dependent variable may appear as an explanatory variable in some equations.

inst(varlist) specifies a full list of all exogenous variables and may not be used with the endog() or exog() options. It must contain a full list of variables to be used as instruments for the endogenous regressors. Like exog(), the list may contain variables not specified in the system of equations. This option can be used to achieve the same results as the endog() and exog() options, and the choice is a matter of convenience. Any variable not specified in the varlist of the inst() option is assumed to be endogenous to the system. As with exog(), including the dependent variables from the structural equations will override their endogeneity.
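The equivalence between inst() and endog()/exog() can be sketched with the supDem system used in the examples of this help file. The instrument list passed to inst() below is an assumption, built by listing every right-hand-side variable other than price; under that assumption the two commands should produce the same estimates.

```stata
webuse supDem, clear

* price is declared endogenous; the other regressors are implicitly exogenous
reg3 (qDemand: quantity price pcompete income) (qSupply: quantity price praw), endog(price)

* Same system with the full instrument list spelled out; variables left out
* of inst() (here quantity and price) are taken to be endogenous
reg3 (qDemand: quantity price pcompete income) (qSupply: quantity price praw), inst(pcompete income praw)
```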

allexog indicates that all right-hand-side variables are to be treated as exogenous -- even if they appear as the dependent variable of another equation in the system. This option can be used to enforce a SURE or MVREG estimation even when some dependent variables appear as regressors.

noconstant; see [R] estimation options.

    +-------------+
----+ Est. method +------------------------------------------------------

3sls specifies the full 3SLS estimation of the system and is the default for reg3.

2sls causes reg3 to perform equation-by-equation 2SLS on the full system of equations. This option implies dfk, small, and corr(independent).

Cross-equation testing should not be performed after estimation with this option. With 2sls, no covariance is estimated between the parameters of the equations. For cross-equation testing, use 3sls.
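The contrast between the two estimators can be sketched as follows, using the supDem system from the examples in this help file. The hypothesis tested is chosen purely for illustration and has no particular economic meaning.

```stata
webuse supDem, clear

* Default 3SLS: covariances between equations are estimated,
* so a cross-equation test is meaningful
reg3 (qDemand: quantity price pcompete income) (qSupply: quantity price praw), endog(price)
test [qDemand]price = [qSupply]price

* Equation-by-equation 2SLS: useful for within-equation inference only;
* do not run cross-equation tests after this fit
reg3 (qDemand: quantity price pcompete income) (qSupply: quantity price praw), endog(price) 2sls
```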

ols causes reg3 to perform equation-by-equation OLS on the system -- even if dependent variables appear as regressors or the regressors differ for each equation; see [MV] mvreg. ols implies allexog, dfk, small, and corr(independent); nodfk and nosmall may be specified to override dfk and small.

The covariance of the coefficients between equations is not estimated under this option, and cross-equation tests should not be performed after estimation with ols. For cross-equation testing, use sure or 3sls (the default).

sure causes reg3 to perform SURE estimation of the system -- even if dependent variables from some equations appear as regressors in other equations; see [R] sureg. sure is a synonym for allexog.

mvreg is identical to sure, except that the disturbance covariance matrix is estimated with an OLS degrees-of-freedom adjustment -- the dfk option. If the regressors are identical for all equations, the parameter point estimates will be the standard MVREG results. If any of the regressors differ, the point estimates are those for SURE with an OLS degrees-of-freedom adjustment in computing the covariance matrix. nodfk and nosmall may be specified to override dfk and small.
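A brief sketch of the sure and mvreg methods, assuming the Klein system from the examples in this help file. In that system each dependent variable appears as a regressor in the other equation, so both options override the default endogeneity.

```stata
webuse klein, clear

* SURE: consump and wagepriv on the right-hand side are treated as exogenous
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), sure

* Same system, with the OLS degrees-of-freedom adjustment applied
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), mvreg
```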

corr(correlation) specifies the assumed form of the correlation structure of the equation disturbances and is rarely requested explicitly. For the family of models fit by reg3, the only two allowable correlation structures are unstructured and independent. The default is unstructured.

This option is used almost exclusively to estimate a system of equations by 2SLS or to perform OLS regression with reg3 on multiple equations. In these cases, the correlation is set to independent, forcing reg3 to treat the covariance matrix of equation disturbances as diagonal in estimating model parameters. Thus a set of two-stage coefficient estimates can be obtained if the system contains endogenous right-hand-side variables, or OLS regression can be imposed, even if the regressors differ across equations. Without imposing independent disturbances, reg3 would estimate the former by 3SLS and the latter by SURE.

Any tests performed after estimation with the independent option will treat coefficients in different equations as having no covariance; cross-equation tests should not be used after specifying corr(independent).

    +---------+
----+ df adj. +----------------------------------------------------------

small specifies that small-sample statistics be computed. It shifts the test statistics from chi-squared and z statistics to F statistics and t statistics. This option is intended primarily to support MVREG. Although the standard errors from each equation are computed using the degrees of freedom for the equation, the degrees of freedom for the t statistics are all taken to be those for the first equation. This approach poses no problem under MVREG because the regressors are the same across equations.
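A short sketch of requesting small-sample statistics, again assuming the Klein system from the examples in this help file:

```stata
webuse klein, clear

* Report F and t statistics instead of chi-squared and z statistics
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), small

* Residual degrees of freedom used for the t statistics
display e(df_r)
```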

dfk specifies the use of an alternative divisor in computing the covariance matrix for the equation residuals. As an asymptotically justified estimator, reg3 by default uses the number of sample observations n as a divisor. When the dfk option is set, a small-sample adjustment is made, and the divisor is taken to be sqrt((n - k_i) * (n - k_j)), where k_i and k_j are the numbers of parameters in equations i and j, respectively.

dfk2 specifies the use of an alternative divisor in computing the covariance matrix for the equation errors. When the dfk2 option is set, the divisor is taken to be the mean of the residual degrees of freedom from the individual equations.
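To make the three divisors concrete, the arithmetic can be sketched with hypothetical sizes. The values n = 100, k_i = 3, and k_j = 5 and the scalar names are assumptions chosen for illustration, and the system is taken to have only these two equations.

```stata
* Hypothetical sizes (not taken from any dataset)
scalar n_obs = 100
scalar k_i   = 3
scalar k_j   = 5

* Default divisor: the number of observations
display "default divisor: " scalar(n_obs)

* dfk divisor: sqrt((n - k_i)*(n - k_j))
display "dfk divisor:     " sqrt((scalar(n_obs) - scalar(k_i)) * (scalar(n_obs) - scalar(k_j)))

* dfk2 divisor: mean of the residual degrees of freedom of the equations
display "dfk2 divisor:    " ((scalar(n_obs) - scalar(k_i)) + (scalar(n_obs) - scalar(k_j))) / 2
```

With these sizes the residual degrees of freedom are 97 and 95, so dfk uses their geometric mean and dfk2 their arithmetic mean, (97 + 95)/2 = 96. The two adjustments differ only slightly when the equations have similar numbers of parameters.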

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#); see [R] estimation options.

first requests that the first-stage regression results be displayed during estimation.

nocnsreport; see [R] estimation options.

display_options: noci, nopvalues, noomitted, vsquish, noemptycells, baselevels, allbaselevels, nofvlabel, fvwrap(#), fvwrapon(style), cformat(%fmt), pformat(%fmt), sformat(%fmt), and nolstretch; see [R] estimation options.
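The reporting options can be combined in one call. The following sketch assumes the supDem system from the examples in this help file; the 90% level and the choice of display option are arbitrary.

```stata
webuse supDem, clear

* Show first-stage regressions, use 90% confidence intervals,
* and suppress p-values in the coefficient table
reg3 (qDemand: quantity price pcompete income) (qSupply: quantity price praw), endog(price) first level(90) nopvalues
```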

    +--------------+
----+ Optimization +-----------------------------------------------------

optimization_options control the iterative process that minimizes the sum of squared errors when ireg3 is specified. These options are seldom used.

iterate(#) specifies the maximum number of iterations. When the number of iterations equals #, the optimizer stops and presents the current results, even if the convergence tolerance has not been reached. The default value of iterate() is the current value of set maxiter, which is iterate(16000) if maxiter has not been changed.

trace adds to the iteration log a display of the current parameter vector.

nolog suppresses the display of the iteration log.

tolerance(#) specifies the tolerance for the coefficient vector. When the relative change in the coefficient vector from one iteration to the next is less than or equal to #, the optimization process is stopped. tolerance(1e-6) is the default.
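A sketch of how these options might be combined with ireg3, assuming the Klein system from the examples in this help file. The iteration cap and tolerance shown are arbitrary illustrative values, not recommendations.

```stata
webuse klein, clear

* Cap the iterations at 50, tighten the coefficient tolerance,
* and suppress the iteration log
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1), ireg3 iterate(50) tolerance(1e-8) nolog
```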

The following options are available with reg3 but are not shown in the dialog box:

noheader suppresses display of the header reporting the estimation method and the table of equation summary statistics.

notable suppresses display of the coefficient table.

nofooter suppresses display of the footer reporting the list of endogenous and exogenous variables in the model.

coeflegend; see [R] estimation options.


Examples

---------------------------------------------------------------------------
Setup
    . webuse klein

Estimate system by three-stage least squares
    . reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)

---------------------------------------------------------------------------
Setup
    . webuse supDem

Store equations in global macros
    . global demand "(qDemand: quantity price pcompete income)"
    . global supply "(qSupply: quantity price praw)"

Estimate system, specifying price as endogenous
    . reg3 $demand $supply, endog(price)

---------------------------------------------------------------------------
Setup
    . webuse klein

Store equations and variable lists in global macros
    . global conseqn "(consump profits profits1 wagetot)"
    . global inveqn "(invest profits profits1 capital1)"
    . global wageqn "(wagepriv totinc totinc1 year)"
    . global enlist "wagetot profits totinc"
    . global exlist "taxnetx wagegovt govt"

Estimate system, specifying lists of endogenous and exogenous variables; iterate until estimates converge
    . reg3 $conseqn $inveqn $wageqn, endog($enlist) exog($exlist) ireg3

Modify consumption equation
    . global conseqn "(consump profits profits1 wagepriv wagegovt)"

Constrain coefficients of wagepriv and wagegovt in consumption equation to be equal
    . constraint 1 [consump]wagepriv = [consump]wagegovt

Estimate system under constraint
    . reg3 $conseqn $inveqn $wageqn, endog($enlist) exog($exlist) constr(1) ireg3

---------------------------------------------------------------------------


Stored results

reg3 stores the following in e():

Scalars
    e(N)               number of observations
    e(k)               number of parameters
    e(k_eq)            number of equations in e(b)
    e(mss_#)           model sum of squares for equation #
    e(df_m#)           model degrees of freedom for equation #
    e(rss_#)           residual sum of squares for equation #
    e(df_r)            residual degrees of freedom (small)
    e(r2_#)            R-squared for equation #
    e(F_#)             F statistic for equation # (small)
    e(rmse_#)          root mean squared error for equation #
    e(dfk2_adj)        divisor used with VCE when dfk2 specified
    e(ll)              log likelihood
    e(chi2_#)          chi-squared for equation #
    e(p_#)             p-value for model test for equation #
    e(cons_#)          1 when equation # has a constant, 0 otherwise
    e(rank)            rank of e(V)
    e(ic)              number of iterations

Macros
    e(cmd)             reg3
    e(cmdline)         command as typed
    e(depvar)          names of dependent variables
    e(exog)            names of exogenous variables
    e(endog)           names of endogenous variables
    e(eqnames)         names of equations
    e(corr)            correlation structure
    e(wtype)           weight type
    e(wexp)            weight expression
    e(method)          3sls, 2sls, ols, sure, or mvreg
    e(small)           small, if specified
    e(dfk)             dfk, if specified
    e(properties)      b V
    e(predict)         program used to implement predict
    e(marginsok)       predictions allowed by margins
    e(marginsnotok)    predictions disallowed by margins
    e(marginsdefault)  default predict() specification for margins
    e(asbalanced)      factor variables fvset as asbalanced
    e(asobserved)      factor variables fvset as asobserved

Matrices
    e(b)               coefficient vector
    e(Cns)             constraints matrix
    e(Sigma)           Sigma hat matrix
    e(V)               variance-covariance matrix of the estimators

Functions
    e(sample)          marks estimation sample
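As a sketch of how the stored results might be inspected after estimation, the following fits the Klein system from the examples in this help file and then reads a few of the e() entries listed above. The equation index 1 refers to the first equation in the system.

```stata
webuse klein, clear
reg3 (consump wagepriv wagegovt) (wagepriv consump govt capital1)

* List everything that reg3 left behind in e()
ereturn list

* Individual scalars, macros, and matrices
display e(N)
display e(rmse_1)
display "`e(method)'"
matrix list e(Sigma)

* Count the observations in the estimation sample
count if e(sample)
```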


Reference

Zellner, A., and H. Theil. 1962. Three-stage least squares: Simultaneous estimation of simultaneous equations. Econometrica 30: 54-78.
