FAQ: Instrumental variables for recursive systems with correlated disturbances


How do I estimate recursive systems using a subset of available instruments?

Title		Instrumental variables for triangular/recursive systems with correlated disturbances
Author		Gustavo Sanchez, StataCorp

Note: This model could also be fit with sem, using maximum likelihood instead of a two-step method.

You can find examples of recursive models fit with sem in the “Structural models: Dependencies between response variables” section of [SEM] intro 5 — Tour of models.
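As a rough sketch of what the sem alternative might look like for the auto-data example used later in this FAQ, the two equations can be written as paths, with the cross-equation error correlation requested through the covariance() option. This specification is an assumption made for illustration; it has not been run here, and convergence on these data is not guaranteed.

```stata
sysuse auto, clear

* Recursive system fit by maximum likelihood:
* trunk depends on headroom; price depends on trunk and displacement;
* the errors of the two equations are allowed to covary
sem (trunk <- headroom) (price <- trunk displacement), cov(e.trunk*e.price)
```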

Normally, we fit models requiring instrumental variables with ivregress, but sometimes we may want to perform the two-step computations for the instrumental variable estimator instead of using ivregress. For example, we may want to do this when a simultaneous equation system is recursive (sometimes called triangular), but there is some theoretical support for the hypothesis that the error terms are correlated across equations. The estimates from ivregress would still be consistent for such models, but we might prefer to exclude some unnecessary instruments.

Another approach that also leads to recursive systems is directed acyclic graphs (DAGs); see Pearl (2000) and Brito and Pearl (2002). In the figure below, the straight arrows correspond to direct causal links between each pair of variables, whereas the bidirected arc represents correlated errors in the data-generating process for X and Y. The equation for Y would require having Z as an instrument for X. We should not include W in the first-stage equation for X because, according to the DAG, there is no causal link from W to X.

The correct variance–covariance matrix for the second stage of the instrumental variable estimator must take into account that one of the regressors has been predicted from a previous (first-stage) regression. To obtain the adjusted standard errors, we must compute the residuals from the second-stage equation by using the parameter estimates obtained with regress but substituting the original values of the endogenous variable for the instrumented variable (its predicted values). Greene (2012, chap. 8) explains the approach and provides the formula for the estimated asymptotic covariance matrix.
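The adjustment can be summarized in matrix form. The notation below is an assumption chosen to match the steps later in this FAQ rather than a quotation from Greene: W holds the second-stage regressors with the predicted values of the endogenous variable, X holds the same regressors with its observed values, y is the dependent variable, n is the number of observations, and k is the number of second-stage coefficients.

```latex
\begin{aligned}
\hat{b} &= (W'W)^{-1} W'y \\
\hat{\sigma}^2 &= \frac{(y - X\hat{b})'(y - X\hat{b})}{n - k} \\
\widehat{\mathrm{Var}}(\hat{b}) &= \hat{\sigma}^2 \, (W'W)^{-1}
\end{aligned}
```

The second-stage regress command instead reports s^2 (W'W)^{-1}, where s^2 is built from the residuals y - W b, so the correction amounts to swapping s^2 for the residual variance computed with X.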

Warning: Instrumental variables are commonly used to fit simultaneous systems models. What follows is not appropriate for such models. For a discussion, see Must I use all of my exogenous variables as instruments when estimating instrumental variables regression?

Let’s assume we are interested in the parameter estimates of the following recursive model:

trunk = delta₀ + delta₁ * headroom + epsilon

price = beta₀ + beta₁ * trunk + beta₂ * displacement + mu

where trunk is endogenous. In Stata, you can fit the second equation of this model by using ivregress as follows:

. sysuse auto
(1978 automobile data)

. ivregress 2sls price displacement (trunk=headroom), small

Instrumental variables 2SLS regression


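      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     11.29
       Model |   108641939         2  54320969.4   Prob > F        =    0.0001
    Residual |   526423457        71  7414414.89   R-squared       =    0.1711
-------------+----------------------------------   Adj R-squared   =    0.1477
       Total |   635065396        73  8699525.97   Root MSE        =    2722.9

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       trunk |  -222.3396   175.7292    -1.27   0.210    -572.7338    128.0545
displacement |   22.19871   6.071088     3.66   0.000      10.0933    34.30411
       _cons |   4844.184   1620.835     2.99   0.004     1612.331    8076.037
------------------------------------------------------------------------------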

Endogenous: trunk
Exogenous:  displacement headroom

We used the small option to obtain small-sample statistics because our dataset has only 74 observations. The instruments reported at the bottom of the output correspond to the two exogenous variables in the system. If you need to fit the model with headroom as the only instrument, you can use regress twice and compute the standard errors accounting for the inclusion of a predicted regressor through the following five steps.
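To see that ivregress really does use both exogenous variables in its first stage, the sketch below reproduces its point estimates by hand with displacement kept in the first-stage regression. The variable name trunk_hat_full and the matrix names b_iv and b_2s are hypothetical, introduced only for this check; the two coefficient vectors should agree up to rounding.

```stata
sysuse auto, clear

* 2SLS through ivregress: the first stage uses headroom and displacement
quietly ivregress 2sls price displacement (trunk = headroom), small
matrix b_iv = e(b)

* The same point estimates by hand, keeping displacement in the first stage
quietly regress trunk headroom displacement
predict double trunk_hat_full, xb
quietly regress price trunk_hat_full displacement
matrix b_2s = e(b)

* Compare the two coefficient vectors
matrix list b_iv
matrix list b_2s
display "max relative difference: " mreldif(b_iv, b_2s)
```

Dropping displacement from the first stage, as in the steps that follow, changes the fitted values of trunk and therefore the second-stage point estimates.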

Step 1

First, fit the model for the endogenous variable as a function of headroom:

. regress trunk headroom


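      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =     56.17
       Model |  585.347842         1  585.347842   Prob > F        =    0.0000
    Residual |   750.27378        72  10.4204692   R-squared       =    0.4383
-------------+----------------------------------   Adj R-squared   =    0.4305
       Total |  1335.62162        73  18.2961866   Root MSE        =    3.2281

------------------------------------------------------------------------------
       trunk | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    headroom |   3.347171   .4465957     7.49   0.000     2.456899    4.237443
       _cons |    3.73786   1.388442     2.69   0.009      .970052    6.505667
------------------------------------------------------------------------------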

Step 2

Next, predict trunk and fit the second-stage regression, replacing trunk with its predicted values:

. predict double trunk_hat
(option xb assumed; fitted values)

. regress price trunk_hat displacement


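      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     12.71
       Model |   167440536         2  83720268.2   Prob > F        =    0.0000
    Residual |   467624860        71  6586265.63   R-squared       =    0.2637
-------------+----------------------------------   Adj R-squared   =    0.2429
       Total |   635065396        73  8699525.97   Root MSE        =    2566.4

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   trunk_hat |  -161.7683    120.504    -1.34   0.184    -402.0465    78.50998
displacement |   18.26261   3.715596     4.92   0.000     10.85392     25.6713
       _cons |     4787.5   1490.392     3.21   0.002     1815.743    7759.256
------------------------------------------------------------------------------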

The point estimates for this regression correspond to the instrumental variable estimation. However, the standard errors do not take into account that trunk was predicted in a previous regression.

Step 3

To compute the correct standard errors, obtain the estimated variance of the residuals, using trunk instead of trunk_hat to get the corresponding residuals:

. local dof=e(df_r) /* Where dof corresponds to the degrees
                        of freedom of the residuals from the
                        previous regression */

. replace trunk_hat=trunk
(74 real changes made)

. predict double e_hat,residuals

. generate double ee_hat=e_hat^2

. summarize ee_hat,meanonly

. scalar sigsq_hat_pr= r(sum)/`dof'

Step 4

Get the inverse of the cross-product matrix of the second-stage regressors, (W ' W)^-1, by dividing the VCE of the second stage by its mean squared error.

. matrix V=e(V)

. matrix xx_1=e(V)/(e(rmse)^2)

where e(V) and e(rmse) are the covariance matrix and the root mean squared error from the regression in step 2.
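The division works because regress reports its VCE as the mean squared error times the inverse of the regressors' cross-product matrix. The sketch below checks this by building the cross-product matrix directly with matrix accum. It reloads the data and refits both stages, so it is best run in a fresh session or a separate do-file; trunk_fit, WW, and WW_inv are hypothetical names used only for this check.

```stata
sysuse auto, clear

* Refit both stages; trunk_fit holds the first-stage fitted values
quietly regress trunk headroom
predict double trunk_fit, xb
quietly regress price trunk_fit displacement

* Route used in step 4: strip the mean squared error from the VCE
matrix xx_1 = e(V)/(e(rmse)^2)

* Direct route: matrix accum appends the constant as the last column,
* which matches the ordering of e(V)
matrix accum WW = trunk_fit displacement
matrix WW_inv = invsym(WW)

* The two matrices should agree up to numerical precision
display "max relative difference: " mreldif(xx_1, WW_inv)
```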

Step 5

Finally, compute the covariance matrix of the IV estimator, and post and display the results:

. matrix V=sigsq_hat_pr*xx_1

. matrix b=e(b) /* Where e(b) contains the point estimates
                   from the second stage regression */

. ereturn post b V, dof(`dof')

. ereturn display



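------------------------------------------------------------------------------
             | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
   trunk_hat |  -161.7683   125.6518    -1.29   0.202    -412.3109    88.77442
displacement |   18.26261   3.874322     4.71   0.000     10.53743    25.98779
       _cons |     4787.5    1554.06     3.08   0.003     1688.793    7886.207
------------------------------------------------------------------------------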
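For convenience, the five steps can be collected into one do-file. The sketch below is one possible arrangement, not the only one: it computes the step 3 residuals from an explicit formula instead of overwriting trunk_hat, which is a deliberate change from the interactive session above, and it extracts the step 4 matrix right after the second-stage regression. It should reproduce the coefficients and standard errors in the table above.

```stata
sysuse auto, clear

* Step 1: first stage with headroom as the only regressor
quietly regress trunk headroom
predict double trunk_hat, xb

* Step 2: second stage on the fitted values
quietly regress price trunk_hat displacement
local dof = e(df_r)
matrix b = e(b)

* Step 4: inverse cross-product matrix of the second-stage regressors
matrix xx_1 = e(V)/(e(rmse)^2)

* Step 3: residual variance using observed trunk in place of trunk_hat
generate double e_hat = price - (_b[_cons] + _b[trunk_hat]*trunk + _b[displacement]*displacement)
generate double ee_hat = e_hat^2
summarize ee_hat, meanonly
scalar sigsq_hat_pr = r(sum)/`dof'

* Step 5: corrected covariance matrix, posted and displayed
matrix V = sigsq_hat_pr*xx_1
ereturn post b V, dof(`dof')
ereturn display
```

Keeping trunk_hat intact means the fitted values remain available afterward, for example for plotting or for the cross-product check.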

For a different perspective on the same problem, see Must I use all of my exogenous variables as instruments when estimating instrumental variables regression?

References

Brito, C., and J. Pearl. 2002. Generalized instrumental variables. In Uncertainty in Artificial Intelligence, Proceedings of the Eighteenth Conference, 85–93. San Francisco: Morgan Kaufmann.

Greene, W. H. 2012. Econometric Analysis. 7th ed. Upper Saddle River, NJ: Prentice Hall.

Pearl, J. 2000. Causality. Cambridge: Cambridge University Press.
