Note: This FAQ is for users of Stata 6, an older version of Stata.
It is not relevant for more recent versions.
Stata 6: How can I estimate a fixed-effects regression with
instrumental variables?
Title
Stata 6: Estimating fixed-effects regression with instrumental variables
Author
Vince Wiggins, StataCorp; William Gould, StataCorp
Date
November 1999; minor revisions March 2001
Question
Is anyone aware of a routine in Stata to estimate instrumental variable
regression for the fixed-effects model? I cannot see that it is possible to
do it directly in Stata.
Answer
If we don’t have too many fixed effects, that is, if the total
number of fixed effects and other covariates is less than Stata's maximum
matrix size of 800, then we can just use indicator variables for the
fixed effects. This approach is simple, direct, and always right.
For example, using the auto dataset with rep78 as the panel variable
(with missing values dropped), we could estimate a fixed-effects model of
mpg on weight and displacement. Let's assume displacement is endogenous
and we have gear_ratio and headroom as instruments.
Solution 1
First, generate indicator variables named dr1-dr5, then use ivreg to
perform the estimation.
. tab rep78, gen(dr)
output omitted
. ivreg mpg weight dr* (displ = gear_ratio headroom)
Instrumental variables (2SLS) regression
Source | SS df MS Number of obs = 69
-------------+------------------------------ F( 6, 62) = 20.25
Model | 1536.20001 6 256.033334 Prob > F = 0.0000
Residual | 804.002893 62 12.9677886 R-squared = 0.6564
-------------+------------------------------ Adj R-squared = 0.6232
Total | 2340.2029 68 34.4147485 Root MSE = 3.6011
------------------------------------------------------------------------------
mpg | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
displacement | -.0158046 .0281621 -0.56 0.577 -.0720999 .0404907
weight | -.0038296 .0030457 -1.26 0.213 -.0099178 .0022587
dr1 | .093266 2.932757 0.03 0.975 -5.769231 5.955763
dr2 | (dropped)
dr3 | -.094414 1.444669 -0.07 0.948 -2.982265 2.793437
dr4 | -.3131534 1.596628 -0.20 0.845 -3.504767 2.878461
dr5 | 2.217366 1.895149 1.17 0.246 -1.570984 6.005716
_cons | 35.79702 4.005494 8.94 0.000 27.79015 43.80389
------------------------------------------------------------------------------
Instrumented: displacement
Instruments: weight dr1 dr2 dr3 dr4 dr5 gear_ratio headroom
------------------------------------------------------------------------------
In real examples, you will probably have to increase matsize before
estimating the model. If matsize is too small, you will see
. ivreg ...
matsize too small; type -help matsize-
r(908)
You can set matsize to any number up to 800, assuming you have sufficient
memory:
. set matsize 800
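Before choosing between the two solutions, it can help to count the panels. The lines below are a minimal sketch, assuming the data are in memory and that tabulate leaves the number of distinct values of the panel variable in r(r); check the saved results on your installation before relying on it.

```stata
* Sketch only: assumes the data are in memory and rep78 is the panel variable.
quietly tabulate rep78
* r(r) is assumed to hold the number of distinct panels after -tabulate-
display r(r)
```

If that count plus the number of other covariates is comfortably below 800, the indicator-variable approach of Solution 1 should fit within matsize.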
Solution 2
What if we have too many panels to estimate the model directly? In that case,
xtdata can be used to transform the data to mean differences, and ivreg can
then be used to estimate the fixed-effects model on the transformed data; thus
you will not have to reset matsize at all. We can estimate the same model
as above by typing
. drop if rep78==.
(5 observations deleted)
. xtdata mpg weight displ gear_ratio headroom, i(rep78) fe clear
. ivreg mpg weight (displ = gear_ratio headroom)
Instrumental variables (2SLS) regression
Source | SS df MS Number of obs = 69
-------------+------------------------------ F( 2, 66) = 42.13
Model | 986.784228 2 493.392114 Prob > F = 0.0000
Residual | 804.002893 66 12.181862 R-squared = 0.5510
-------------+------------------------------ Adj R-squared = 0.5374
Total | 1790.78712 68 26.3351047 Root MSE = 3.4903
------------------------------------------------------------------------------
mpg | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
displacement | -.0158046 .0272954 -0.58 0.565 -.0703016 .0386924
weight | -.0038296 .002952 -1.30 0.199 -.0097233 .0020642
_cons | 36.03048 3.843726 9.37 0.000 28.35623 43.70472
------------------------------------------------------------------------------
Instrumented: displacement
Instruments: weight gear_ratio headroom
------------------------------------------------------------------------------
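To make the transformation concrete, here is a hedged sketch of what the fe transform amounts to for a single variable: subtract the panel mean and add back the overall mean. It is not xtdata's actual code; the variable names pm_mpg and w_mpg are hypothetical, and the auto dataset is assumed to be in memory in its original, untransformed form.

```stata
* Sketch only: assumes the untransformed auto dataset is in memory.
* pm_mpg and w_mpg are hypothetical variable names.
drop if rep78==.
* Panel (rep78) mean of mpg
egen pm_mpg = mean(mpg), by(rep78)
* Overall mean of mpg, left in r(mean)
quietly summarize mpg
* Deviation from the panel mean, with the overall mean added back
generate w_mpg = mpg - pm_mpg + r(mean)
```

The same three steps would have to be repeated for weight, displ, gear_ratio, and headroom, which is the bookkeeping that xtdata handles in one command.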
Comparing results
+----------------------------------------------------------+
| | Solution 1 | Solution 2 |
| mpg | Coef. Std. Err. | Coef. Std. Err. |
|---------+------------------------------------------------+
| displ | -.0158046 .0281621 | -.0158046 .0272954 |
| weight | -.0038296 .0030457 | -.0038296 .002952 |
| _cons | 35.79702 4.005494 | 36.03048 3.843726 |
+----------------------------------------------------------+
Comparing the estimates, we observe
- The coefficients for displ and weight are identical.
- The standard errors differ slightly.
- The intercepts differ.
The standard errors (SEs) differ by a scale factor and that is easily fixed.
The intercept differs because of an unimportant difference in
interpretation.
The SEs differ by a scale factor because our estimate of the residual
variance (whose square root is reported as Root MSE) also differs between
the solutions. In Solution 2, the SEs are not adjusted for the fact that we
estimated the fixed effects. We can adjust them:
SE(soln. 1) = SE(soln. 2) * sqrt( e(df_r) / (e(df_r)-M+1) )
where e(df_r) is the residual degrees of freedom from Solution 2 and M is
the number of panels; M=5 in this case. For instance, the standard error for
_b[displ] can be obtained by typing
. display _se[displ] * sqrt(e(df_r)/(e(df_r)-5+1))
.02816214
You do not have to adjust the standard errors—the reported SEs are
asymptotically equivalent. Solution 1’s SEs have been adjusted for
finite samples, and many researchers prefer that the adjustment be made.
With reasonable sample sizes, however, the adjustment will not amount to
much.
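The adjustment can also be checked by hand from the printed output. The sketch below uses only numbers reported above: 66 residual degrees of freedom in Solution 2, 62 in Solution 1, and the Solution 2 standard errors and Root MSE.

```stata
* Scale factor: residual df of 66 in Solution 2, 66-5+1 = 62 in Solution 1
display sqrt(66/(66-5+1))
* Rescale the Solution 2 standard errors for displacement and weight
display .0272954 * sqrt(66/62)
display .002952 * sqrt(66/62)
* Rescale the Solution 2 Root MSE
display 3.4903 * sqrt(66/62)
```

The factor is about 1.0318, and the three rescaled values should match Solution 1's .0281621, .0030457, and 3.6011 up to rounding.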
The intercept differs because of a difference in interpretation.
In Solution 1, we did not take any care in computing the intercept and let
the indicator for rep78==2 drop out of the equation. Thus the intercept in
that equation is the intercept for the case rep78==2. In Solution 2,
xtdata mean-differenced the data and then added back in the overall
mean. Thus the reported intercept is essentially the overall intercept for
all the data; see the FAQ at
stata.com/support/faqs/statistics/intercept-in-fixed-effects-model
for a discussion.
Both solutions are equivalent.