Note: This FAQ is for Stata 10 and older versions of Stata. In Stata 11,
the margins command replaced mfx.
When I run mfx, I am getting the warning message “warning: predict()
expression unsuitable for standard-error calculation; option nose
imposed.” What does that mean?
Title: Obtaining marginal effects without standard errors
Author: May Boggess, StataCorp
Date: April 2004; minor revisions July 2007
Not every
predict
option for every estimation command is suitable for calculating the standard
error of the marginal effects, so
mfx checks if the
predict option specified is suitable.
A marginal effect is the partial derivative of the prediction function
f with respect to each covariate
x. The mfx command
calculates each of these derivatives numerically. This means that it uses
the following approximation for each x_i:
$$ \frac{\partial f}{\partial x_i} \approx \frac{f(x_i + h) - f(x_i)}{h} $$
for an appropriate small change in x_i,
h, holding all of the other covariates
and coefficients constant. mfx evaluates this
derivative at the mean of each of the covariates or, if you have used the
at() option, at the values specified there.
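The approximation above can be sketched in a few lines of Python. This is only an illustration, not mfx's actual code: the logistic prediction function, the coefficients, the evaluation point, and the step size h are all made up for the example.

```python
import math

def predict_prob(x, b):
    """Hypothetical prediction function f: logistic probability, intercept in b[0]."""
    xb = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return 1.0 / (1.0 + math.exp(-xb))

def marginal_effect(f, x, b, i, h=1e-6):
    """Forward-difference approximation of df/dx_i at the point x."""
    x_step = list(x)
    x_step[i] += h                      # change x_i only; other covariates stay fixed
    return (f(x_step, b) - f(x, b)) / h

b = [-1.0, 0.5, -0.25]   # made-up coefficients (intercept first)
x = [2.0, 1.0]           # made-up evaluation point, e.g. the covariate means

numeric = marginal_effect(predict_prob, x, b, i=0)
p = predict_prob(x, b)
analytic = b[1] * p * (1.0 - p)   # closed-form derivative for the logistic case
print(numeric, analytic)          # the two values should agree closely
```

Comparing against the closed form is a convenient sanity check when one is available; the numerical approach matters because it works for any prediction function.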
The standard error of the marginal effect is computed by the delta method:
$$ \operatorname{Var}(M_i) = \left(\frac{\partial M_i}{\partial b}\right)' \operatorname{Var}(b) \left(\frac{\partial M_i}{\partial b}\right) $$
where M_i is the marginal effect of the
ith independent variable
x_i, Var(b) is the estimated covariance matrix of the coefficients, and the
jth component of the vector
dM_i/db
is the partial derivative of
M_i with respect to b_j, the coefficient of
the jth independent variable. The delta method applies
because the marginal effect, evaluated at a point, is a function of the
coefficients b_j only.
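A quick worked case may help here. Assume, purely for illustration, that the prediction function is the linear index f = x'b. Then the delta method collapses to something familiar:

$$ M_i = \frac{\partial (x'b)}{\partial x_i} = b_i, \qquad \frac{\partial M_i}{\partial b_j} = \begin{cases} 1 & j = i \\ 0 & j \neq i \end{cases}, \qquad \operatorname{Var}(M_i) = \operatorname{Var}(b_i) $$

Here Var(b_i) is the ith diagonal element of the coefficient covariance matrix. So, for a linear prediction, the standard error of each marginal effect equals the standard error of the corresponding coefficient. The xtabond example later in this FAQ shows exactly this pattern: the Std. Err. column reported by mfx matches the one in the estimation table.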
To calculate
dM_i/db_j,
mfx uses the usual approximation:
$$ \frac{\partial M_i}{\partial b_j} \approx \frac{M_i(x, b_j + h_b) - M_i(x, b_j)}{h_b} $$
where h_b is a small change in
b_j. Because this is a partial derivative, all other coefficients are held
constant (at the values estimated by the estimation command), and all
covariates x are held constant at the values
specified in the mfx command.
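The two steps, differentiating the marginal effect with respect to each coefficient and then forming the delta-method quadratic form, can be sketched as follows. This is a minimal illustration under stated assumptions, not mfx's implementation: the logistic model, the coefficient vector b, the covariance matrix V, and the step hb are all invented, and the marginal effect is written in closed form to keep the example short (mfx would compute it numerically).

```python
import math

def marginal_effect(x, b, i):
    """M_i = b_i * p * (1 - p) for a logistic prediction with intercept b[0]
    (an illustrative model, not one used in this FAQ)."""
    xb = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    p = 1.0 / (1.0 + math.exp(-xb))
    return b[i + 1] * p * (1.0 - p)

def gradient_wrt_b(x, b, i, hb=1e-6):
    """Forward-difference approximation of dM_i/db_j for every coefficient j."""
    base = marginal_effect(x, b, i)
    grad = []
    for j in range(len(b)):
        b_step = list(b)
        b_step[j] += hb          # perturb one coefficient, hold the rest fixed
        grad.append((marginal_effect(x, b_step, i) - base) / hb)
    return grad

def delta_method_variance(grad, V):
    """Quadratic form grad' V grad."""
    n = len(grad)
    return sum(grad[r] * V[r][c] * grad[c] for r in range(n) for c in range(n))

b = [-1.0, 0.5, -0.25]              # made-up coefficients (intercept first)
x = [2.0, 1.0]                      # made-up evaluation point
V = [[0.040, -0.010, 0.002],        # made-up coefficient covariance matrix
     [-0.010, 0.010, 0.001],
     [0.002, 0.001, 0.020]]

grad = gradient_wrt_b(x, b, i=0)
se = math.sqrt(delta_method_variance(grad, V))
print(grad, se)
```

Note that the gradient only perturbs the entries of b. Anything else the marginal effect depends on is silently treated as constant, which is the source of the problem discussed next.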
Problems arise in computing
dM_i/db_j
when the prediction function f depends on the coefficients in a less than
straightforward manner. Let’s look at an example:
. use http://www.stata-press.com/data/r10/hsng2, clear
(1980 Census housing data)
. ivregress 2sls rent pcturban (hsngval = faminc reg2-reg4)
Instrumental variables (2SLS) regression Number of obs = 50
Wald chi2(2) = 90.76
Prob > chi2 = 0.0000
R-squared = 0.5989
Root MSE = 22.166
------------------------------------------------------------------------------
rent | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hsngval | .0022398 .0003284 6.82 0.000 .0015961 .0028836
pcturban | .081516 .2987652 0.27 0.785 -.504053 .667085
_cons | 120.7065 15.22839 7.93 0.000 90.85942 150.5536
------------------------------------------------------------------------------
Instrumented: hsngval
Instruments: pcturban faminc reg2 reg3 reg4
. mfx, predict(pr(200,300)) diagnostics(vce)
Check prediction function does not depend on dependent variables,
covariance matrix, or stored scalars.
dfdx:
.00001126 .00040971
dfdx, after resetting dependent variables, covariance matrix, and stored scalars:
. .
Relative difference = .
warning: predict() expression pr(200,300) unsuitable for standard-error calculation;
option nose imposed
Marginal effects after ivregress
y = Pr(200<rent<300) (predict, pr(200,300))
= .9399585
-------------------------------------------------------------------------------
variable | dy/dx X
---------------------------------+---------------------------------------------
hsngval | .0000113 48484
pcturban | .0004097 66.9491
-------------------------------------------------------------------------------
The diagnostics(vce) option shows us how
mfx came to the conclusion that standard errors are
not appropriate.
mfx checks this by setting the covariance matrix to
the identity matrix, setting all the dependent variables to zero, and
blanking out various scalars stored in the estimates. Then it recalculates
the marginal effect. If it gets the same results it got the first time, it
concludes that the prediction function did not depend on any of those
quantities. But, if the results changed, then mfx
concludes there is a problem.
In our example, the results certainly changed. What happened? The
function pr(200,300) depends on
e(rmse), which is stored as a scalar
and thus was blanked out. That’s why we got an empty answer the
second time around. And it really is a problem for the prediction function
to depend on e(rmse):
e(rmse) depends on the coefficients, but when
mfx calculates the derivative of
f with respect to a coefficient, it assumes
that f depends on the coefficients only through the
coefficient matrix e(b).
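The check can be mimicked outside Stata. The sketch below assumes that pr(200,300) is Pr(200 < xb + u < 300) with u normal and standard deviation e(rmse); the coefficients, Root MSE, and evaluation point are copied from the output above. The step size and the relative-difference formula are illustrative choices, not necessarily the ones mfx uses.

```python
import math

def normal_cdf(z):
    """Standard normal CDF; returns nan when z is nan."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_prob(x, b, rmse, lo=200.0, hi=300.0):
    """Assumed form of pr(lo,hi): Pr(lo < xb + u < hi), u ~ N(0, rmse^2)."""
    xb = b[-1] + sum(bj * xj for bj, xj in zip(b[:-1], x))   # constant is last
    return normal_cdf((hi - xb) / rmse) - normal_cdf((lo - xb) / rmse)

def marginal_effects(x, b, rmse):
    """Forward-difference df/dx_i for each covariate, others held fixed."""
    base = interval_prob(x, b, rmse)
    effects = []
    for i, xi in enumerate(x):
        h = 1e-6 * (abs(xi) + 1.0)      # illustrative step size, not mfx's rule
        x_step = list(x)
        x_step[i] = xi + h
        effects.append((interval_prob(x_step, b, rmse) - base) / h)
    return effects

def relative_difference(first, second):
    """max |a - c| / (|c| + 1) over entries; nan if any entry is missing."""
    if any(math.isnan(v) for v in first + second):
        return float("nan")
    return max(abs(a - c) / (abs(c) + 1.0) for a, c in zip(first, second))

b = [0.0022398, 0.081516, 120.7065]   # hsngval, pcturban, _cons
x = [48484.0, 66.9491]                # X column of the mfx table
rmse = 22.166                         # Root MSE from the ivregress header

first = marginal_effects(x, b, rmse)
second = marginal_effects(x, b, float("nan"))   # stored scalar blanked out
print(first)                                    # close to the dfdx line above
print(second)                                   # [nan, nan]
print(relative_difference(first, second))       # nan
```

With the stored scalar in place, the sketch gives values close to the dfdx line in the diagnostics. With it blanked out, every marginal effect is missing, just as in the second dfdx line.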
What if the two matrices were the same but contained empty values, thus
making the relative difference nonzero? It is probably a good idea to figure
out why those marginal effects were coming up empty. Often it is because you
are trying to evaluate the marginal effect at a point where the values of
the prediction function are not very reasonable. So I would use the
at() option (as well as
nose to save some time) and calculate the
marginal effects at points near the one you originally tried.
Sometimes a small change in the point will make a big difference. As a last
resort, you can use the varlist option on
mfx so that the marginal effects that were empty are
not calculated; the remaining ones will then pass the test.
What if the difference between the two was very, very small, say
$10^{-10}$? This is not small enough to pass the test, but
surely this difference is minor. That may well be true. I would try the
same approach as I did in the previous paragraph, using the
at() option to change the point, and see if
that makes a difference. Here is an example like that:
. use http://www.stata-press.com/data/r10/abdata, clear
. set matsize 800
. xtabond n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant
Arellano-Bond dynamic panel-data estimation Number of obs = 611
Group variable: id Number of groups = 140
Time variable: year
Obs per group: min = 4
avg = 4.364286
max = 6
Number of instruments = 40 Wald chi2(15) = 1627.13
Prob > chi2 = 0.0000
One-step results
------------------------------------------------------------------------------
n | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
n |
L1. | .7080866 .1455545 4.86 0.000 .4228051 .9933681
L2. | -.0886343 .0448479 -1.98 0.048 -.1765346 -.000734
w |
--. | -.605526 .0661129 -9.16 0.000 -.735105 -.4759471
L1. | .4096717 .1081258 3.79 0.000 .1977491 .6215943
k |
--. | .3556407 .0373536 9.52 0.000 .2824289 .4288525
L1. | -.0599314 .0565918 -1.06 0.290 -.1708493 .0509865
L2. | -.0211709 .0417927 -0.51 0.612 -.1030831 .0607412
ys |
--. | .6264699 .1348009 4.65 0.000 .3622651 .8906748
L1. | -.7231751 .1844696 -3.92 0.000 -1.084729 -.3616214
L2. | .1179079 .1440154 0.82 0.413 -.1643572 .400173
yr1980 | .0113066 .0140625 0.80 0.421 -.0162554 .0388686
yr1981 | -.0212183 .0206559 -1.03 0.304 -.0617031 .0192665
yr1982 | -.034952 .022122 -1.58 0.114 -.0783103 .0084063
yr1983 | -.0287094 .0251536 -1.14 0.254 -.0780096 .0205909
yr1984 | -.014862 .0284594 -0.52 0.602 -.0706414 .0409174
------------------------------------------------------------------------------
Instruments for differenced equation
GMM-type: L(2/.).n
Standard: D.w LD.w D.k LD.k L2D.k D.ys LD.ys L2D.ys D.yr1980 D.yr1981 D.yr1982
D.yr1983 D.yr1984
. mfx, at(mean L.n=-0.06) diag(vce)
Check prediction function does not depend on dependent variables,
covariance matrix, or stored scalars.
dfdx:
.70808656 -.08863433 -.60552603 .40967169 .35564067 -.0599314 -.02117091
.62646995 -.7231751 .11790789 .01130656 -.02121832 -.03495199 -.02870935
-.01486203
dfdx, after resetting dependent variables, covariance matrix, and stored scalars:
.70808656 -.08863433 -.60552603 .40967169 .35564067 -.0599314 -.02117091
.62646995 -.7231751 .11790789 .01130656 -.02121832 -.03495199 -.02870935
-.01486203
Relative difference = 0
Marginal effects after xtabond
y = Linear prediction (predict)
= -.84471245
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
L.n | .7080866 .14555 4.86 0.000 .422805 .993368 -.06
L2.n | -.0886343 .04485 -1.98 0.048 -.176535 -.000734 1.09584
w | -.605526 .06611 -9.16 0.000 -.735105 -.475947 3.14957
L.w | .4096717 .10813 3.79 0.000 .197749 .621594 3.12676
k | .3556407 .03735 9.52 0.000 .282429 .428852 -.502119
L.k | -.0599314 .05659 -1.06 0.290 -.170849 .050987 -.429181
L2.k | -.0211709 .04179 -0.51 0.612 -.103083 .060741 -.391757
ys | .6264699 .1348 4.65 0.000 .362265 .890675 4.59385
L.ys | -.7231751 .18447 -3.92 0.000 -1.08473 -.361621 4.62901
L2.ys | .1179079 .14402 0.82 0.413 -.164357 .400173 4.66607
yr1980*| .0113066 .01406 0.80 0.421 -.016255 .038869 .225859
yr1981*| -.0212183 .02066 -1.03 0.304 -.061703 .019266 .229133
yr1982*| -.034952 .02212 -1.58 0.114 -.07831 .008406 .229133
yr1983*| -.0287094 .02515 -1.14 0.254 -.07801 .020591 .12766
yr1984*| -.014862 .02846 -0.52 0.602 -.070641 .040917 .057283
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
If you want to force mfx to compute the standard
error of the marginal effect, despite failing the above test, you can do so
by using the force option. If you
can’t find a better point, but the difference was very small for each
point you tried, and you convinced yourself by examining the formula for the
prediction function that it shouldn't depend on anything but the covariate
values and the coefficient matrix e(b), then
you may be confident enough to use force.
But remember, if diag(vce) shows a large relative difference (say,
bigger than $10^{-2}$), the standard errors given by
using force will probably be wrong because
mfx cannot take into account dependency on
coefficients that is not through e(b).
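To see what goes missing, write σ for any stored quantity that itself depends on the coefficients, such as e(rmse). The symbol σ is notation introduced here for illustration only. The derivative the delta method needs is then, by the chain rule,

$$ \frac{d M_i}{d b_j} = \left.\frac{\partial M_i}{\partial b_j}\right|_{\sigma} + \frac{\partial M_i}{\partial \sigma}\,\frac{\partial \sigma}{\partial b_j} $$

The numerical derivative perturbs only the entries of e(b), so it recovers the first term and leaves out the second. Standard errors obtained with force are trustworthy only when that second term is zero or negligible.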