Note: This FAQ is for Stata 10 and older versions of Stata. In Stata 11,
the margins command replaced mfx.
When I run mfx, I am getting the error message “predict() option
unsuitable for marginal effects”. What does that mean?
Title: predict() option unsuitable for marginal effects
Author: May Boggess, StataCorp
Date: April 2004; updated February 2005
Not every predict() option for every
estimation command is suitable for calculating marginal effects with the
command mfx, so mfx checks
that the predict() option specified is
suitable.
A marginal effect is the partial derivative of the prediction function f
with respect to each covariate x. The command mfx
calculates each of these derivatives numerically. This means that it uses
the following approximation for each x_i:
\[
\frac{\partial f}{\partial x_i} \approx \frac{f(x_i + h) - f(x_i)}{h}
\]
for an appropriately small change h in x_i,
holding all the other covariates and
coefficients constant. Then, mfx evaluates this
derivative at the mean of each of the covariates or, if you have used the
at() option, at the values specified there.
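The approximation can be illustrated with a short Python sketch. It is not mfx's code: the logistic prediction function, the coefficient values, the evaluation point, and the fixed step size h are all assumptions chosen for illustration.

```python
import math

# Hypothetical coefficients and evaluation point (illustration only).
b = {"x1": 0.8, "x2": -0.5, "_cons": 0.2}
at = {"x1": 1.5, "x2": 2.0}

def predict(x, b):
    """Assumed prediction function: a logistic probability of xb."""
    xb = b["_cons"] + sum(b[k] * x[k] for k in x)
    return 1.0 / (1.0 + math.exp(-xb))

def marginal_effect(f, x, name, h=1e-6):
    """Forward difference in one covariate, all others held fixed."""
    x_plus = dict(x)
    x_plus[name] += h
    return (f(x_plus) - f(x)) / h

effects = {k: marginal_effect(lambda x: predict(x, b), at, k) for k in at}

# For this particular f the derivative is known in closed form, p(1-p)*b_k,
# which gives something to compare the numerical result against.
p = predict(at, b)
exact = {k: p * (1.0 - p) * b[k] for k in at}
```

Comparing effects with exact shows how close the forward difference comes for a smooth prediction function. The step size is fixed here only to keep the sketch short.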
This formula for the derivative is appropriate if the prediction function is
a function only of the values of the covariates and their coefficients. If
the prediction also depended on something else, such as the response or the
other observations in the data, the result would change with the observation
used for the calculation.
What is an example of a predict() option that
depends on something else? Well, let’s look at a prediction function
that depends on the value of the response: residuals.
. sysuse auto, clear
(1978 Automobile Data)
. areg mpg weight gear, absorb(rep78)
                                                       Number of obs =      69
                                                       F(  2,    62) =   41.64
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.6734
                                                       Adj R-squared =  0.6418
                                                       Root MSE      =  3.5109
------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    p>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0051031   .0009206    -5.54   0.000    -.0069433    -.003263
  gear_ratio |    .901478   1.565552     0.58   0.567    -2.228015    4.030971
       _cons |   34.05889   7.056383     4.83   0.000     19.95338     48.1644
-------------+----------------------------------------------------------------
       rep78 |          F(4, 62) =      1.117   0.356           (5 categories)
. mfx, predict(residuals)
predict() expression residuals unsuitable for marginal-effect calculation
r(119);
To see how mfx came to that conclusion, we use the
diagnostics(beta) option:
. mfx, predict(residuals) diagnostics(beta)
Predict into observation 1 = 1.918876
Predict into last observation = -5.640898
Predict into all observations: mean = -1.207e-17
Predict into all observations: sd = 5.131774
predict() expression residuals unsuitable for marginal-effect calculation
r(119);
To see if the prediction depends on something it should not,
mfx uses the predict command in three ways. It predicts into the first
observation, after replacing the covariate values in that observation with
the required values. It does the same for the last observation and then
checks that the two predictions are the same. It also predicts into all the
observations, replacing the covariate values with the required values, and
checks that the standard deviation of these predicted values is essentially
zero. If the prediction passes these tests, mfx concludes that the marginal
effect will be calculated correctly. Here the predictions into the first and
last observations differ, and the standard deviation is far from zero, so
residuals fails.
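The checks just described can be sketched in Python. This is only an illustration of the procedure as the text describes it, not mfx's source: the data, the coefficients, the tolerance, and the function names (overwrite, is_suitable) are all hypothetical.

```python
import statistics

# Hypothetical data and coefficients (illustration only).
data = [
    {"y": 22.0, "x": 2.9},
    {"y": 17.0, "x": 3.4},
    {"y": 30.0, "x": 2.1},
    {"y": 19.0, "x": 3.3},
]
b = {"x": -6.0, "_cons": 40.0}
at = {"x": 2.925}  # evaluation point, here the mean of x

def predict_xb(rows):
    """Linear prediction: depends only on covariates and coefficients."""
    return [b["_cons"] + b["x"] * r["x"] for r in rows]

def predict_residuals(rows):
    """Residuals: also depend on the response y."""
    return [r["y"] - xb for r, xb in zip(rows, predict_xb(rows))]

def overwrite(rows, which, at):
    """Copy rows, replacing covariates with `at` in the chosen positions."""
    return [dict(r, **at) if i in which else dict(r) for i, r in enumerate(rows)]

def is_suitable(predict, rows, at, tol=1e-8):
    """Apply the three checks; tol is an assumed meaning of 'essentially zero'."""
    n = len(rows)
    first = predict(overwrite(rows, {0}, at))[0]
    last = predict(overwrite(rows, {n - 1}, at))[-1]
    everywhere = predict(overwrite(rows, set(range(n)), at))
    sd = statistics.stdev(everywhere)
    return abs(first - last) <= tol and sd <= tol

xb_ok = is_suitable(predict_xb, data, at)
resid_ok = is_suitable(predict_residuals, data, at)
```

In this sketch the linear prediction passes, because every overwritten row yields the same value, while the residuals fail, because the response still varies from row to row.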
For another example of an unsuitable
predict() option, let’s look at one
that depends on other observations used in the estimation command:
. webuse lowbirth, clear
(Applied Logistic Regression, Hosmer & Lemeshow)
. clogit low lwt ptd, group(pairid)
Iteration 0: log likelihood = -36.962156
Iteration 1: log likelihood = -34.637228
Iteration 2: log likelihood = -34.569847
Iteration 3: log likelihood = -34.569638
Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                  LR chi2(2)      =       8.49
                                                  Prob > chi2     =     0.0143
Log likelihood = -34.569638                       Pseudo R2       =     0.1094
------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
         ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
------------------------------------------------------------------------------
. mfx, predict(pc1) diag(beta)
Predict into observation 1 = .22979578
Predict into last observation = .34900767
Predict into all observations: mean = .5
Predict into all observations: sd = 0
predict() expression pc1 unsuitable for marginal-effect calculation
r(119);
The prediction statistic pc1,
following clogit, is the probability of a positive
outcome, conditional on one positive outcome in the group. This means that
the prediction depends on the group. We can see this more clearly if we
calculate this probability by hand:
. webuse lowbirth
(Applied Logistic Regression, Hosmer & Lemeshow)
. clogit low lwt ptd, group(pairid)
Iteration 0: log likelihood = -36.962156
Iteration 1: log likelihood = -34.637228
Iteration 2: log likelihood = -34.569847
Iteration 3: log likelihood = -34.569638
Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                  LR chi2(2)      =       8.49
                                                  Prob > chi2     =     0.0143
Log likelihood = -34.569638                       Pseudo R2       =     0.1094
------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
         ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
------------------------------------------------------------------------------
. predict xb, xb
. gen top=exp(xb)
. by pairid, sort: egen bot=total(exp(xb))
. gen mypc1=top/bot
. predict pc1, pc1
. summarize pc1 mypc1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         pc1 |       112          .5    .1882528   .0812115   .9187885
       mypc1 |       112          .5    .1882527   .0812115   .9187885
The last two lines confirm that I came up with the same predicted value as
Stata. This shows that the predicted value depends on the group: the
variable bot (the denominator in the
prediction) is a total taken within each group.
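The same hand calculation can be written as a short Python sketch, which makes the dependence on the group easy to see. The coefficients are those reported by clogit above; the four observations are hypothetical and are not taken from the lowbirth data, and the function name pc1_by_hand is introduced here for illustration.

```python
import math
from collections import defaultdict

# Coefficients as reported by clogit above.
b = {"lwt": -0.0084936, "ptd": 1.270864}

# Hypothetical observations for two matched pairs (not the lowbirth data).
rows = [
    {"pairid": 1, "lwt": 101.0, "ptd": 1},
    {"pairid": 1, "lwt": 115.0, "ptd": 0},
    {"pairid": 2, "lwt": 130.0, "ptd": 0},
    {"pairid": 2, "lwt": 98.0, "ptd": 0},
]

def pc1_by_hand(rows, b):
    # top = exp(xb) for each observation
    top = [math.exp(sum(b[k] * r[k] for k in b)) for r in rows]
    # bot = sum of exp(xb) over the observation's own group
    bot = defaultdict(float)
    for r, t in zip(rows, top):
        bot[r["pairid"]] += t
    return [t / bot[r["pairid"]] for r, t in zip(rows, top)]

pc1 = pc1_by_hand(rows, b)
```

Within each pair the two values sum to 1, so with groups of two the overall mean is .5, as in the summarize output. Changing a covariate in one observation changes bot, and therefore the prediction, for the other observation in its pair.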
Now, a persistent user may say, "I will pick a group and get the marginal
effect using pc1 for just that one group."
Let’s give it a try:
. webuse lowbirth, clear
(Applied Logistic Regression, Hosmer & Lemeshow)
. clogit low lwt ptd, group(pairid)
Iteration 0: log likelihood = -36.962156
Iteration 1: log likelihood = -34.637228
Iteration 2: log likelihood = -34.569847
Iteration 3: log likelihood = -34.569638
Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                  LR chi2(2)      =       8.49
                                                  Prob > chi2     =     0.0143
Log likelihood = -34.569638                       Pseudo R2       =     0.1094
------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
         ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
------------------------------------------------------------------------------
. keep if pairid==1
(110 observations deleted)
. mfx, predict(pc1) diag(beta)
Predict into observation 1 = .31435782
Predict into last observation = .68564218
Predict into all observations: mean = .5
Predict into all observations: sd = 0
predict() expression pc1 unsuitable for marginal-effect calculation
r(119);
This is still no good. Why is that? Looking carefully at the formula for
pc1 again, we notice that it doesn’t
just depend on the number of observations in the group. It depends on the
covariate values of every observation in the group. So, predicting into the
first observation in the group gives a different answer from predicting into
the last observation in the group, because each case overwrites a different
observation's values with the mean values of the covariates.
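For a group of two observations, the formula used in the hand calculation makes this explicit. The notation is introduced here for illustration: \bar{x} is the vector of covariate means, \beta the coefficient vector, and x_1, x_2 the original covariate values of the two observations. Overwriting only the first observation and predicting into it gives the left-hand expression; overwriting only the last observation and predicting into it gives the right-hand one:

```latex
\[
\widehat{pc1}_{\text{first}}
  = \frac{e^{\bar{x}\beta}}{e^{\bar{x}\beta} + e^{x_2\beta}},
\qquad
\widehat{pc1}_{\text{last}}
  = \frac{e^{\bar{x}\beta}}{e^{x_1\beta} + e^{\bar{x}\beta}}
\]
```

The denominators differ unless x_1\beta = x_2\beta, so the two predictions generally disagree.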
When we set all observations in the group equal to the mean values of the
covariates, we predicted the same value, 0.5, everywhere. Why can’t we use
that? Look again at the formula for pc1. With identical covariate values
in every observation, exp(xb)
cancels out of the top and bottom, leaving 1/n, where n is the number of
observations in the group; in our example, that is 1/2. This is a constant
function, so all the derivatives will be zero. So, no matter how you work
it, it’s hopeless to get the marginal effects of
pc1.
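The cancellation can be written out from the formula used in the hand calculation. The symbols are introduced here for illustration: G is the group containing observation i, n is the number of observations in G, and \bar{x} is the common covariate vector placed in every observation.

```latex
\[
pc1_i = \frac{e^{x_i\beta}}{\sum_{j \in G} e^{x_j\beta}}
\quad\xrightarrow{\;x_j = \bar{x}\ \text{for all } j \in G\;}\quad
\frac{e^{\bar{x}\beta}}{n\,e^{\bar{x}\beta}} = \frac{1}{n},
\qquad
\frac{\partial}{\partial \bar{x}_k}\left(\frac{1}{n}\right) = 0
\]
```

The result no longer involves any covariate, which is why every derivative is zero.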
If you want to force mfx to compute a marginal
effect, despite failing the above test, you can do so by using the
force option. But remember that mfx is
operating under the assumption that it does not matter which observation it
predicts into, and since it has to predict somewhere, it is predicting into
observation 1 of the e(sample).
It is possible to obtain a marginal effect after
clogit by using the
predict() option
pu0, as the next example shows:
. webuse lowbirth, clear
(Applied Logistic Regression, Hosmer & Lemeshow)
. clogit low lwt ptd, group(pairid)
Iteration 0: log likelihood = -34.641865
Iteration 1: log likelihood = -34.569694
Iteration 2: log likelihood = -34.569638
Iteration 3: log likelihood = -34.569638
Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                  LR chi2(2)      =       8.49
                                                  Prob > chi2     =     0.0143
Log likelihood = -34.569638                       Pseudo R2       =     0.1094
------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0084936   .0067344    -1.26   0.207    -.0216929    .0047056
         ptd |   1.270864   .5701046     2.23   0.026     .1534801    2.388249
------------------------------------------------------------------------------
. mfx, predict(pu0)
Marginal effects after clogit
      y  = Pr(low|fixed effect is 0) (predict, pu0)
         =  .31078384
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     lwt |  -.0018193      .00086   -2.12   0.034   -.0035  -.000139    127.17
     ptd*|    .294058      .14982    1.96   0.050  .000413   .587703   .223214
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
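The numbers in this table can be checked by hand. The Python sketch below assumes that pu0 is the logistic function of the linear prediction xb, which is consistent with the label "Pr(low|fixed effect is 0)" and with the values reported above. It uses the coefficients and the covariate means from the X column as printed, so it can only match to the precision of those rounded means.

```python
import math

# Coefficients and covariate means as printed in the output above
# (the means in the X column are rounded, so results match only approximately).
b = {"lwt": -0.0084936, "ptd": 1.270864}
xbar = {"lwt": 127.17, "ptd": 0.223214}

def pu0(x):
    """Assumed form of Pr(low | fixed effect is 0): logistic function of xb."""
    xb = sum(b[k] * x[k] for k in b)
    return 1.0 / (1.0 + math.exp(-xb))

# Prediction at the means.
p = pu0(xbar)

# Continuous covariate: derivative of the logistic function at the means.
dydx_lwt = p * (1.0 - p) * b["lwt"]

# Dummy covariate: discrete change from 0 to 1, other covariates at their means.
dydx_ptd = pu0(dict(xbar, ptd=1)) - pu0(dict(xbar, ptd=0))
```

Because this prediction depends only on the covariates and the coefficients, it does not matter which observation it is computed in, and the three values agree closely with the prediction and the dy/dx column reported by mfx.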