Note: This FAQ is for Stata 10 and older versions of Stata. In Stata 11,
the **margins** command replaced **mfx**.

Title | predict() option unsuitable for marginal effects |

Author | May Boggess, StataCorp |

Not every predict() option for every estimation command is suitable for calculating marginal effects with the command mfx, so mfx checks that the predict() option specified is suitable.

A marginal effect is the partial derivative of the prediction function f with respect to each covariate x. The command mfx calculates each of these derivatives numerically. This means that it uses the following approximation for each x_i:

$$
\frac{df}{dx_i} \approx \frac{f(x_i + h) - f(x_i)}{h}
$$

for an appropriately small change h in x_i, holding all the other covariates and the coefficients constant. mfx then evaluates this derivative at the mean of each covariate or, if you specified the at() option, at the values given there.
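To make the approximation concrete, here is a minimal sketch that computes one forward difference by hand. It assumes a logit model on the auto dataset, which is my choice for illustration and not one of the examples in this FAQ. The step size h is also just an illustrative value; mfx chooses its own h, so the hand-computed number should agree with mfx only approximately.

```stata
sysuse auto, clear
quietly logit foreign weight mpg

* Means of the covariates over the estimation sample
quietly summarize weight if e(sample)
local mw = r(mean)
quietly summarize mpg if e(sample)
local mm = r(mean)

* Illustrative step size (an assumption; not the h that mfx selects)
local h = 1e-3

* Prediction function f = Pr(foreign), evaluated at the means and at
* weight + h, with mpg and all coefficients held constant
local f0 = invlogit(_b[_cons] + _b[weight]*`mw' + _b[mpg]*`mm')
local f1 = invlogit(_b[_cons] + _b[weight]*(`mw' + `h') + _b[mpg]*`mm')

* Forward-difference approximation to df/dx for x = weight
display "df/dweight is approximately " (`f1' - `f0')/`h'

* For comparison, the marginal effects reported by mfx
mfx, predict(p)
```

Here the prediction is a function of the covariates and coefficients only, so the calculation gives the same answer no matter which observation it is carried out in.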

This formula for the derivative is appropriate if the prediction function is a function only of the values of the covariates and their coefficients. For the partial derivative with respect to the covariate x, all other covariates and coefficients are held constant for the above calculation.

What is an example of a predict() option that depends on something else? Well, let’s look at a prediction function that depends on the value of the response: residuals.

    . sysuse auto, clear
    (1978 Automobile Data)

    . areg mpg weight gear, absorb(rep78)

                                                           Number of obs =      69
                                                           F(  2,    62) =   41.64
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.6734
                                                           Adj R-squared =  0.6418
                                                           Root MSE      =  3.5109

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0051031   .0009206    -5.54   0.000    -.0069433    -.003263
      gear_ratio |    .901478   1.565552     0.58   0.567    -2.228015    4.030971
           _cons |   34.05889   7.056383     4.83   0.000     19.95338     48.1644
    -------------+----------------------------------------------------------------
           rep78 |     F(4, 62) =      1.117   0.356           (5 categories)

    . mfx, predict(residuals)
    predict() expression residuals unsuitable for marginal-effect calculation
    r(119);

To see how mfx came to that conclusion, we use the diagnostics(beta) option:

    . mfx, predict(residuals) diagnostics(beta)

    Predict into observation 1 = 1.918876
    Predict into last observation = -5.640898
    Predict into all observations: mean = -1.207e-17
    Predict into all observations: sd = 5.131774
    predict() expression residuals unsuitable for marginal-effect calculation
    r(119);

To see if the prediction depends on something it should not, mfx uses the predict command and predicts into the first observation, after replacing the covariate values in that observation with the required values. It predicts into the last observation, after replacing the covariate values in that observation with the required values. It then checks if the two predictions are the same. It also predicts into all the observations, replacing the covariate values with the required values, and checks that the standard deviation of these predicted values is essentially zero. If it passes these tests, we conclude that the marginal effect will be calculated correctly.
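The spirit of this check can be sketched by hand. The code below is only an approximation of what mfx does internally, based on the description above: it assumes the "required values" are the estimation-sample means, and the variable name r_at_means is hypothetical. I have not matched its output digit for digit against the diagnostics.

```stata
sysuse auto, clear
quietly areg mpg weight gear_ratio, absorb(rep78)
keep if e(sample)

* Overwrite each covariate with its estimation-sample mean in every observation
foreach v of varlist weight gear_ratio {
    quietly summarize `v'
    quietly replace `v' = r(mean)
}

* Predict the statistic of interest with the covariates fixed at their means
predict double r_at_means, residuals

* A suitable statistic would give identical values here and sd = 0 below
display "observation 1:    " r_at_means[1]
display "last observation: " r_at_means[_N]
summarize r_at_means
```

Because the residual still depends on each observation's value of mpg (and on its absorbed rep78 category), the predictions differ across observations even though the covariates are identical, which is the situation the test is designed to detect.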

For another example of an unsuitable predict() option, let’s look at one that depends on other observations used in the estimation command:

    . webuse lowbirth, clear
    (Applied Logistic Regression, Hosmer & Lemeshow)

    . clogit low lwt ptd, group(pairid)

    Iteration 0:   log likelihood = -36.962156
    Iteration 1:   log likelihood = -34.637228
    Iteration 2:   log likelihood = -34.569847
    Iteration 3:   log likelihood = -34.569638

    Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                      LR chi2(2)      =       8.49
                                                      Prob > chi2     =     0.0143
    Log likelihood = -34.569638                       Pseudo R2       =     0.1094

    ------------------------------------------------------------------------------
             low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
             ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
    ------------------------------------------------------------------------------

    . mfx, predict(pc1) diag(beta)

    Predict into observation 1 = .22979578
    Predict into last observation = .34900767
    Predict into all observations: mean = .5
    Predict into all observations: sd = 0
    predict() expression pc1 unsuitable for marginal-effect calculation
    r(119);

The prediction statistic pc1, following clogit, is the probability of a positive outcome, conditional on one positive outcome in the group. This means that the prediction depends on the group. We can see this more clearly if we calculate this probability by hand:

    . webuse lowbirth
    (Applied Logistic Regression, Hosmer & Lemeshow)

    . clogit low lwt ptd, group(pairid)

    Iteration 0:   log likelihood = -36.962156
    Iteration 1:   log likelihood = -34.637228
    Iteration 2:   log likelihood = -34.569847
    Iteration 3:   log likelihood = -34.569638

    Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                      LR chi2(2)      =       8.49
                                                      Prob > chi2     =     0.0143
    Log likelihood = -34.569638                       Pseudo R2       =     0.1094

    ------------------------------------------------------------------------------
             low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
             ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
    ------------------------------------------------------------------------------

    . predict xb, xb

    . gen top=exp(xb)

    . by pairid, sort: egen bot=total(exp(xb))

    . gen mypc1=top/bot

    . predict pc1, pc1

    . summarize pc1 mypc1

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
             pc1 |       112          .5    .1882528   .0812115   .9187885
           mypc1 |       112          .5    .1882527   .0812115   .9187885

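The last two lines confirm that I came up with the same predicted value as Stata; the only discrepancy is in the last displayed digit of the standard deviation. This shows that the predicted value depends on the group: the variable bot (the denominator in the prediction) is computed from every observation in the group.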

Now, a persistent user may say, I will pick a group and get the marginal effect using pc1 for just that one group. Let’s give it a try:

    . webuse lowbirth, clear
    (Applied Logistic Regression, Hosmer & Lemeshow)

    . clogit low lwt ptd, group(pairid)

    Iteration 0:   log likelihood = -36.962156
    Iteration 1:   log likelihood = -34.637228
    Iteration 2:   log likelihood = -34.569847
    Iteration 3:   log likelihood = -34.569638

    Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                      LR chi2(2)      =       8.49
                                                      Prob > chi2     =     0.0143
    Log likelihood = -34.569638                       Pseudo R2       =     0.1094

    ------------------------------------------------------------------------------
             low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             lwt |  -.0084936   .0067344    -1.26   0.207    -.0216928    .0047056
             ptd |   1.270864    .570099     2.23   0.026      .153491    2.388238
    ------------------------------------------------------------------------------

    . keep if pairid==1
    (110 observations deleted)

    . mfx, predict(pc1) diag(beta)

    Predict into observation 1 = .31435782
    Predict into last observation = .68564218
    Predict into all observations: mean = .5
    Predict into all observations: sd = 0
    predict() expression pc1 unsuitable for marginal-effect calculation
    r(119);

This is still no good. Why is that? Looking carefully at the formula for pc1 again, we notice that it doesn’t just depend on the number of observations in the group. It depends on the covariate values of every observation in the group. So predicting into the first observation in the group gives a different answer from predicting into the last observation, because in each case only that one observation’s values were overwritten with the mean values of the covariates while the other observation kept its own values.

When we put all observations in the group equal to the mean values of the covariates, we predicted the same value, 0.5. Why can’t we do that? Look again at the formula for pc1. What happens then is that exp(xb) cancels out of the top and bottom leaving 1/n, which in our example is 1/2. This is a constant function so all the derivatives will be zero. So, no matter how you work it, it’s hopeless to get the marginal effects of pc1.
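The cancellation can be written out explicitly. This is a sketch based on the by-hand calculation above (top/bot), for a group g with n observations and one positive outcome; the symbols \beta for the coefficient vector and \bar{x} for the covariate means are my notation, not Stata's.

$$
\mathrm{pc1}_i = \frac{\exp(x_i\beta)}{\sum_{j \in g} \exp(x_j\beta)}
$$

If every observation in the group is set to the same values, x_j = \bar{x} for all j, then

$$
\mathrm{pc1}_i = \frac{\exp(\bar{x}\beta)}{n\,\exp(\bar{x}\beta)} = \frac{1}{n},
$$

which no longer involves any covariate, so its derivative with respect to each covariate is zero. With the matched pairs in this dataset, n = 2 and the prediction is 0.5, as reported in the diagnostics.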

If you want to force mfx to compute a marginal effect, despite failing the above test, you can do so by using the **force** option. But remember that mfx is operating under the assumption that it does not matter which observation it predicts into, and since it has to predict somewhere, it predicts into the first observation of e(sample).
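As a minimal sketch of the syntax, reusing the clogit example from above, forcing the calculation looks like this. I am not showing output, and for the reasons already given the resulting numbers for pc1 should not be interpreted as meaningful marginal effects.

```stata
webuse lowbirth, clear
quietly clogit low lwt ptd, group(pairid)

* force overrides the suitability check; the prediction is made into
* the first observation of e(sample), so the result depends on that choice
mfx, predict(pc1) force
```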

It is possible to obtain a marginal effect after clogit by using the predict() option pu0, the probability of a positive outcome assuming that the fixed effect is zero, as the next example shows:

    . webuse lowbirth, clear
    (Applied Logistic Regression, Hosmer & Lemeshow)

    . clogit low lwt ptd, group(pairid)

    Iteration 0:   log likelihood = -34.641865
    Iteration 1:   log likelihood = -34.569694
    Iteration 2:   log likelihood = -34.569638
    Iteration 3:   log likelihood = -34.569638

    Conditional (fixed-effects) logistic regression   Number of obs   =        112
                                                      LR chi2(2)      =       8.49
                                                      Prob > chi2     =     0.0143
    Log likelihood = -34.569638                       Pseudo R2       =     0.1094

    ------------------------------------------------------------------------------
             low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             lwt |  -.0084936   .0067344    -1.26   0.207    -.0216929    .0047056
             ptd |   1.270864   .5701046     2.23   0.026     .1534801    2.388249
    ------------------------------------------------------------------------------

    . mfx, predict(pu0)

    Marginal effects after clogit
          y  = Pr(low|fixed effect is 0) (predict, pu0)
             =  .31078384
    ------------------------------------------------------------------------------
    variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
    ---------+--------------------------------------------------------------------
         lwt |  -.0018193      .00086   -2.12   0.034   -.0035 -.000139    127.17
        ptd*|     .294058      .14982    1.96   0.050   .000413  .587703   .223214
    ------------------------------------------------------------------------------
    (*) dy/dx is for discrete change of dummy variable from 0 to 1