Title | Marginal effects and the nodrop option
Author | May Boggess, StataCorp
Date | April 2004

A marginal effect is a derivative of a function. By using the **predict**
option of **mfx**, we specify the function for which we would like marginal
effects. If none is specified, the default prediction option for the
preceding estimation command is used.
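
As a rough sketch of what is being computed (the notation is ours, introduced only for illustration): write the prediction function as $f(\mathbf{x}, \hat{\boldsymbol{\beta}})$, where $\mathbf{x}$ holds the covariates and $\hat{\boldsymbol{\beta}}$ the estimated coefficients. By default, **mfx** evaluates the derivative at the means of the covariates, which are the values reported in the X column of its output:

$$
\frac{dy}{dx_j} \;=\; \left.\frac{\partial f(\mathbf{x}, \hat{\boldsymbol{\beta}})}{\partial x_j}\right|_{\mathbf{x}=\bar{\mathbf{x}}}
$$

For a linear prediction, $f(\mathbf{x}, \hat{\boldsymbol{\beta}}) = \mathbf{x}\hat{\boldsymbol{\beta}}$, the derivative is simply $\hat{\beta}_j$. This is why, in the arima example in this FAQ, dy/dx for m2 equals the m2 coefficient, 1.122029. For a dummy variable, **mfx** instead reports the discrete change $f(x_j = 1) - f(x_j = 0)$, as the note under the mlogit output states.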

In calculating a marginal effect, **mfx** must use the **predict**
command to obtain the values of the prediction function. The **predict**
command works by creating a new variable and putting the predicted value for
each observation into that new variable. So, the prediction goes into the
dataset.
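
To make this concrete, here is a minimal sketch of **predict** writing into the dataset. The auto dataset, the regression, and the variable name yhat are our own choices for illustration and are not part of the examples in this FAQ.

```stata
* Illustrative only: dataset, model, and the name yhat are assumptions
sysuse auto, clear
regress mpg weight

* predict adds a new variable holding one predicted value per observation
predict yhat, xb

* The new variable is now part of the data in memory
describe yhat
list mpg weight yhat in 1/5
```

Because the prediction lives in the dataset, how many observations are in memory determines how much work each call to **predict** does.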

How many observations does **mfx** need to predict into in order to
function properly? Most of the time, the answer is one. To save
computation time, **mfx** temporarily drops all observations except the
first one in **e(sample)**, provided it is safe to do so.

How do you know if **mfx** has concluded that it is safe to work with only
one observation? You can use the **diagnostics(drop)** option. Let's look
at an example:

    . webuse friedman2, clear

    . keep if tin( ,1981q4)
    (67 observations deleted)

    . arima consump m2, ar(1) ma(1) nolog

    ARIMA regression

    Sample:  1959q1 to 1981q4                       Number of obs      =        92
                                                    Wald chi2(3)       =   4394.80
    Log likelihood = -340.5077                      Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
                 |                 OPG
         consump |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    consump      |
              m2 |   1.122029   .0363563    30.86   0.000     1.050772    1.193286
           _cons |  -36.09872   56.56703    -0.64   0.523    -146.9681    74.77062
    -------------+----------------------------------------------------------------
    ARMA         |
              ar |
              L1 |   .9348486   .0411323    22.73   0.000     .8542308    1.015467
              ma |
              L1 |   .3090592   .0885883     3.49   0.000     .1354293    .4826891
    -------------+----------------------------------------------------------------
          /sigma |   9.655308   .5635157    17.13   0.000     8.550837    10.75978
    ------------------------------------------------------------------------------

    . mfx, predict(xb structural) diagnostics(drop)

    Predict into observation 1 = 828.33238
    Predict error after drop.
    note: nodrop option enforced.
    All e(sample) observations kept: N = 92

    Marginal effects after arima
          y  = xb prediction, structural one-step (predict, xb structural)
             =  828.33239
    ------------------------------------------------------------------------------
    variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
    ---------+--------------------------------------------------------------------
          m2 |   1.122029      .03636   30.86   0.000   1.05077  1.19329   770.418
    ------------------------------------------------------------------------------

We see that we get a predict error after keeping only one observation. Let's
use **predict** by itself and see if the same thing happens:

    . webuse friedman2, clear

    . keep if tin( ,1981q4)
    (67 observations deleted)

    . quietly arima consump m2, ar(1) ma(1) nolog

    . keep if e(sample)
    (52 observations deleted)

    . keep in 1
    (91 observations deleted)

    . predict xb, xb structural
    Obs. nos. out of range
    r(198);

If we rerun **predict** with **set trace on**, we see that it
refers to the other observations, which are no longer in memory. So, in this
example, all observations must stay in memory during **mfx**. The
diagnostics output reports that **mfx** handled this on its own:
"nodrop option enforced".
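
For readers who want to see the trace themselves, the following sketch repeats the steps of the previous listing with tracing switched on. The variable name xbhat and the trace depth of 2 are arbitrary choices of ours; **capture noisily** is used only so that the do-file continues past the error and tracing is switched off again.

```stata
webuse friedman2, clear
keep if tin( ,1981q4)
quietly arima consump m2, ar(1) ma(1) nolog
keep if e(sample)
keep in 1

* Limit nesting so the trace stays readable (depth chosen arbitrarily)
set tracedepth 2
set trace on

* Show the error but do not abort; xbhat is a hypothetical variable name
capture noisily predict xbhat, xb structural

set trace off
display _rc
```

The traced lines just before the error should show where **predict** reaches for observations beyond the single one left in memory.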

We use the **nodrop** option to specify that **mfx** keep all the
observations in memory during its calculations. Let's see how it works:

    . webuse sysdsn3, clear
    (Health insurance data)

    . mlogit insure age male nonwhite site2 site3, nolog

    Multinomial logistic regression                   Number of obs   =        615
                                                      LR chi2(10)     =      42.99
                                                      Prob > chi2     =     0.0000
    Log likelihood = -534.36165                       Pseudo R2       =     0.0387

    ------------------------------------------------------------------------------
          insure |      Coef.   Std. Err.      z    p>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    Prepaid      |
             age |   -.011745   .0061946    -1.90   0.058    -.0238862    .0003962
            male |   .5616934   .2027465     2.77   0.006     .1643175    .9590693
        nonwhite |   .9747768   .2363213     4.12   0.000     .5115955    1.437958
           site2 |   .1130359   .2101903     0.54   0.591    -.2989296    .5250013
           site3 |  -.5879879   .2279351    -2.58   0.010    -1.034733   -.1412433
           _cons |   .2697127   .3284422     0.82   0.412    -.3740222    .9134476
    -------------+----------------------------------------------------------------
    Uninsure     |
             age |  -.0077961   .0114418    -0.68   0.496    -.0302217    .0146294
            male |   .4518496   .3674867     1.23   0.219     -.268411     1.17211
        nonwhite |   .2170589   .4256361     0.51   0.610    -.6171725     1.05129
           site2 |  -1.211563   .4705127    -2.57   0.010    -2.133751   -.2893747
           site3 |  -.2078123   .3662926    -0.57   0.570    -.9257327     .510108
           _cons |  -1.286943   .5923219    -2.17   0.030    -2.447872   -.1260135
    ------------------------------------------------------------------------------
    (Outcome insure==Indemnity is the comparison group)

    . mfx, predict(p outcome(1)) diagnostics(drop)

    Predict into observation 1 = .48179251
    Predict into obs 1 after drop = .48179251
    Keep first e(sample) observation only.

    Marginal effects after mlogit
          y  = Pr(insure==1) (predict, p outcome(1))
             =  .48179251
    ------------------------------------------------------------------------------
    variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
    ---------+--------------------------------------------------------------------
         age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
        male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
    nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
       site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
       site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
    ------------------------------------------------------------------------------
    (*) dy/dx is for discrete change of dummy variable from 0 to 1

    . mfx, predict(p outcome(1)) diagnostics(drop) nodrop

    Predict into observation 1 = .48179251
    Predict into obs 1 after drop = .48179251
    All e(sample) observations kept: N = 615

    Marginal effects after mlogit
          y  = Pr(insure==1) (predict, p outcome(1))
             =  .48179251
    ------------------------------------------------------------------------------
    variable |      dy/dx    Std. Err.     z    p>|z|  [    95% C.I.   ]      X
    ---------+--------------------------------------------------------------------
         age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
        male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
    nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
       site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
       site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
    ------------------------------------------------------------------------------
    (*) dy/dx is for discrete change of dummy variable from 0 to 1

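The results are the same, as we would expect, but if you run this example,
you will notice that **mfx** takes much longer to run with the
**nodrop** option. Because **mfx** enforces **nodrop** on its own when
dropping is unsafe, as the arima example showed, we would rarely need to
specify this option ourselves.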
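
If you would like to measure the difference rather than eyeball it, one possible approach is Stata's **timer** command. This is only a sketch: the timer numbers 1 and 2 are arbitrary, and the timings will depend on your machine and Stata version.

```stata
webuse sysdsn3, clear
quietly mlogit insure age male nonwhite site2 site3, nolog

timer clear

* Timer 1: default behavior (mfx drops to one observation when safe)
timer on 1
quietly mfx, predict(p outcome(1))
timer off 1

* Timer 2: all e(sample) observations kept in memory
timer on 2
quietly mfx, predict(p outcome(1)) nodrop
timer off 2

* Elapsed seconds for each timer
timer list
```

Comparing the two entries in the **timer list** output gives the cost of **nodrop** for this model.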
