| Title | Marginal effects and the nodrop option |
| Author | May Boggess, StataCorp |
| Date | April 2004 |
A marginal effect is a derivative of a function. By using the predict option of mfx, we specify the function for which we would like marginal effects. If none is specified, the default prediction option for the preceding estimation command is used.
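As a notational sketch of the statement that a marginal effect is a derivative of the prediction function: write f for the function chosen with the predict option and x-bar for the point of evaluation (the variable means by default, as shown in the X column of the mfx output). The symbols below are introduced here for illustration only. The second line is the discrete change that mfx reports for dummy variables, as its output footnote states.

```latex
\[
\mathrm{ME}_j \;=\; \left.\frac{\partial f(x)}{\partial x_j}\right|_{x=\bar{x}}
\qquad \text{(continuous } x_j\text{)}
\]
\[
\mathrm{ME}_j \;=\; f(\bar{x}\,;\,x_j = 1) \;-\; f(\bar{x}\,;\,x_j = 0)
\qquad \text{(dummy } x_j\text{)}
\]
```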
In calculating a marginal effect, mfx must use the predict command to obtain the values of the prediction function. The predict command works by creating a new variable and putting the predicted value for each observation into that new variable. So, the prediction goes into the dataset.
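A minimal sketch of the point that predict writes its result into the dataset. The auto dataset, the regress model, and the variable name yhat are assumptions chosen for illustration; they are not part of this FAQ's examples.

```stata
* Assumption: the auto dataset that ships with Stata, not one of the FAQ's datasets
sysuse auto, clear
quietly regress mpg weight

* predict creates a new variable (here the hypothetical name yhat)
* and fills it in for every observation
predict double yhat, xb

* The prediction is now an ordinary variable in the dataset
describe yhat
list mpg weight yhat in 1/3
```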
How many observations does mfx need to predict into in order to function properly? Most of the time, the answer is one. To save computation time, mfx temporarily drops all observations except the first one in e(sample), provided it is safe to do so.
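The idea of predicting into a single observation can be sketched by hand. This is only a rough illustration of the concept under stated assumptions, not mfx's actual code: it uses the mlogit example that appears later in this FAQ, overwrites the first e(sample) observation with the variable means, and predicts into that one observation. The variable name p1 is hypothetical.

```stata
webuse sysdsn3, clear
quietly mlogit insure age male nonwhite site2 site3, nolog

* Work on a temporary copy of the data, as mfx does behind the scenes
preserve
keep if e(sample)

* Store the regressors as doubles so the means are not rounded
recast double age male nonwhite site2 site3

* Put the estimation-sample means into the first observation
foreach v of varlist age male nonwhite site2 site3 {
    quietly summarize `v'
    quietly replace `v' = r(mean) in 1
}

* Keep only that observation and predict into it
keep in 1
predict double p1, p outcome(1)
display %10.8f p1[1]

* Bring back the full dataset
restore
```

If the sketch captures what mfx does, the displayed value should be close to the .48179251 that mfx reports for this model in the example further below.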
How do you know if mfx has concluded that it is safe to work with only one observation? You can use the diagnostics(drop) option. Let's look at an example:
. webuse friedman2, clear
. keep if tin( ,1981q4)
(67 observations deleted)
. arima consump m2, ar(1) ma(1) nolog
ARIMA regression
Sample:  1959q1 to 1981q4                       Number of obs      =        92
                                                Wald chi2(3)       =   4394.80
Log likelihood = -340.5077                      Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |                 OPG
     consump |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
consump      |
          m2 |   1.122029   .0363563    30.86   0.000     1.050772    1.193286
       _cons |  -36.09872   56.56703    -0.64   0.523    -146.9681    74.77062
-------------+----------------------------------------------------------------
ARMA         |
          ar |
          L1 |   .9348486   .0411323    22.73   0.000     .8542308    1.015467
          ma |
          L1 |   .3090592   .0885883     3.49   0.000     .1354293    .4826891
-------------+----------------------------------------------------------------
      /sigma |   9.655308   .5635157    17.13   0.000     8.550837    10.75978
------------------------------------------------------------------------------
. mfx, predict(xb structural) diagnostics(drop)
Predict into observation 1 = 828.33238
Predict error after drop.
note: nodrop option enforced.
All e(sample) observations kept: N = 92
Marginal effects after arima
      y  = xb prediction, structural one-step (predict, xb structural)
         =  828.33239
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
      m2 |   1.122029      .03636   30.86   0.000   1.05077  1.19329   770.418
------------------------------------------------------------------------------
We see that we get a predict error after keeping only one observation. Let's use predict by itself and see if the same thing happens:
. webuse friedman2, clear
. keep if tin( ,1981q4)
(67 observations deleted)
. quietly arima consump m2, ar(1) ma(1) nolog
. keep if e(sample)
(52 observations deleted)
. keep in 1
(91 observations deleted)
. predict xb, xb structural
Obs. nos. out of range
r(198);
If we rerun predict with set trace on, we see that it refers to the other observations, which are no longer in memory. So, in this example, it is safest to keep all observations in memory during mfx, which is why mfx enforced the nodrop option on its own.
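A sketch of how the trace could be rerun without stopping on the error. The use of capture noisily and set tracedepth is a choice made here for convenience and is not part of the original session. The return code was 198 in the session shown above and may differ in other Stata versions.

```stata
webuse friedman2, clear
keep if tin( ,1981q4)
quietly arima consump m2, ar(1) ma(1) nolog
keep if e(sample)
keep in 1

* Limit how deeply nested programs are traced, to keep the log readable
set tracedepth 2
set trace on

* capture noisily shows the trace and the error but lets execution continue
capture noisily predict xb, xb structural

set trace off

* Return code left behind by the captured predict call
display _rc
```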
We use the nodrop option to specify that mfx keep all the observations in memory during its calculations. Let's see how it works:
. webuse sysdsn3, clear
(Health insurance data)
. mlogit insure age male nonwhite site2 site3, nolog
Multinomial logistic regression                   Number of obs   =        615
                                                  LR chi2(10)     =      42.99
                                                  Prob > chi2     =     0.0000
Log likelihood = -534.36165                       Pseudo R2       =     0.0387
------------------------------------------------------------------------------
      insure |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Prepaid      |
         age |   -.011745   .0061946    -1.90   0.058    -.0238862    .0003962
        male |   .5616934   .2027465     2.77   0.006     .1643175    .9590693
    nonwhite |   .9747768   .2363213     4.12   0.000     .5115955    1.437958
       site2 |   .1130359   .2101903     0.54   0.591    -.2989296    .5250013
       site3 |  -.5879879   .2279351    -2.58   0.010    -1.034733   -.1412433
       _cons |   .2697127   .3284422     0.82   0.412    -.3740222    .9134476
-------------+----------------------------------------------------------------
Uninsure     |
         age |  -.0077961   .0114418    -0.68   0.496    -.0302217    .0146294
        male |   .4518496   .3674867     1.23   0.219     -.268411     1.17211
    nonwhite |   .2170589   .4256361     0.51   0.610    -.6171725     1.05129
       site2 |  -1.211563   .4705127    -2.57   0.010    -2.133751   -.2893747
       site3 |  -.2078123   .3662926    -0.57   0.570    -.9257327     .510108
       _cons |  -1.286943   .5923219    -2.17   0.030    -2.447872   -.1260135
------------------------------------------------------------------------------
(Outcome insure==Indemnity is the comparison group)
. mfx, predict(p outcome(1)) diagnostics(drop)
Predict into observation 1 = .48179251
Predict into obs 1 after drop = .48179251
Keep first e(sample) observation only.
Marginal effects after mlogit
      y  = Pr(insure==1) (predict, p outcome(1))
         =  .48179251
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
    male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
   site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
   site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
. mfx, predict(p outcome(1)) diagnostics(drop) nodrop
Predict into observation 1 = .48179251
Predict into obs 1 after drop = .48179251
All e(sample) observations kept: N = 615
Marginal effects after mlogit
      y  = Pr(insure==1) (predict, p outcome(1))
         =  .48179251
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
     age |   .0028073      .00148    1.90   0.058  -.000096  .005711   44.4683
    male*|  -.1347111      .04683   -2.88   0.004  -.226494 -.042929   .250407
nonwhite*|  -.2138472      .05074   -4.21   0.000  -.313297 -.114397   .196748
   site2*|   .0096603      .05082    0.19   0.849  -.089942  .109263   .370732
   site3*|   .1333108      .05294    2.52   0.012   .029558  .237064   .313821
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
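The results are the same, as we would expect, but if you run this example, you will notice that mfx takes much longer to run with the nodrop option. Because mfx enforces nodrop on its own when dropping observations causes a predict error, as in the arima example, we would rarely want to specify this option ourselves.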
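One way to see the difference in run time is to wrap the two mfx calls in Stata's timer commands. This is a sketch; the timer numbers 1 and 2 are arbitrary choices, and the timings will depend on the machine and Stata version, so none are shown here.

```stata
webuse sysdsn3, clear
quietly mlogit insure age male nonwhite site2 site3, nolog

timer clear

* Timer 1: default behavior (mfx drops to one observation when it is safe)
timer on 1
quietly mfx, predict(p outcome(1))
timer off 1

* Timer 2: all e(sample) observations kept in memory
timer on 2
quietly mfx, predict(p outcome(1)) nodrop
timer off 2

* Report elapsed seconds for each timer
timer list
```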