Home  /  Resources & support  /  FAQs  /  Obtaining marginal effects after estimations with interactions
Note: This FAQ is for Stata 10 and older versions of Stata.

In Stata 11, the margins command replaced mfx.

I am using a model with interactions. How can I obtain marginal effects and their standard errors?

Title   Obtaining marginal effects and their standard errors after estimations with interactions
Author May Boggess, StataCorp

The marginal effect of an independent variable is the derivative (that is, the slope) of a given function of the covariates and coefficients of the preceding estimation. The derivative is evaluated at a point that is usually, and by default, the means of the covariates.

The mfx command assumes that the variables in the estimation are independent. Let’s start with an example. Say that we have the quadratic model

  mpg = b0 + b1*weight + b2*weight^2

which we fit with regress. The marginal effect of the linear predictor is the derivative of this function with respect to weight, evaluated at the average weight. Calling the mean weight meanwei, we have

  dydx = b1 + 2*b2*meanwei

If we ran mfx after regress with this model, Stata would not know that the second covariate was a function of the first, so it would calculate the derivative to be just b1.

 . sysuse auto, clear
 (1978 Automobile Data)
 
 . generate wei2=weight*weight
 
 . regress mpg weight wei2
 
       Source |       SS       df       MS              Number of obs =      74
 -------------+------------------------------           F(  2,    71) =   72.80
        Model |  1642.52197     2  821.260986           Prob > F      =  0.0000
     Residual |  800.937487    71  11.2808097           R-squared     =  0.6722
 -------------+------------------------------           Adj R-squared =  0.6630
        Total |  2443.45946    73  33.4720474           Root MSE      =  3.3587
 
 ------------------------------------------------------------------------------
          mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
       weight |  -.0141581   .0038835    -3.65   0.001    -.0219016   -.0064145
         wei2 |   1.32e-06   6.26e-07     2.12   0.038     7.67e-08    2.57e-06
        _cons |   51.18308   5.767884     8.87   0.000     39.68225    62.68392
 ------------------------------------------------------------------------------
 
 . mfx
 
 Marginal effects after regress
       y  = Fitted values (predict)
          =  21.297297
 ------------------------------------------------------------------------------
 variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
 ---------+--------------------------------------------------------------------
   weight |  -.0141581      .00388   -3.65   0.000   -.02177 -.006546   3019.46
     wei2 |   1.32e-06      .00000    2.12   0.034   9.8e-08  2.6e-06   9.7e+06
 ------------------------------------------------------------------------------

To obtain the marginal effects of such a function of the covariates we can use nlcom. This will not be as easy as using mfx because nlcom can't do the differentiation for us, as mfx does. The technique we are about to use relies on our knowing the derivative of the function beforehand, as we did in our example above.

The reason we want to use nlcom is that it calculates the standard error of the marginal effect by the delta method. We can also use predictnl in the same way since it is also designed to use the delta method to obtain standard errors.

Here are some examples. Let’s start with out first example from above:

 . sysuse auto, clear
 (1978 Automobile Data)
 
 . generate wei2=weight*weight
 
 . regress mpg weight wei2
 
       Source |       SS       df       MS              Number of obs =      74
 -------------+------------------------------           F(  2,    71) =   72.80
        Model |  1642.52197     2  821.260986           Prob > F      =  0.0000
     Residual |  800.937487    71  11.2808097           R-squared     =  0.6722
 -------------+------------------------------           Adj R-squared =  0.6630
        Total |  2443.45946    73  33.4720474           Root MSE      =  3.3587
 
 ------------------------------------------------------------------------------
          mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
       weight |  -.0141581   .0038835    -3.65   0.001    -.0219016   -.0064145
         wei2 |   1.32e-06   6.26e-07     2.12   0.038     7.67e-08    2.57e-06
        _cons |   51.18308   5.767884     8.87   0.000     39.68225    62.68392
 ------------------------------------------------------------------------------
 
 . quietly summarize weight if e(sample)
 
 . local meanwei = r(mean)
 
 . display "   y = " _b[weight]* `meanwei'  +  _b[wei2]*`meanwei'^2  + _b[_cons] 
  y = 20.50813

 . display "dydx = " _b[weight]+ 2*_b[wei2]* `meanwei'  
  dydx = -.00616011

 . nlcom _b[weight]+ 2*_b[wei2]* `meanwei'   

    _nl_1:  _b[weight]+ 2*_b[wei2]* 3019.45945945946

 ------------------------------------------------------------------------------
      mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
    _nl_1 |  -.0061601   .0005108   -12.06   0.000    -.0071787   -.0051415
 ------------------------------------------------------------------------------

 . predictnl dydx =  _b[weight]+ 2*_b[wei2]* `meanwei'   in 1, se(se)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydx se in 1

  +----------------------+
  |      dydx         se |
  |----------------------|
. | -.0061601   .0005108 |
  +----------------------+

Here is an example using probit with two covariates and their interaction. There are three derivatives we can obtain with this model. We can obtain the two first derivatives, the marginal effect of each variable, and the second derivative (the probability of success differentiated with respect to both variables), which is called the interaction effect.

 . sysuse auto, clear
 (1978 Automobile Data)

 . replace weight=weight/1000
 weight was int now float
 (74 real changes made)

 . replace length=length/10
 length was int now float
 (74 real changes made)

 . generate wl=weight*length

 . probit foreign weight length wl, nolog

 Probit estimates                                  Number of obs   =         74
                                                   LR chi2(3)      =      33.47
                                                   Prob > chi2     =     0.0000
 Log likelihood =  -28.29759                       Pseudo R2       =     0.3716

 ------------------------------------------------------------------------------
      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
       weight |   2.662482   4.160075     0.64   0.522    -5.491116    10.81608
       length |   .6118874   .6413161     0.95   0.340     -.645069    1.868844
           wl |  -.2324718   .2263927    -1.03   0.304    -.6761934    .2112498
        _cons |  -7.115107   10.56744    -0.67   0.501    -27.82691    13.59669
 ------------------------------------------------------------------------------
 Note: 1 failure and 0 successes completely determined.

 . quietly summarize weight if e(sample)

 . local meanwei = r(mean)

 . quietly summarize length if e(sample)

 . local meanlen = r(mean)

 . local xb _b[weight]*`meanwei'  + /*
 >         */  _b[len]*`meanlen' + _b[wl]*`meanwei'*`meanlen' + _b[_cons]

 . predictnl dydw =  normalden(`xb')*(_b[weight]+ _b[wl]*`meanlen') in 1, se(sew)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydw sew in 1

   +----------------------+
   |      dydw        sew |
   |----------------------|
1. | -.5068093   .2714958 |
   +----------------------+

 . predictnl dydl = normalden(`xb')*(_b[len]+ _b[wl]*`meanwei') in 1, se(sel)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydl sel in 1

   +----------------------+
   |      dydl        sel |
   |----------------------|
1. | -.0267456   .0940725 |
   +----------------------+

 . predictnl dydlw =normalden(`xb')*(-(`xb'))*(_b[weight]+ _b[wl]*`meanlen') /*
 >         */ *(_b[len]+ _b[wl]*`meanwei') /*
 >         */ +  normalden(`xb')*( _b[wl]) in 1, se(selw)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydlw selw in 1

   +----------------------+
   |     dydlw       selw |
   |----------------------|
1. | -.0339839   .1040486 |
   +----------------------+

How do we handle a dummy variable? The marginal effect for a dummy variable is not obtained by differentiation but as a difference of the predicted value at 1 and the predicted value at 0. Here is an example of a logit model with an interaction, where one variable is a dummy.

 . sysuse auto, clear
 (1978 Automobile Data)

 . set seed 12345

 . generate dum=uniform()>0.5

 . table dum

 ----------------------
       dum |      Freq.
 ----------+-----------
         0 |         37
         1 |         37
 ----------------------

 . generate td=turn*dum

 . probit foreign turn dum td, nolog

 Probit estimates                                  Number of obs   =         74
                                                   LR chi2(3)      =      40.64
                                                   Prob > chi2     =     0.0000
 Log likelihood = -24.712342                       Pseudo R2       =     0.4512

 ------------------------------------------------------------------------------
      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
 -------------+----------------------------------------------------------------
         turn |  -.4162473   .1256398    -3.31   0.001    -.6624968   -.1699978
          dum |  -6.545885    5.52614    -1.18   0.236    -17.37692    4.285149
           td |   .1644299   .1498474     1.10   0.273    -.1292656    .4581253
        _cons |   15.41459   4.641172     3.32   0.001     6.318063    24.51112
 ------------------------------------------------------------------------------

 . quietly summarize turn if e(sample)

 . local meantur= r(mean)

 . quietly summarize dum if e(sample)

 . local meandum = r(mean)

 . local xb _b[turn]*`meantur'  + /*
 >         */  _b[dum]*`meandum' + _b[td]*`meantur'*`meandum' + _b[_cons]

 . predictnl dydt =  normalden(`xb')*(_b[turn] + _b[td]*`meandum') in 1, se(set)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydt set in 1

   +----------------------+
   |      dydt        set |
   |----------------------|
1. | -.0725871   .0163552 |
   +----------------------+

 . local xb1 _b[turn]*`meantur'  + /*
 >         */  _b[dum]*1 + _b[td]*`meantur'*1 + _b[_cons]

 . local xb0 _b[turn]*`meantur'  + /*
 >         */  _b[dum]*0+ _b[td]*`meantur'*0 + _b[_cons]

 . predictnl dydd = normal(`xb1')-normal(`xb0') in 1, se(sed)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dydd sed in 1

   +----------------------+
   |      dydd        sed |
   |----------------------|
1. | -.0057505   .1292351 |
   +----------------------+

 . predictnl dyddt =normalden(`xb1')*(_b[turn] + _b[td]) /*
 >         */     - normalden(`xb0')*(_b[turn]) in 1, se(sedt)
 (73 missing values generated)
 Warning: prediction constant over observations; perhaps you meant to run nlcom.

 . list dyddt sedt in 1

   +---------------------+
   |    dyddt       sedt |
   |---------------------|
1. | .0378494   .0348098 |
   +---------------------+