Statalist The Stata Listserver


Re: st: -mfx- after poisson

From   Vince Wiggins, StataCorp
Subject   Re: st: -mfx- after poisson
Date   Mon, 01 May 2006 11:45:23 -0500

Scott Cunningham is using -mfx- to compute the marginal
effects from a poisson regression with many (over 150) indicator variables.
He notes that -mfx- is taking a long, long time to compute the marginal
effects.

Richard Williams suggested that Scott consult
an FAQ on -mfx- speed.

That FAQ makes several recommendations, in particular using the -varlist()-
option to restrict the calculation of marginal effects to the variables of
interest.

The primary reason why Scott's problem takes so long is the number of
indicator variables.  Here's why.  

-mfx- takes a brute-force approach to computing marginal effects and computes
numerical derivatives for whatever is requested, rather than hand-coding
analytic derivatives for only a few functions.  This gives it incredible
flexibility.  For example, it lets you compute the marginal effect of a
bivariate probit model w.r.t. the joint probability of two successes, the
marginal probability of a success in either outcome, or any of the other 10
statistics that are predicted for -biprobit- models.  -mfx- can even compute
marginal effects for user-written estimators, so long as those estimators
supply -predict-ions for the statistics of interest.

The price for this flexibility is performance.  Usually that price is not too
high, but in the case of many indicator variables it can be.

Why are indicators different from continuous variables?  

That is a longish story, but at its core is the fact that -mfx- reduces its
computational burden substantially by using the chain rule.

Most estimators are nonlinear functions of just a few terms, and those terms
themselves are just linear combinations (which in Stata we call equations).
The Poisson model that Scott is estimating has just a single equation that
linearly combines the coefficients B and the covariates x -- xB.  The expected
number of events for poisson is just a nonlinear function of the single term
xB -- f(xB) = exp(xB).

Using the chain rule we get

                d(f(xB))     d(f(xB))     d(xB)      d(f(xB))
     mfx_xi  =  --------  =  --------  *  ------  =  --------  * B_i
                d(x_i)       d(xB)        d(x_i)     d(xB)

So, we can numerically compute d(f(xB)) / d(xB) once, and it can be easily
applied to each x_i by multiplying by the associated B_i.
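
As a rough illustration of that shortcut (a Python sketch of the idea only, not
-mfx-'s actual code), the derivative of f with respect to xB is computed
numerically once and then reused for every coefficient.  The coefficient and
covariate values, and the helper name numerical_derivative, are made up for
the example.

```python
import math

def f(xb):
    """Expected count for a Poisson model: exp(xB)."""
    return math.exp(xb)

def numerical_derivative(func, point, step=1e-6):
    """Central-difference approximation to d func / d point."""
    return (func(point + step) - func(point - step)) / (2.0 * step)

# Hypothetical coefficients and evaluation point (say, covariate means).
B = [0.5, -0.2, 0.1]
x = [1.0, 2.0, 0.3]

# The single linear combination xB.
xb = sum(b_i * x_i for b_i, x_i in zip(B, x))

# The expensive step, done once: d f(xB) / d(xB).
df_dxb = numerical_derivative(f, xb)

# The cheap step, done per covariate: multiply by the associated B_i.
mfx = [df_dxb * b_i for b_i in B]
```

With 180 coefficients instead of 3, the list comprehension gets longer but the
numerical derivative is still taken only once.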

In Scott's case this means that we do some hard work once and then apply it to
the 180 or so coefficients.

These first derivatives are themselves the marginal effects.
To compute standard errors of the marginal effects, we must compute second
derivatives and cross-derivatives among all of the coefficients.  If we have K
covariates, there are K(K+1)/2 second and cross derivatives.  Luckily, a
similar chain rule can be applied.

In Scott's case, this means that we do not have to numerically compute
180(180+1)/2 = 16,290 second and cross derivatives.  Just a few will do and
the chain rule will give us the rest.
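
To make the counting concrete, here is a hedged Python sketch.  It assumes, for
illustration only, that the quantities of interest are the second and cross
derivatives of f(xB) with respect to the coefficients; what -mfx- computes
internally may differ in detail.  Under that assumption the chain rule gives
d2f / dB_i dB_j = f''(xB) * x_i * x_j, so a single numerical second derivative
fills the whole matrix.  All values and function names are hypothetical.

```python
import math

def f(xb):
    """Expected count for a Poisson model: exp(xB)."""
    return math.exp(xb)

def second_derivative(func, point, step=1e-4):
    """Central-difference approximation to the second derivative."""
    return (func(point + step) - 2.0 * func(point) + func(point - step)) / step**2

def unique_second_derivatives(k):
    """Number of distinct second and cross derivatives among k terms."""
    return k * (k + 1) // 2

# Hypothetical coefficients and evaluation point.
B = [0.5, -0.2, 0.1]
x = [1.0, 2.0, 0.3]
xb = sum(b_i * x_i for b_i, x_i in zip(B, x))

# One numerical second derivative with respect to xB ...
d2f = second_derivative(f, xb)

# ... and the chain rule supplies every entry of the symmetric matrix.
hessian = [[d2f * x_i * x_j for x_j in x] for x_i in x]

# Brute force would need this many numerical derivatives for K = 180.
count_scott = unique_second_derivatives(180)  # 16290
```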

But what does all this have to do with indicator variables?

By default, -mfx- computes the effect of a discrete change in an indicator --
the effect of the indicator going from 0 to 1.  If the indicator is male vs.
female, we are comparing males with females.  This is usually what we want,
but not always.  We might be interested in the effect of an instantaneous
increase in the proportion of females in a group, and in that case we could
specify the -nodiscrete- option to obtain the instantaneous marginal effect of
the indicator -- the effect that we showed above and what -mfx- computes for
continuous variables.

The discrete marginal effect is different; it is

     dmfx_xi =  f(xB|x_i=1) - f(xB|x_i=0)

Because xB is now evaluated at two completely different points, we cannot use
the chain rule.  All of the derivatives, second derivatives, and
cross-derivatives for the discrete change must be computed separately.  In
Scott's case this is about 16,000 derivatives.
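
The difference between the two kinds of effect can be seen in a small Python
sketch for the Poisson case.  This is an illustration under assumed values,
not a reproduction of -mfx-; the function names and numbers are hypothetical,
and the third covariate plays the role of an indicator evaluated at its mean.

```python
import math

def expected_count(coefs, covariates):
    """Poisson expected count exp(xB) at the given covariate values."""
    xb = sum(b_i * x_i for b_i, x_i in zip(coefs, covariates))
    return math.exp(xb)

def discrete_effect(coefs, covariates, index):
    """Change in exp(xB) when covariate `index` goes from 0 to 1."""
    at_one = list(covariates)
    at_zero = list(covariates)
    at_one[index] = 1.0
    at_zero[index] = 0.0
    return expected_count(coefs, at_one) - expected_count(coefs, at_zero)

def instantaneous_effect(coefs, covariates, index):
    """Chain-rule effect f'(xB) * B_i; for Poisson, f'(xB) = exp(xB)."""
    return expected_count(coefs, covariates) * coefs[index]

# Hypothetical model: the last covariate is an indicator with mean 0.4.
B = [0.5, -0.2, 0.8]
x = [1.0, 2.0, 0.4]

discrete = discrete_effect(B, x, 2)            # two evaluation points
instantaneous = instantaneous_effect(B, x, 2)  # one evaluation point
```

Each indicator's discrete effect needs its own pair of evaluation points, which
is why nothing computed for one indicator can be reused for the next.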

If you want instantaneous derivatives, you are in luck.  Specify the
-nodiscrete- option.  For a problem similar to Scott's this takes about 15
seconds on my 3.2 GHz Pentium.  Otherwise, ask only for the marginal effects
you want, especially if those effects are for indicator variables and you want
the discrete effect.

-- Vince
