Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: RE: Interaction and squared effects in a probit (pa) model

From   Maarten buis <>
To   stata list <>
Subject   st: RE: Interaction and squared effects in a probit (pa) model
Date   Mon, 11 Apr 2011 10:56:43 +0100 (BST)

--- Andrea and Laura wrote me privately:
> We're working on a model and, when trying to solve some econometrical
> issues, we found your name on the statalist and thought you may be
> able to help us.

The rule is that you ask questions to Statalist and not to individual
members.

> We're estimating a panel data model with 90 cross-section observations
> (country pairs) and 10 time series observations (years), using the
> -xtgee- command with the  -family(bin) link(probit) corr(ar1) robust
> force- options.
> We recently included interactive terms in our model and we're finding
> difficulties in estimating the correspondent marginal effects, as the
> command -margins- is not suitable for nonlinear estimations,
> especially when our variables of interest are combined with each other.
> To make things even more complex, we've also got variables interacted
> with themselves (quadratic terms).
> Searching for a command that would be suitable in our case, we found 
> -inteff- (, 
> but are a little confused because of the mention of the squared 
> variables.

-margins- is actually exactly right when you want the marginal effect
after a model that includes square terms. As you can see in the first
part of the example below, the marginal effect returned by -margins-
corresponds exactly with the marginal effect computed by hand.
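
To make the by-hand computation concrete, here is a sketch of the derivative being evaluated, written for the probit model used in the example below (mpg, its square, and indicators for rep78). The symbols are notation assumed here for illustration only: \beta_0, \beta_1 and \beta_2 stand for _b[_cons], _b[mpg] and _b[c.mpg#c.mpg], and \gamma_4 and \gamma_5 stand for _b[4.rep78] and _b[5.rep78].

```latex
% probit model with a quadratic term in mpg
\Pr(\text{foreign}=1) = \Phi(xb), \qquad
xb = \beta_0 + \beta_1\,\text{mpg} + \beta_2\,\text{mpg}^2
     + \gamma_4\,[\text{rep78}=4] + \gamma_5\,[\text{rep78}=5]

% marginal effect of mpg: the chain rule picks up both mpg terms
\frac{\partial \Pr(\text{foreign}=1)}{\partial\,\text{mpg}}
  = \phi(xb)\,\bigl(\beta_1 + 2\,\beta_2\,\text{mpg}\bigr)

% evaluated at mpg = 20 and rep78 = 4
xb = \beta_0 + 20\,\beta_1 + 400\,\beta_2 + \gamma_4, \qquad
\frac{\partial \Pr}{\partial\,\text{mpg}} = \phi(xb)\,\bigl(\beta_1 + 40\,\beta_2\bigr)
```

Because the squared term enters through c.mpg##c.mpg, -margins- knows that both terms move together when mpg changes, which is why its result agrees with this expression.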

The real question should be: Do you really want marginal effects? 
Marginal effects can be thought of as a linear model on top of your
previous model. In the graph below we can see that the predicted
probabilities follow a strong non-linear pattern. This begs the 
question: Do you believe that there can be a single straight line 
that can meaningfully summarize the pattern in the predicted
probabilities?

For the example below my answer would be: no, that pattern is just too
non-linear. This should come as no surprise. The quadratic term
was added because we believed there to be substantial non-linearity.
So either we believe that a straight line is a good-enough
approximation, in which case we can use marginal effects, but that
raises the question of why we added the quadratic term. Or we believe
that the non-linearity is substantial, which means that the quadratic
term may be justified, but now marginal effects lose their meaning.
If you are in the latter case, I would add a footnote to the table
of marginal effects saying that the effect is just too non-linear to
be meaningfully summarized by marginal effects, and leave that cell
empty in the table. Then I would add a graph of the predicted
probability against that variable.
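
A graph of this kind can also be sketched with -margins- followed by -marginsplot-. This is only a minimal sketch under stated assumptions: it needs Stata 12 or later (-marginsplot- is not available in Stata 11), it reuses the auto data and the model from the example below, and the grid of mpg values is an arbitrary choice made for illustration.

```stata
* minimal sketch, assuming Stata 12 or later
sysuse auto, clear
recode rep78 1/2=3
probit foreign c.mpg##c.mpg i.rep78

* predicted probabilities on a grid of mpg values, for each repair status
* (the grid 15(5)40 is an assumption, chosen to lie within the range of mpg)
margins rep78, at(mpg=(15(5)40))

* plot the predictions stored by -margins-
marginsplot, ytitle(predicted probability) xtitle(miles per gallon)
```

Since mpg and rep78 are the only covariates in this model, fixing both in -margins- gives the predicted probability at exactly those values rather than an average over other variables.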

*-------------------- begin example -------------------------
sysuse auto, clear
recode rep78 1/2=3
probit foreign c.mpg##c.mpg i.rep78

// do it with margins
margins, dydx(*) at(mpg=20 rep78=4)

// do it by hand
tempname xb
// linear predictor at mpg=20 and rep78=4, matching the at() option above
scalar `xb' = _b[_cons] + _b[mpg]*20 + _b[c.mpg#c.mpg]*400 + ///
              _b[4.rep78]
di normalden(`xb')* (_b[mpg] + 2*20*_b[c.mpg#c.mpg])

//============================= do you really want marginal effects?

// create predicted probabilities by repair status
predict pr
separate pr, by(rep78)

// create the "regression lines" implied by marginal effects
scalar `xb' = _b[_cons] + _b[mpg]*20 + _b[c.mpg#c.mpg]*400 

local b3 = normalden(`xb') * ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c3 = normal(`xb')-20*`b3'
sum mpg if rep78 == 3, meanonly
local l3 = r(min)
local u3 = r(max)

local b4 = normalden(`xb' + _b[4.rep78])*  ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c4 = normal(`xb' + _b[4.rep78])-20*`b4'
sum mpg if rep78 == 4, meanonly
local l4 = r(min)
local u4 = r(max)

local b5 = normalden(`xb' + _b[5.rep78])*  ///
           (_b[mpg] + 2*20*_b[c.mpg#c.mpg])
local c5 = normal(`xb' + _b[5.rep78])-20*`b5'
sum mpg if rep78 == 5, meanonly
local l5 = r(min)
local u5 = r(max)

// display them in a graph
twoway line pr3 mpg, sort lpattern(solid) lcolor(black) || ///
       function y = `c3' + `b3'*x,                         ///
       range(`l3' `u3') lpattern(solid)  lcolor(gs8)    || ///
       line pr4 mpg, sort lpattern(dash) lcolor(black)  || ///
       function y = `c4' + `b4'*x,                         ///
       range(`l4' `u4') lpattern(dash) lcolor(gs8)      || ///
       line pr5 mpg, sort lpattern(shortdash)              ///
            lcolor(black)                               || ///
       function y = `c5' + `b5'*x,                         ///
       range(`l5' `u5') lpattern(shortdash) lcolor(gs8)    ///
       ytitle(predicted probability) xline(20)             ///
       xtitle(miles per gallon)                            ///
       legend( cols(1) pos(4)                              /// 
              order( - "probit predictions"                ///
                     1 "rep78=3"                           ///
                     3 "rep78=4"                           ///
                     5 "rep78=5"                           ///
                     - "marginal effects"                  ///
                       `""predictions""'                   ///
                     2 "rep78=3"                           ///
                     4 "rep78=4"                           ///
                     6 "rep78=5"  ))
*------------------ end example ------------------------
(For more on examples I sent to the Statalist see: )

Hope this helps,

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
