re: Re: st: Interpreting mediation model sobel goodman test

From   "Ariel Linden. DrPH" <>
To   <>
Subject   re: Re: st: Interpreting mediation model sobel goodman test
Date   Wed, 19 Oct 2011 11:34:12 -0400

John gives an excellent tutorial here on a problem with the standard (Baron
and Kenny) mediation approach.

I spent the last few days reading through the Stata v12 manual on -sem-, and
example 7 shows how mediation can be modeled. The UCLA website also has an
example of mediation using sem with multiple mediators (see problems 4 and
5).
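
As a rough illustration of the kind of model meant here, the following is a minimal sketch on simulated data. The variable names and coefficients are made up for the example; it is not the manual's example 7 or the UCLA example. It fits a linear path model with continuous variables only and does nothing about an endogenous mediator.

```stata
* Hypothetical data: x -> m -> y, all continuous, no omitted cause
clear
set seed 123
set obs 1000
gen x = rnormal()
gen m = .5*x + rnormal()
gen y = .5*m + rnormal()

* Path model: m depends on x; y depends on m and x
sem (m <- x) (y <- m x)

* Direct, indirect, and total effects
estat teffects
```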

One question that I have about using sem for mediation is whether the model
can handle different types of variables (i.e., a continuous, binary, or
ordinal mediator coupled with a continuous, binary, or ordinal outcome). The
manual does not explicitly discuss this issue (for mediation or any other
framework). Perhaps the 'standardization' option comes into play here? This
is one of the biggest limitations of the Baron-Kenny approach, because it
treats all mediators and outcomes as continuous variables.


Date: Wed, 19 Oct 2011 08:24:53 +0200
From: John Antonakis <>
Subject: Re: st: Interpreting mediation model sobel goodman test

Hi Meredith:

To answer the first question, 2SLS is the default estimator when 
estimating mediation models in econometrics. It is almost unheard of in 
some other social sciences (e.g., management research, psychology), see:

Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On 
making causal claims: A review and recommendations. The Leadership
Quarterly, 21(6), 1086-1120.

Foster, E. M., & McLanahan, S. (1996). An illustration of the use of 
instrumental variables: Do neighborhood conditions affect a young 
person's chance of finishing high school? Psychological Methods, 1(3), 

Gennetian, L. A., Magnuson, K., & Morris, P. A. (2008). From statistical 
associations to causation: What developmentalists can learn from 
instrumental variables techniques coupled with experimental data. 
Developmental Psychology, 44(2), 381-394.

That this estimator is not used is not because there is something wrong 
with the estimator; it is probably because the fields that don't use it 
ignore the problem that 2SLS can address (i.e., the problem of endogeneity).

You really need to look at the difference between how OLS and 
instrumental-variable estimators estimate systems of equations. If you 
run the code I showed you below, the OLS method does not recover the 
correct estimates when the mediator is endogenous. For some types of 
models, SUR estimators, unless iterated (i.e., using maximum likelihood), 
will not produce the correct estimates either.  See the latter part of 
my podcast too, where I show with simulated data (and using ballantines) 
why an instrumental-variable estimator is required with endogenous 

We show with really nice examples how estimates can change in the 
chapter I cited below.
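
For the first simulation quoted below, the large-sample values can be worked out by hand. This is a sketch under the assumptions of that code: m = e + .5x + u and y = .5m - e + v, where x, e, u, and v are independent standard normals. Here u and v are labels introduced for the two unnamed rnormal() draws.

```latex
\begin{align*}
\operatorname{Var}(m) &= 1 + 0.25 + 1 = 2.25, \qquad
\operatorname{Cov}(m, x) = 0.5, \qquad
\operatorname{Cov}(y, x) = 0.5 \cdot 0.5 = 0.25 \\
\hat{b}_{\mathrm{IV}} &\to \frac{\operatorname{Cov}(y, x)}{\operatorname{Cov}(m, x)}
  = \frac{0.25}{0.5} = 0.5 \\
\hat{b}_{\mathrm{OLS}} &\to \frac{\operatorname{Cov}(y, m)}{\operatorname{Var}(m)}
  = \frac{0.5 \cdot 2.25 - 1}{2.25} \approx 0.056
\end{align*}
```

So the instrumental-variable estimate of the m-to-y path converges to the true .5, and the indirect effect to .5*.5 = .25. OLS of y on m alone converges to about .056, which implies an indirect effect of about .5*.056 = .03. If x is also included in the y equation, as in the Baron-Kenny steps, the coefficient on m converges to .5 - Cov(e, e+u)/Var(e+u) = .5 - 1/2 = 0.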

If you have multiple mediators, you should have multiple exogenous 
variables (at least one for each mediator) to have an identified 
system of equations, at least when using least-squares estimators (2SLS, 
3SLS; these are "safe bet" estimators because they are limited 
information estimators; if there is a misspecification in one part of 
the model, the bias will not spread to other parts--for more complex 
models, that is).  You're better off being overidentified (more 
instruments than mediators) so that you can test the veracity of the 
constraints in the model. We explain all this in the chapter.
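
A minimal sketch of the overidentified case, on simulated data: two endogenous mediators and three instruments, so there is one overidentifying restriction to test. The data-generating process, variable names, and coefficients are made up for illustration, and -ivregress- with -estat overid- is used here as one way to get the test; it is not taken from the chapter.

```stata
* Hypothetical data: mediators m1 and m2 share the omitted cause e with y;
* x1, x2, x3 are exogenous and affect y only through the mediators
clear
set seed 123
set obs 1000
gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()
gen e  = rnormal()
gen m1 = e + .5*x1 + .3*x3 + rnormal()
gen m2 = e + .5*x2 + .3*x3 + rnormal()
gen y  = .5*m1 + .5*m2 - e + rnormal()

* 2SLS for the y equation, instrumenting both mediators with x1-x3
ivregress 2sls y (m1 m2 = x1 x2 x3)

* Test of the overidentifying restriction (3 instruments, 2 mediators)
estat overid
```

With only two instruments the model would be just identified and there would be no restriction left to test.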


Prof. John Antonakis
Faculty of Business and Economics
Department of Organizational Behavior
University of Lausanne
Internef #618
CH-1015 Lausanne-Dorigny
Tel ++41 (0)21 692-3438
Fax ++41 (0)21 692-3305

Associate Editor
The Leadership Quarterly

On 18.10.2011 23:24, Meredith T. Niles wrote:
 > Hi John, Thank you so much for your response this is very helpful.
 > I was wondering whether two stage least squares is common in
 > estimating mediation models? Most of what I found in code for stata
 > was all sgmediation (which I did in fact use).
 > I am also beginning to run multiple mediation models (with the sureg
 > command). Would it be wise to run multiple mediation models with a
 > different code as well?
 > Best, Meredith
 > -----Original Message-----
 > From: [] On Behalf Of John Antonakis
 > Sent: Tuesday, October 18, 2011 12:11 PM
 > To:
 > Subject: Re: st: Interpreting mediation model sobel goodman test
 > Hi Meredith:
 > I assume you used the -sgmediation- package; I would not use this
 > routine UNLESS your mediator is exogenous (and you are sure of this).
 > If it is endogenous, sgmediation will give you inconsistent estimates
 > (it estimates the system of equations with OLS, and uses the dated
 > Baron-Kenny methods); you do not tackle the endogeneity problem with
 > sgmediation. You need to estimate your system of equations with an
 > instrumental-variable estimator (e.g., 2SLS).
 > Take a look at this podcast, where I discuss this problem in detail:
 > Endogeneity: An inconvenient truth (full version) (about 32 minutes
 > in length)
 > If you just want the nitty gritty see:
 > Endogeneity: An inconvenient truth (for researchers) (Excludes the
 > "gentle introduction" content and discusses the two-stage least
 > squares estimator straight away; about 16 minutes in length)
 > See also:
 > Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (submitted).
 > Causality and endogeneity: Problems and solutions. In D.V. Day
 > (Ed.), The Oxford Handbook of Leadership and Organizations.
 > To understand exactly the nature of the problem run the following code,
 > where m is endogenous with respect to y:
 > clear
 > set seed 123
 > set obs 1000
 > gen x = rnormal()
 > gen e = rnormal()
 > gen m = e + .5*x + rnormal()
 > gen y = .5*m - e + rnormal()
 > reg3 (y = m) (m = x), 2sls
 > nlcom [m]x*[y]m
 > sgmediation y, mv(m) iv(x)
 > From the above model, we have an instrument x, an endogenous
 > regressor m, an omitted cause e, and a dependent variable y. We know
 > that the indirect effect of x on y is .5*.5=.25. 2SLS recovers this
 > parameter well (.24, p<.001). However, the sgmediation program gives
 > .03 (and p = .04).
 > Now, let's rerun this to see when you'd get the same results with
 > sgmediation (if m is exogenous with respect to y):
 > clear
 > set seed 123
 > set obs 1000
 > gen x = rnormal()
 > gen e = rnormal()
 > gen m = .5*x + rnormal()
 > gen y = .5*m + rnormal()
 > reg3 (y = m) (m = x), 2sls
 > nlcom [m]x*[y]m
 > reg3 (y = m) (m = x), ols
 > nlcom [m]x*[y]m
 > sgmediation y, mv(m) iv(x)
 > Notice that the 2SLS model is still consistent (but less efficient).
 > The OLS estimator and sgmediation pretty much give the same estimates
 > and standard errors.
 > HTH, John.
