Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Interpretation of interaction with dummy in OLS

From: frone@ria.buffalo.edu
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Interpretation of interaction with dummy in OLS
Date: Fri, 27 Aug 2010 17:38:30 -0400

```Susan,

Probing the conditional effects of an interaction is straightforward with
a dichotomous (dummy) moderator.  It is also easy with multinomial
moderators and with continuous moderators, although a continuous moderator
requires centering it at the various points of interest.

Before looking at your questions, consider the example below.  I have no
idea what D is and how it was scored--hopefully 0 and 1, not 1 and 2.  A
1, 2 scoring makes the conditional effect not meaningful.  So, for this
example, I'll say that D is gender.  Also assume that we have two versions
of D:

Dm: 0=male, 1 = female
Df: 0=female, 1 = male.

So you have:

M1: Y = a + b_IV + c_Dm + e

M2: Y = a + b_IV + c_Dm + d_IV*Dm + e

b in model 1 is the main effect (or what one might call the marginal or
overall relation) of IV to Y.  Another way to think about it is that it is
the average of the conditional relations of IV to Y across the two values
of D.  If you substitute Df for Dm in model 1, b does not change, and the
overall effect for D (coefficient c) has the same magnitude but the
opposite sign.

Now move to model 2.  Because the crossproduct is in the model, the
coefficient b is no longer an overall relation; it is a conditional effect
or relation. The coefficient b represents the relation of IV to Y when Dm
is equal to zero.  Given that males are scored zero on Dm,  the
coefficient b represents the relation of IV to Y for males.

Now re-estimate model 2, substituting Df for Dm:

M3: Y = a + b_IV + c_Df + d_IV*Df + e

In model 3, the coefficient b represents the relation of IV to Y for
females (because they are scored zero on Df), and you get the standard
error for this conditional effect.

So by creating two versions of D, each with a 0 and 1 scoring, you can
estimate and test the two conditional effects comprising the significant
interaction involving D. Of course, you only need to estimate model 2 or
model 3 to determine if the interaction is significant or not.

Also, keep in mind that just estimating and testing conditional effects is
often not enough.  One needs to plot the interaction to see its overall
form.  It is easy to come up with examples, to use your variables, where
one expects IV to be unrelated to Y when D is high and positively related
to Y when D is low.  One can find these conditional relations, and yet,
when plotted, the form of the interaction is not what theory would predict.
It is not enough that one conditional effect is zero and the other is
positive; where the two slopes fall relative to one another on the
distribution of Y scores can be very important.

Now for your questions,

> I tested the model
>
> Y = a + b_IV + c_D + d_IV*D + e
>
> IV is log-transformed.
> D is a dummy.
> Y is not log-transformed.
>
> Comparing the model without the interaction (i.e. Y = a + b_IV + c_D + e)
> to the model with interaction (Y = a + b_IV + c_D + d_IV*D + e) yields the
> following results for the coefficients:
>
> b changes from 5 (model without interaction) to -6 (not significant in
> both models)

Given what I said above, coefficient b in the model without the
interaction is the overall effect of IV (i.e., average of the conditional
effects of IV across the two values of D).  Coefficient b in the model
with the interaction is the conditional effect of IV on Y when D is equal
to zero. But if D was scored 1, 2, then the value of this b, the
conditional effect, is meaningless because a score of zero on the current
version of D is meaningless.  Hopefully, given the example above, this
should be clear.

> c changes from 12 (significant) to -12 (not significant in model with
> interaction)
> interaction coefficient for model with interaction is 42 and significant.

Given what is implied in what I said above, coefficient c in the model
without the interaction is the overall effect of D on Y (i.e., average of
the conditional effects of D across all values of IV).  Coefficient c in
the model with the interaction is the conditional effect of D when IV is
equal to zero. But if zero is not a valid value for IV, then the
coefficient c has no meaningful interpretation when the interaction is in
the model.  To probe and test the conditional effects of D across values
of IV would require centering IV and rerunning the equation
multiple times, as in the example above.

> I have the following questions:
> (1) Does the significant and positive interaction term imply that
> the effect of
> the logged IV on Y is positive and significant when the dummy is 1?

Read what I said above.

> (2) If I want to test whether the logged IV moderates the
> relationship between
> the dummy and Y, is the following interpretation right?
>     - dummy c has positive and significant relation with Y
>     - logged IV positively moderates the relationship between dummy and Y
> (interaction term positive and significant)?

It doesn't make a lot of sense to compare coefficients for b and c from
the models with and without the interaction. And depending on the scoring
of D and IV, it may be totally meaningless.  So test the interaction.  If
it is significant, estimate and test the conditional effects in which you
are interested.  If you want the conditional effects of IV for each value
of D, I showed you what to do.  If you want the conditional effects of D
at various values of IV, then you need to learn how to use centering.
Finally, whichever way you want to look at the interaction, plot it.

> (3) What are possible explanations for the dummy turning negative and
> insignificant when the interaction with logged IV is entered? May it
> be that the interaction between dummy and logged IV, rather than only
> the dummy, has a relationship with Y?

As you can see from what I've said, it is because the coefficient c
changes from representing an overall relation for D without the
interaction to representing the conditional effect of D when IV equals
zero.

> (4) Is there any way to give an idea how large the moderation effect of
> dummy and logged IV are on Y? (i.e. total effect = c+d  or  c+b+d?)

Well, you have the increment in R-sq attributable to the interaction.  But
better yet, plot the interaction and you'll quite readily see if it has
any practical value or even makes any sense.

If you'd like to do some basic reading on testing and interpreting
interactions in linear models, I might suggest:

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and
interpreting interactions. Thousand Oaks, Sage.

Aguinis, H. (2004). Regression analysis for categorical moderators.  New
York: Guilford Press.

Mike Frone

****************************************************************
Michael R. Frone, Ph.D.
Senior Research Scientist
Research Institute on Addictions
State University of New York at Buffalo
1021 Main Street
Buffalo, New York 14203

Office:    716-887-2519
Fax:        716-887-2477
E-mail:     frone@ria.buffalo.edu
Internet: http://www.ria.buffalo.edu/profiles/frone.html
***************************************************************
```
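
The recoding and centering steps described in the post can be sketched numerically. The example below is only an illustration in Python, not the poster's own code or a Stata recipe. The data are made up and noise-free (slope of IV set to 2 for males and 5 for females), and the helpers `ols` and `design` are hypothetical names introduced here. It fits model 2 with Dm, model 3 with Df, and model 2 again with IV centered at an arbitrarily chosen value of 2.0.

```python
# Pure-Python sketch: conditional effects via dummy recoding and centering.
# All data values and helper names are illustrative assumptions.

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def design(iv, d):
    """Design matrix columns: intercept, IV, D, IV*D."""
    return [[1.0, x, g, x * g] for x, g in zip(iv, d)]


# Made-up data: Y = 1 + 2*IV for males, Y = 4 + 5*IV for females.
iv = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0] * 2
dm = [0] * 6 + [1] * 6            # Dm: 0 = male, 1 = female
df = [1 - g for g in dm]          # Df: 0 = female, 1 = male
y = [1.0 + 2.0 * x if g == 0 else 4.0 + 5.0 * x for x, g in zip(iv, dm)]

m2 = ols(design(iv, dm), y)       # b is the IV slope for males (Dm == 0)
m3 = ols(design(iv, df), y)       # b is the IV slope for females (Df == 0)

# Centering IV at 2.0 turns c into the effect of Dm when IV equals 2.0.
iv_c = [x - 2.0 for x in iv]
m2_centered = ols(design(iv_c, dm), y)

print("M2: b = %.3f, c = %.3f, d = %.3f" % tuple(m2[1:]))
print("M3: b = %.3f, c = %.3f, d = %.3f" % tuple(m3[1:]))
print("M2, IV centered at 2.0: c = %.3f" % m2_centered[2])
```

With these constructed data, b comes out as the male slope in model 2 and as the female slope in model 3. The interaction coefficient d only changes sign between the two models. The centered c equals the uncentered c plus d times 2.0, which is the conditional effect of D at that value of IV. With real, noisy data the same recoding applies, but each refit also supplies the standard error for the conditional effect, which this sketch does not compute.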