

From: frone@ria.buffalo.edu
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Interpretation of interaction with dummy in OLS
Date: Fri, 27 Aug 2010 17:38:30 -0400

Susan,

Probing the conditional effects of an interaction is straightforward with a dichotomous (dummy) moderator. It is also easy with multinomial and continuous moderators (though a continuous moderator requires centering at various points on that moderator).

Before looking at your questions, consider the example below. I have no idea what D is or how it was scored--hopefully 0 and 1, not 1 and 2. A 1, 2 scoring makes the conditional effect not meaningful. So, for this example, I'll say that D is gender. Also assume that we have two versions of D:

Dm: 0 = male, 1 = female
Df: 0 = female, 1 = male

So you have:

M1: Y = a + b_IV + c_Dm + e
M2: Y = a + b_IV + c_Dm + d_IV*Dm + e

b in model 1 is the main effect (or what one might call the marginal or overall relation) of IV to Y. Another way to think about it is that it is the (weighted) average of the conditional relations of IV to Y across the two values of D. If you substitute Df for Dm in model 1, you get the same magnitude for c, the overall effect of D, except that it would be opposite in sign.

Now move to model 2. Because the crossproduct is in the model, the coefficient b is no longer an overall relation; it is a conditional effect or relation. The coefficient b represents the relation of IV to Y when Dm is equal to zero. Given that males are scored zero on Dm, the coefficient b represents the relation of IV to Y for males. Now re-estimate model 2, substituting Df for Dm:

M3: Y = a + b_IV + c_Df + d_IV*Df + e

In model 3, the coefficient b represents the relation of IV to Y for females (because they are scored zero on Df), and you get the standard error for this conditional effect. So by creating two versions of D, each with a 0 and 1 scoring, you can estimate and test the two conditional effects comprising the significant interaction involving D. Of course, you only need to estimate model 2 or model 3 to determine whether the interaction is significant.
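To make the recoding idea concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration only: the simulated data, the coefficient values used to generate Y, and the helper name `ols` are all made up, and plain least squares stands in for whatever regression command you actually use. It only shows how the coefficients move between M2 and M3; it does not compute standard errors, which is the practical reason to re-estimate the model in your statistics package.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data (hypothetical values, not from the original question).
rng = np.random.default_rng(0)
n = 200
iv = rng.normal(size=n)
dm = rng.integers(0, 2, size=n).astype(float)  # Dm: 0 = male, 1 = female
df = 1.0 - dm                                   # Df: 0 = female, 1 = male
y = 1.0 + 0.5 * iv + 2.0 * dm + 1.5 * iv * dm + rng.normal(size=n)

# M2: Y = a + b*IV + c*Dm + d*IV*Dm + e
a2, b2, c2, d2 = ols(y, iv, dm, iv * dm)
# M3: Y = a + b*IV + c*Df + d*IV*Df + e
a3, b3, c3, d3 = ols(y, iv, df, iv * df)

print("M2 b (slope of IV for males):  ", b2)
print("M3 b (slope of IV for females):", b3)
print("M2 b + d (same female slope):  ", b2 + d2)
print("interaction d in M2 and M3:    ", d2, d3)
```

The two fits are the same model written two ways, so b in M3 equals b + d from M2 and the interaction coefficient only changes sign. That is why either model is enough to test the interaction itself.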
Also, keep in mind that just estimating and testing conditional effects is often not enough. One needs to plot the interaction to see its overall form. It is easy to come up with examples, using your variables, where one expects IV to be unrelated to Y when D is high and positively related to Y when D is low. One can find these conditional relations and yet, when plotted, the form of the interaction is not what theory would predict. It is not enough that one conditional effect is zero and the other is positive; where the two slopes fall relative to one another on the distribution of Y scores can be very important.

Now for your questions,

> I tested the model
>
> Y = a + b_IV + c_D + d_IV*D + e
>
> IV is log-transformed.
> D is a dummy.
> Y is not log-transformed.
>
> Comparing the model without the interaction (i.e. Y = a + b_IV + c_D + e) to
> the model with interaction (Y = a + b_IV + c_D + d_IV*D + e) yields the
> following results for the coefficients:
>
> b changes from 5 (model without interaction) to -6 (not significant in both
> models)

Given what I said above, coefficient b in the model without the interaction is the overall effect of IV (i.e., the average of the conditional effects of IV across the two values of D). Coefficient b in the model with the interaction is the conditional effect of IV on Y when D is equal to zero. But if D was scored 1, 2, then the value of this b, the conditional effect, is meaningless because a score of zero on the current version of D is meaningless. Hopefully, given the example above, this should be clear.

> c changes from 12 (significant) to -12 (not significant in model with
> interaction)
> interaction coefficient for model with interaction is 42 and significant.

Given what is implied in what I said above, coefficient c in the model without the interaction is the overall effect of D on Y (i.e., the average of the conditional effects of D across all values of IV).
Coefficient c in the model with the interaction is the conditional effect of D when IV is equal to zero. But if zero is not a valid value for IV, then the coefficient c has no meaningful interpretation when the interaction is in the model. To probe and test the conditional effects of D across values of IV would require centering IV and rerunning the equation multiple times, like the example above.

> I have the following questions:
> (1) Does the significant and positive interaction term imply that
> the effect of
> the logged IV on Y is positive and significant when the dummy is 1?

Read what I said above.

> (2) If I want to test whether the logged IV moderates the
> relationship between
> the dummy and Y, is the following interpretation right?
> - dummy c has positive and significant relation with Y
> - logged IV positively moderates the relationship between dummy and Y
> (interaction term positive and significant)?

It doesn't make a lot of sense to compare coefficients for b and c from the models with and without the interaction. And depending on the scoring of D and IV, it may be totally meaningless. So test the interaction. If it is significant, estimate and test the conditional effects in which you are interested. If you want the conditional effects of IV for each value of D, I showed you what to do. If you want the conditional effects of D at various values of IV, then you need to learn how to use centering. Finally, whichever way you want to look at the interaction, plot it.

> (3) What are possible explanations for the dummy turning negative and
> insignificant when the interaction with logged IV is entered? May it
> be that the
> interaction between dummy and logged IV, rather than only the dummy, has a
> relationship with Y?

As you can see from what I've said, it is because the coefficient c changes from representing an overall relation for D without the interaction to representing the conditional effect of D when IV equals zero.
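For the centering mentioned above, a rough sketch in Python with NumPy may help. All of it is assumed for illustration: the simulated data, the generating coefficients, the helper name `ols`, and the choice of the mean and one standard deviation either side as the points to probe (any substantively meaningful values of the logged IV would do). It shows only the point estimates; re-estimating the centered model in your statistics package is what gives you the standard error and test for each conditional effect.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data (hypothetical values). The logged IV never reaches zero here,
# so c in the uncentered model describes a point outside the observed data.
rng = np.random.default_rng(1)
n = 300
log_iv = np.log(rng.uniform(2.0, 50.0, size=n))
d = rng.integers(0, 2, size=n).astype(float)
y = 3.0 - 1.0 * log_iv - 4.0 * d + 2.5 * log_iv * d + rng.normal(size=n)

# Uncentered model: c0 is the effect of D when the logged IV equals zero.
a0, b0, c0, d0 = ols(y, log_iv, d, log_iv * d)
print("uncentered c (effect of D at logged IV = 0):", c0)

# Re-center the logged IV at each probe point and refit; c is then the
# conditional effect of D at that point.
mean, sd = log_iv.mean(), log_iv.std(ddof=1)
effects = {}
for label, point in [("mean - 1 SD", mean - sd),
                     ("mean", mean),
                     ("mean + 1 SD", mean + sd)]:
    centered = log_iv - point
    _, _, c_at_point, _ = ols(y, centered, d, centered * d)
    effects[label] = (point, c_at_point)
    print(f"effect of D at {label} (logged IV = {point:.2f}): {c_at_point:.3f}")
```

Each centered c equals c + d times the probe point from the uncentered model, which is how a negative c and a positive interaction can go together with a positive effect of D over the range of IV values actually observed.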
> (4) Is there any way to give an idea how large the moderation effect of dummy
> and logged IV are on Y? (i.e. total effect = c+d or c+b+d?)

Well, you have the increment in R-sq attributable to the interaction. But better yet, plot the interaction and you'll quite readily see if it has any practical value or even makes any sense.

If you'd like to do some basic reading on testing and interpreting interactions in linear models, I might suggest:

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks, CA: Sage.

Aguinis, H. (2004). Regression analysis for categorical moderators. New York: Guilford Press.

Mike Frone

****************************************************************
Michael R. Frone, Ph.D.
Senior Research Scientist
Research Institute on Addictions
State University of New York at Buffalo
1021 Main Street
Buffalo, New York 14203
Office: 716-887-2519
Fax: 716-887-2477
E-mail: frone@ria.buffalo.edu
Internet: http://www.ria.buffalo.edu/profiles/frone.html
****************************************************************

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

References: st: Interpretation of interaction with dummy in OLS (From: P K <statistics_2009@yahoo.de>)
