Re: st: Interaction terms

From   Maarten buis <>
To   "" <>
Subject   Re: st: Interaction terms
Date   Tue, 3 May 2011 14:59:28 +0100 (BST)

On Tue, May 3, 2011 at 3:05 PM, lreine ycenna  wrote:
> (1) I'm running a regression, wanting to see the effect of overseas living
> experience (OV) on one's income (y) which is also determined by education
> level (edu) and wealth.
> regress y ov edu wealth ovxedu ovxwealth.
> I also want to see the effect of overseas experience on y with a mixed
> variable:  edu and wealth.
> If I run regress y ov edu wealth eduxwealth ovxedu ovxwealth ovxeduxwealth,
> I might run into collinearity problems when I include more variables in the
> future.

There is no such thing as a (multi-)collinearity problem(*). Strong correlations
between explanatory variables will increase the standard error, but that is
exactly what should happen: strong correlation means it is hard to distinguish
one variable from the other which is necessary in order to separate the
effects of the two variables. So the increase in standard error is an
accurate representation of the amount of information available. If you think
you need the three-way interactions, then you should include them
regardless of what they do to your standard errors. In that case you will just
have to live with the fact that your test will have little statistical
power, i.e. you are very unlikely to find a significant result. Remember that
a non-significant result implies an "absence of evidence", not "evidence of
absence"; in other words, a non-significant result means "we do not know"
rather than "the effect is 0". This is especially important with tests with
little power.

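A minimal sketch of the point about correlation and standard errors, using
simulated data. All variable names, the seed, and the correlations (about 0.1
and about 0.95) are assumptions made up for illustration; nothing here comes
from the original poster's data.

```stata
* Simulated data; every name and number below is made up for illustration
clear
set seed 2011
set obs 200
generate double x1 = rnormal()

* x2_low and x2_high correlate about 0.10 and about 0.95 with x1
generate double x2_low  = 0.10*x1 + sqrt(1 - 0.10^2)*rnormal()
generate double x2_high = 0.95*x1 + sqrt(1 - 0.95^2)*rnormal()

* Same true coefficients and same noise in both outcomes
generate double noise  = rnormal()
generate double y_low  = 1 + x1 + x2_low  + noise
generate double y_high = 1 + x1 + x2_high + noise

* Weakly correlated regressors
regress y_low x1 x2_low
scalar se_low = _se[x1]

* Strongly correlated regressors
regress y_high x1 x2_high
scalar se_high = _se[x1]

display "SE of x1, weakly correlated regressors:   " se_low
display "SE of x1, strongly correlated regressors: " se_high
```

The second standard error should come out clearly larger, even though the
true coefficients and the noise are the same in both regressions. The only
thing that changed is how much independent information the data carry about
x1.
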
> or can I run a separate regression such as regress y ov wealthxedu
> ovxwealthxedu? Which one is correct?

The only one who can answer that question is you: you are the researcher
so you decide what model is a correct representation of your theory.

> (2) when I run regress y ov edu wealth eduxwealth ovxedu ovxwealth
> ovxeduxwealth, OV has a negative coefficient,
>  but it changes to positive when I run regress y ov wealthxedu
> ovxwealthxedu. What is this supposed to mean?

With interactions, the main effect of ov means the effect of ov when wealth
and edu are both equal to zero. If you did not center wealth and edu
before creating the interaction terms, that result is probably pretty
meaningless.
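
A minimal sketch of what centering does to the main effect of ov, again with
simulated stand-in data. The variable names follow the question, but the data,
the seed, and all coefficients are hypothetical. The sketch uses factor-variable
notation (Stata 11 or later) instead of hand-made product terms, which is an
assumption about the Stata version available.

```stata
* Simulated stand-in data; all numbers are hypothetical
clear
set seed 2011
set obs 1000
generate byte   ov     = runiform() < 0.3
generate double edu    = 8 + floor(11*runiform())
generate double wealth = 50 + 10*rnormal()
generate double y = 5 - 8*ov + 2*edu + 0.5*wealth ///
    + 0.3*ov*edu + 0.1*ov*wealth + 5*rnormal()

* Uncentered: _b[1.ov] is the effect of ov when edu == 0 and wealth == 0,
* a combination that does not occur in these data
regress y i.ov##c.edu##c.wealth
scalar b_raw  = _b[1.ov]
scalar r2_raw = e(r2)

* Center edu and wealth at their sample means
summarize edu, meanonly
generate double edu_c = edu - r(mean)
summarize wealth, meanonly
generate double wealth_c = wealth - r(mean)

* Centered: _b[1.ov] is now the effect of ov at average edu and wealth
regress y i.ov##c.edu_c##c.wealth_c
scalar b_ctr  = _b[1.ov]
scalar r2_ctr = e(r2)

display "effect of ov at edu = 0, wealth = 0: " b_raw
display "effect of ov at mean edu and wealth: " b_ctr
display "R-squared, uncentered and centered:  " r2_raw "  " r2_ctr
```

Both regressions are the same model and fit the data equally well; only the
point at which the main effect of ov is evaluated differs, so its size and
even its sign can change.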

> (3) I have very large standard errors (ranged from 0-14) when running
> both equations. Does this mean that the results are very 'imprecise'?

To determine whether a standard error is large or small we also need to
know the size of the effects, but large standard errors/imprecise results
are to be expected when you include interactions.
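
A small sketch of why a standard error cannot be judged on its own, using
simulated data with made-up names and numbers. The same predictor is entered
once in years and once in months: the coefficient and the standard error both
change by a factor of 12, but their ratio does not.

```stata
* Simulated data; all names and numbers are hypothetical
clear
set seed 2011
set obs 500
generate double edu_years  = 8 + floor(11*runiform())
generate double edu_months = 12*edu_years
generate double y = 5 + 2*edu_years + 5*rnormal()

* Education measured in years
regress y edu_years
scalar se_years = _se[edu_years]
scalar t_years  = _b[edu_years]/_se[edu_years]

* The same variable measured in months
regress y edu_months
scalar se_months = _se[edu_months]
scalar t_months  = _b[edu_months]/_se[edu_months]

display "SE (years):  " se_years  "   t ratio: " t_years
display "SE (months): " se_months "   t ratio: " t_months
```

So a standard error of, say, 14 says little by itself; what matters is how it
compares with the coefficient it belongs to.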

Hope this helps,

(*) For that reason Gujarati (1995, chapter 10) called the multicollinearity
problem the "micro-numerosity" problem: correlation among explanatory
variables costs statistical power to the point that we just don't have enough
observations to find what we set out to find. We cannot change the
correlation among explanatory variables, so the only thing we can do
is to gather more data, i.e. the problem is not the correlations but the
too small dataset, hence micro-numerosity (Sorry, a joke tends to lose its
edge when you explain it).

Damodar N. Gujarati (1995) Basic Econometrics, third edition. McGraw-Hill.

Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
