Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Interaction model
Cameron McIntosh <firstname.lastname@example.org>
STATA LIST <email@example.com>
Wed, 8 Feb 2012 22:16:15 -0500
It sounds like this is an exploratory data analysis situation involving potentially numerous interactions. An automated approach might not be a bad idea either:
Buckler, F., & Hennig-Thurau, T. (2008). Identifying Hidden Structures in Marketing’s Structural Models Through Universal Structure Modeling: An Explorative Bayesian Neural Network Complement to LISREL and PLS. Marketing -- Journal of Research and Management, 4(2), 47-66.
http://www.neusrel.com/index.html
> Date: Wed, 8 Feb 2012 21:55:37 -0500
> Subject: Re: st: Interaction model
> From: firstname.lastname@example.org
> To: email@example.com
> You're welcome, Shikha.
> It will be helpful to reproduce model (a), correcting the typo:
> (a) income= b1*program + b2*rich + b3*immi + b4*male + b5*program*rich
> +b6*program*male + b7*program*immi
> If I were, mechanically, to sketch an interpretation of b1, it would
> say that b1 gives the effect of the program on income, adjusting for
> the contributions of [the other six predictors]. Unfortunately, if
> the interaction effects are significant, it is not meaningful to
> interpret a main effect in the presence of interactions between that
> variable and other variables. And in model (a) each of the four
> variables is involved in at least one two-factor interaction. Thus,
> the model would be saying that the effect of the program differed
> between rich and poor, between immigrants and non-immigrants, and
> between males and females; and you would need to start with the
> average income in each of those subgroups and discuss the comparisons.
> A weighted average over the groups might be useful.
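To make the point about b1 concrete, here is a minimal sketch of how the program effect in model (a) varies by subgroup. The coefficient values and the equal subgroup shares are invented purely for illustration; they are not estimates from any data. Under model (a), the program effect in a subgroup is b1 + b5*rich + b6*male + b7*immi, so b1 alone is the effect only for poor, non-immigrant females.

```python
from itertools import product

# Hypothetical coefficients for model (a); invented for illustration only.
b = {
    "program": 2.0, "rich": 5.0, "immi": -1.0, "male": 1.5,
    "program*rich": -1.0, "program*male": 0.5, "program*immi": 1.0,
}

def program_effect(rich, immi, male):
    """Predicted income with program=1 minus program=0, within one subgroup."""
    return (b["program"]
            + b["program*rich"] * rich
            + b["program*male"] * male
            + b["program*immi"] * immi)

# The eight subgroups defined by (rich, immi, male).
groups = list(product([0, 1], repeat=3))

# Hypothetical subgroup shares (equal here); in practice use sample proportions.
shares = {g: 1 / len(groups) for g in groups}

effects = {g: program_effect(*g) for g in groups}
weighted_average = sum(shares[g] * effects[g] for g in groups)

for (rich, immi, male), e in effects.items():
    print(f"rich={rich} immi={immi} male={male}: program effect = {e:+.2f}")
print(f"weighted average effect = {weighted_average:+.2f}")
```

With these made-up numbers the effect ranges across subgroups, and the weighted average differs from b1, which is why b1 by itself should not be read as "the" effect of the program.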
> You have not explained why model (a) does not contain a constant term,
> which we could denote by b0.
> In such an analysis, if you have enough data, it would make sense to
> start with the "saturated" model, which would contain b0 and also the
> terms rich*male, rich*immi, immi*male, program*rich*immi,
> program*rich*male, program*immi*male, rich*immi*male, and
> program*rich*immi*male (for a total of 16 terms, counting b0). It might then
> be possible to eliminate some of the interactions, starting with the
> highest-order and working down. (If a given interaction is
> significant, however, the model must retain all the lower-order terms
> associated with the variables involved in that interaction.)
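As a rough check on the bookkeeping, the sketch below enumerates the terms of the saturated model for the four binary variables and lists the lower-order terms that must stay in the model when a given interaction is retained. The function names are hypothetical helpers, and the empty tuple stands for the constant b0; the count of 16 includes that constant.

```python
from itertools import combinations

variables = ["program", "rich", "immi", "male"]

def saturated_terms(names):
    """Every subset of the variables; the empty tuple represents the constant b0."""
    terms = []
    for order in range(len(names) + 1):
        terms.extend(combinations(names, order))
    return terms

def required_lower_order(term):
    """Main effects and lower-order interactions implied by keeping `term`."""
    return [c for k in range(1, len(term)) for c in combinations(term, k)]

terms = saturated_terms(variables)
print(len(terms), "terms in the saturated model")

# Keeping program*rich*immi forces three main effects and three two-factor terms.
needed = required_lower_order(("program", "rich", "immi"))
for t in needed:
    print("*".join(t))
```

Working down from the four-factor term, an interaction can be dropped only if no retained higher-order term lists it among its required lower-order terms.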
> The easiest model to interpret is the additive model, which would
> contain b0 and only the main effects for the four variables.
> Departures from additivity often arise when the response variable is
> not yet expressed in a suitable scale. In your analysis, data on
> income are often skewed, and they behave better when transformed to a
> logarithmic scale. I wonder whether analyzing income in the log scale
> would lead to an analysis in which the contributions are more nearly
> additive. Then, transforming back to the original scale would produce
> effects that are multiplicative.
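A small numerical sketch of that last point, using invented log-scale coefficients and only two of the predictors: a model that is additive in log(income) implies the same program ratio in every subgroup, while the program difference in the original units varies with the subgroup's income level.

```python
import math

# Hypothetical coefficients for an additive model in log(income); illustration only.
b0, b_program, b_rich = 10.0, 0.10, 0.50

def predicted_income(program, rich):
    """Back-transform the additive log-scale prediction to the original scale."""
    return math.exp(b0 + b_program * program + b_rich * rich)

# Ratio of predicted income, program vs. no program, within each group.
ratio_poor = predicted_income(1, 0) / predicted_income(0, 0)
ratio_rich = predicted_income(1, 1) / predicted_income(0, 1)

# Difference in predicted income on the original scale, within each group.
diff_poor = predicted_income(1, 0) - predicted_income(0, 0)
diff_rich = predicted_income(1, 1) - predicted_income(0, 1)

print(f"ratio (poor) = {ratio_poor:.4f}, ratio (rich) = {ratio_rich:.4f}")
print(f"difference (poor) = {diff_poor:.2f}, difference (rich) = {diff_rich:.2f}")
```

Both ratios equal exp(b_program), so on the original scale the program acts as a constant multiplier, whereas the raw differences would look like a program*rich interaction if income were analyzed untransformed.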
> David Hoaglin
> > b4 is not the coefficient for both male and program*rich- it was a mistake/typo.
> > I understand the model in (a) is a richer model compared to different
> > specifications in (b). What would be the interpretation of b1 in (a)?
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/