Re: st: Testing interaction terms when using factor variables

From   Jeffrey Pitblado <jpitblado@STATA.COM>
Subject   Re: st: Testing interaction terms when using factor variables
Date   Fri, 28 Oct 2011 10:51:17 -0500

Richard Williams is testing the effects of interactions of factor variables
with other continuous variables, and had a follow-up question about

>> In Stata 12, Richard can use the new -contrast- command:
>> . contrast female#(c.year##c.year c.articles c.prestige), overall
>> The -overall- option specifies that -contrast- combine the tests of the
>> individually specified terms into an additional "Overall" test at the end
>> of the Wald table.
>> Otherwise, Richard can use -testparm- instead of -test-.
>> . testparm 1.female#(c.year c.year#c.year c.articles c.prestige)
> Excellent! Thanks. I notice that the output of the contrast command includes
> Margins : asbalanced
> Is there any reason I would want to override that? I also tried
> contrast 1.female#(c.year c.year#c.year c.articles c.prestige),
>       overall asobserved
> and it didn't seem to change anything other than saying "Margins:
> asobserved" rather than "Margins: asbalanced".

The -contrast- command is a vehicle for producing ANOVA style Wald tests on
factor variables and their interactions; it provides a rich syntax for
testing complicated linear combinations of the fitted model coefficients.  For
-contrast-, the linear combinations (contrasts) can be thought of as simple
differences between marginal predictions.

If the data are balanced, then the marginal predictions can be computed
without regard for the observed cell frequencies of the underlying factor
variables, since the observed cell frequencies will not affect the computed
margins.  Without further information from the researcher, inference on the
population (data generating process) is limited to this "as balanced" scenario
because that is how the data were collected.

If the data are unbalanced, then the marginal predictions can be computed
"as balanced", or they can be computed "as observed" where the observed cell
frequencies are also accounted for.  If the observed cell frequencies are
representative of the population (data generating process) then the researcher
will most likely prefer to compute the marginal predictions "as observed" in
order to draw inferences directly onto the population (data generating
process).
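
As a rough sketch of the two weighting schemes, assume a linear model with
just two factors, A (levels i) and B (levels j = 1, ..., J). The symbols here
are my own notation, not taken from the manual: \hat{\mu}_{ij} is the fitted
cell prediction, n_{ij} is the observed cell count, and \hat{m}_i is the
marginal prediction for level i of A.

```latex
\[
\hat{m}_i^{\text{bal}} = \frac{1}{J} \sum_{j=1}^{J} \hat{\mu}_{ij},
\qquad
\hat{m}_i^{\text{obs}} = \sum_{j=1}^{J} \frac{n_{\cdot j}}{N}\, \hat{\mu}_{ij},
\qquad
n_{\cdot j} = \sum_{i} n_{ij}, \quad N = \sum_{j=1}^{J} n_{\cdot j}.
\]
```

When every level of B is equally frequent, n_{\cdot j}/N = 1/J and the two
margins coincide, so any contrast of them is the same under either option.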

If the cell frequencies are unbalanced but do not vary by much, then the
-contrast- results will not differ by much between the "as balanced" and "as
observed" scenarios.

For further discussion with examples illustrating the difference between "as
balanced" and "as observed" I refer you to the 'Unbalanced data' subsection
within the 'Remarks' section of [R] contrast.
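
For a quick hands-on comparison, something along these lines should work. It
is only a sketch: the data are a made-up toy example with unequal cell counts,
and the variable names y, a, and b are hypothetical.

```stata
* Hypothetical toy data: two factors (a, b) with unequal cell counts.
clear
input y a b
10 0 0
12 0 0
11 0 0
11 0 0
20 0 1
14 1 0
16 1 0
29 1 1
30 1 1
31 1 1
end

* Fit a fully interacted model so every cell has its own prediction.
regress y i.a##i.b

* Margins and test of a, treating b as balanced (the -contrast- default).
margins a, asbalanced
contrast a

* The same, weighting the levels of b by their observed frequencies.
margins a
contrast a, asobserved
```

Here b = 0 occurs in 6 of the 10 observations, so the "as observed" margins
weight the b = 0 cells more heavily than the "as balanced" margins do.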


--Jeff Pitblado