Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Testing interaction terms when using factor variables

From	Jeffrey Pitblado
Subject	Re: st: Testing interaction terms when using factor variables
Date	Fri, 28 Oct 2011 10:51:17 -0500

Richard Williams is testing the effects of interactions of factor variables
with other continuous variables, and had a follow-up question about
-contrast-:

>> In Stata 12, Richard can use the new -contrast- command:
>>
>> . contrast female#(c.year##c.year c.select c.articles c.prestige), overall
>>
>> The -overall- option specifies that -contrast- combine the tests of the
>> individually specified terms into an additional "Overall" test at the end
>> of the Wald table.
>>
>> Otherwise, Richard can use -testparm- instead of -test-.
>>
>> . testparm 1.female#(c.year c.year#c.year c.select c.articles c.prestige)
>
> Excellent! Thanks. I notice that the output of the contrast command includes
>
> Margins : asbalanced
>
> Is there any reason I would want to override that? I also tried
>
> contrast 1.female#(c.year c.year#c.year c.select c.articles c.prestige),
>       overall asobserved
>
> and it didn't seem to change anything other than saying "Margins:
> asobserved" rather than "Margins: asbalanced".

The -contrast- command is a vehicle for producing ANOVA-style Wald tests on
factor variables and their interactions; it provides a rich syntax for
testing complicated linear combinations of the fitted model coefficients.  For
-contrast-, the linear combinations (contrasts) can be thought of as simple
differences between marginal predictions.
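
As a minimal sketch of that last point (my own illustration, not taken from
Richard's model), assume the -nlsw88- dataset that ships with Stata and a
two-way factorial model of -wage- on -collgrad- and -union-:

. sysuse nlsw88, clear
. regress wage i.collgrad##i.union

The marginal predictions for each level of -collgrad-, computed "as
balanced" to match the default in -contrast-:

. margins collgrad, asbalanced

The difference between those two margins, first from -margins- with the r.
contrast operator and then from -contrast- itself:

. margins r.collgrad, asbalanced
. contrast r.collgrad

The last two commands should report the same difference; the Wald test from
-contrast- is a test that this difference is zero.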

If the data are balanced, then the marginal predictions can be computed
without regard for the observed cell frequencies of the underlying factor
variables, since the observed cell frequencies will not affect the computed
margins.  Without further information from the researcher, inference on the
population (data generating process) is limited to this "as balanced" scenario
because that is how the data were collected.

If the data are unbalanced, then the marginal predictions can be computed
"as balanced", or they can be computed "as observed", where the observed cell
frequencies are also accounted for.  If the observed cell frequencies are
representative of the population (data generating process), then the
researcher will most likely prefer to compute the marginal predictions "as
observed" in order to draw inferences directly about the population (data
generating process).

If the cell frequencies are unbalanced but do not vary by much, then the
-contrast- results will not differ by much between the "as balanced" and "as
observed" scenarios.
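
To see the two scenarios side by side, here is a hedged sketch, again
assuming the -nlsw88- dataset that ships with Stata (the choice of -race-
and -collgrad- is only for illustration).  First check how unbalanced the
cells are, then fit a model with the interaction:

. sysuse nlsw88, clear
. tabulate race collgrad
. regress wage i.race##i.collgrad

The test of -race- with the levels of -collgrad- weighted equally (the
default), and then weighted by their observed frequencies in the estimation
sample:

. contrast race
. contrast race, asobserved

The marginal predictions behind each test can be inspected with -margins-.
Note that -margins- defaults to "as observed", the opposite of -contrast-:

. margins race, asbalanced
. margins race

The more the cell frequencies in the -tabulate- output depart from balance,
the more the two sets of results can be expected to differ.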

For further discussion with examples illustrating the difference between "as
balanced" and "as observed" I refer you to the 'Unbalanced data' subsection
within the 'Remarks' section of [R] contrast.

--

--Jeff Pitblado