Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: My ANOVA and regression results don't agree
From: David Hoaglin <[email protected]>
To: [email protected]
Subject: Re: st: My ANOVA and regression results don't agree
Date: Mon, 6 Jan 2014 20:34:18 -0500
Hi, Jess.
Please say more about the way in which the ANOVA and regression
approaches do not correspond. As Phil mentioned, they produce the
same fitted values. In that sense, the ANOVA and the regression
always agree.
In what sense are the p-values from the regression "wrong"?
It would help me to see more information on your data and your
analyses, such as the output that you got from the various commands.
One consideration is whether the data for the ANOVA are balanced. (If
you assigned subjects randomly to levels of adcontent, but not
randomly to the combinations of adcontent and sex, your data would be
balanced on adcontent, but not on the combination of the two factors.)
Assuming that your data are balanced, with K observations in each of
the 6 cells, the ANOVA decomposes the variation (about the overall
mean) in the data into four sums of squares:
adcontent (2 degrees of freedom)
sex (1 degree of freedom)
adcontent*sex (2 degrees of freedom)
residuals or "error" (6K - 6 degrees of freedom)
(the overall mean accounts for the other degree of freedom).
The usual F-tests assess each factor or interaction as a whole. For
example, the F-statistic for adcontent is (mean square for
adcontent)/(error mean square).
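The decomposition above can be checked numerically. This is a minimal sketch with made-up data (K, the cell layout, and the response values are all assumptions, not from the thread), computing the four sums of squares for a balanced 3x2 design and the whole-factor F-statistic for adcontent:

```python
# Balanced two-way layout: 3 levels of adcontent, 2 of sex, K per cell.
# Data are deterministic toy values, purely for illustration.
K = 5
A_LEV, B_LEV = 3, 2

obs = [(a, b, 20 + 3 * a - 2 * b + ((a + 2 * b + r) % 4) * 0.7)
       for a in range(1, A_LEV + 1)
       for b in range(1, B_LEV + 1)
       for r in range(K)]

N = len(obs)                                  # 6K observations
grand = sum(y for _, _, y in obs) / N

# Marginal and cell means.
mean_a = {a: sum(y for a2, _, y in obs if a2 == a) / (B_LEV * K)
          for a in range(1, A_LEV + 1)}
mean_b = {b: sum(y for _, b2, y in obs if b2 == b) / (A_LEV * K)
          for b in range(1, B_LEV + 1)}
mean_ab = {(a, b): sum(y for a2, b2, y in obs if (a2, b2) == (a, b)) / K
           for a in range(1, A_LEV + 1) for b in range(1, B_LEV + 1)}

# Sums of squares, with the degrees of freedom from the text.
ss_a = B_LEV * K * sum((m - grand) ** 2 for m in mean_a.values())   # 2 df
ss_b = A_LEV * K * sum((m - grand) ** 2 for m in mean_b.values())   # 1 df
ss_ab = K * sum((mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                for a, b in mean_ab)                                # 2 df
ss_e = sum((y - mean_ab[(a, b)]) ** 2 for a, b, y in obs)           # 6K - 6 df
ss_total = sum((y - grand) ** 2 for _, _, y in obs)                 # 6K - 1 df

# With balanced data the four pieces add up exactly to the total SS.
assert abs(ss_total - (ss_a + ss_b + ss_ab + ss_e)) < 1e-9

# F-test for adcontent as a whole: (SS_A / 2) / (SS_E / (6K - 6)).
f_adcontent = (ss_a / 2) / (ss_e / (6 * K - 6))
print(f_adcontent)
```

With unbalanced data the four pieces would no longer add up this way, which is one reason balance matters for interpreting the ANOVA table.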
The predictors in the regression are indicators that correspond to
individual degrees of freedom:
adcontent2
adcontent3
sex2
adcontent2_sex2
adcontent3_sex2
(and _cons, which corresponds to adcontent1 and sex1),
and each predictor has its own p-value. The residual sum of squares
in the regression equals the residual sum of squares in the ANOVA.
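That equality can also be verified directly. The sketch below (again with made-up data; the K, the response values, and the plain-Python solver are assumptions for illustration, not what Stata does internally) fits the dummy-variable regression via the normal equations and checks that its residual sum of squares matches the ANOVA residual SS about the cell means:

```python
# Balanced 3x2 design, K = 4 per cell, deterministic toy response.
K = 4
data = []                                     # (adcontent, sex, y)
for ac in range(1, 4):
    for sx in range(1, 3):
        for r in range(K):
            y = 10 + 2 * ac - 3 * sx + 0.5 * ac * sx \
                + ((ac * 7 + sx * 3 + r) % 5) * 0.3
            data.append((ac, sx, y))

# Design matrix: _cons, adcontent2, adcontent3, sex2, ac2_sex2, ac3_sex2.
X = [[1.0, float(ac == 2), float(ac == 3), float(sx == 2),
      float(ac == 2 and sx == 2), float(ac == 3 and sx == 2)]
     for ac, sx, _ in data]
yv = [y for _, _, y in data]
p = 6

# Normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.
XtX = [[sum(X[n][i] * X[n][j] for n in range(len(X))) for j in range(p)]
       for i in range(p)]
Xty = [sum(X[n][i] * yv[n] for n in range(len(X))) for i in range(p)]
A = [row[:] + [Xty[i]] for i, row in enumerate(XtX)]
for c in range(p):
    piv = max(range(c, p), key=lambda r: abs(A[r][c]))
    A[c], A[piv] = A[piv], A[c]
    for r in range(p):
        if r != c:
            f = A[r][c] / A[c][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[c])]
beta = [A[i][p] / A[i][i] for i in range(p)]

# Regression residual sum of squares.
rss_reg = sum((yv[n] - sum(b * x for b, x in zip(beta, X[n]))) ** 2
              for n in range(len(X)))

# ANOVA residual SS: squared deviations from the 6 cell means.
cells = {}
for ac, sx, y in data:
    cells.setdefault((ac, sx), []).append(y)
cell_mean = {k: sum(v) / len(v) for k, v in cells.items()}
rss_anova = sum((y - cell_mean[(ac, sx)]) ** 2 for ac, sx, y in data)

assert abs(rss_reg - rss_anova) < 1e-8
```

The two agree because the five indicators plus the constant span exactly the same 6-dimensional space as the cell indicators, so both fits produce the cell means as fitted values.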
When the F-test in an ANOVA says that a factor is "significant," the
null hypothesis being rejected is that the effect of each level of
that factor is 0 (i.e., no differences among the levels). That result
alone does not tell which levels differ significantly from which
others, and all sorts of patterns are possible. For example, with the
effects in the order adcontent1 < adcontent2 < adcontent3, adcontent2
might not differ significantly from adcontent1, and adcontent3 might
not differ significantly from adcontent2, even though adcontent3
differs significantly from adcontent1. Investigating and summarizing
such patterns has a sizable literature.
I would say that the person who wrote the code on the UCLA site chose
to treat the highest level of each factor as the reference category.
When you use factor variables in Stata, as in i.adcontent in your
-regress- command, the first category is the default for the base
category. You should see that in the output from that -regress-
command.
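The choice of base category affects only how the fit is parameterized, not the fit itself. Here is a minimal sketch with hypothetical numbers (a single 3-level factor, where the OLS fit reduces to group means: the intercept is the base group's mean and each dummy coefficient is that group's mean minus the base mean):

```python
# Hypothetical group data for a single 3-level factor.
groups = {1: [4.0, 5.0, 6.0], 2: [7.0, 8.0, 9.0], 3: [12.0, 13.0, 14.0]}
means = {g: sum(v) / len(v) for g, v in groups.items()}

def dummy_fit(base):
    """Intercept and dummy coefficients with `base` as base category."""
    coefs = {g: means[g] - means[base] for g in means if g != base}
    return means[base], coefs

# Base = level 1 (Stata's default with i.adcontent) vs base = level 3
# (what the UCLA code apparently chooses).
b0_first, coef_first = dummy_fit(1)
b0_last, coef_last = dummy_fit(3)

print(b0_first, coef_first)   # 5.0 {2: 3.0, 3: 8.0}
print(b0_last, coef_last)     # 13.0 {1: -8.0, 2: -5.0}

# The coefficients differ, but the fitted value for every group is the
# same group mean under either parameterization.
fitted_first = {g: b0_first + coef_first.get(g, 0.0) for g in means}
fitted_last = {g: b0_last + coef_last.get(g, 0.0) for g in means}
assert fitted_first == fitted_last == means
```

In Stata itself, if I recall the factor-variable syntax correctly, you can set the base level directly, e.g. ib3.adcontent or ib(last).adcontent, rather than recoding the variable by hand.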
I hope this helps.
David Hoaglin
On Mon, Jan 6, 2014 at 5:53 PM, Pepper, Jessica <[email protected]> wrote:
> Thanks for sending that link. I followed those instructions and got results that made sense. I just have 2 follow-up questions:
>
> 1. I understand that the 2 approaches (ANOVA and regress/test) don't correspond. When I follow the UCLA procedure that you sent the link to, it confirms what I initially found in the ANOVA and also shows me the contrasts, which is what I really need. All that is great. But why, in essence, should I "trust" the ANOVA over the regression? Why are the p values from the regression wrong?
> 2. The procedure on the UCLA site defaults to treating the highest level of the variable as the reference category. That doesn't matter for my two-level variable, but it does for my 3-level variable, correct? And if so, is there an easy way to tell it to treat the lowest level as the reference category? Or should I just manually create a new variable that switches those levels?
>
> I hope these questions make sense. I am new to Stata and have never encountered a situation where ANOVA and regression don't agree.
>
> Many thanks for your help.
> Jess
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/