
Re: st: Why does -anova- calculate repeated effects (F) twice?


From: khigbee@stata.com
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Why does -anova- calculate repeated effects (F) twice?
Date: Mon, 09 Feb 2004 12:59:43 -0600

Michael Ingre <Michael.Ingre@ipm.ki.se> asks:

> I have now made several repeated -anova- and noticed some odd
> behaviour in the procedure. When -anova- is run with the repeated
> option, it first calculates the standard anova table (including
> F), then for each effect with repeated variables it calculates
> the epsilon coefficients. But then, it seems like Stata starts
> calculating the effects (with F) once again. Is this really
> necessary?
> 
> It is my understanding that once the epsilon coefficients are
> calculated you only need to run the F-test once again with the
> corrected degrees of freedom. There is no need to recalculate F.
> This is also how it is explained in the Stata manual p51.
> 
> Given the time it takes to compute F in repeated designs
> (sometimes minutes and even hours as has been discussed lately on
> this list) and the apparently very fast calculation of epsilon
> (Stata displays it almost immediately) it seems to me that the
> speed of -anova- with repeated measures could almost be doubled
> if the second calculation of F were to be omitted.

You are correct that a speed improvement is possible here.  It is
something I have thought about and will some day implement.

The main reason why Stata's -anova- is doing some of the
computations twice has to do with the way the code was structured
long ago when results were returned as _result(#) instead of in
e() or r().  (Some list members will remember those days.  How
many of you remember typing -disp_res- and then either going back
to the manuals to figure out what _result(6) was or trying to
guess it by matching the number with the numbers in the output?)

The number of items returned in _result() was limited, and so the
mean squares, F stats, etc. for each term in the ANOVA model were
not returned.  (Remember that there can be hundreds of terms in
an -anova- -- think of binary right-hand-side variables with lots
of interaction terms.)  The mean squares and F stats for each
term were recomputed as needed on replay or when the -test-
command was invoked for a particular term.  By the time the
repeated measures part of the -anova- starts producing results,
the intermediate results for each term have already been
discarded, so they must be recomputed as needed.
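
The recompute-on-demand pattern described above can be sketched in a few
lines of Python.  This is only an illustration of the idea, not Stata's
internal code: the class name, the method names, and the toy sums of
squares are all hypothetical, and the "expensive" per-term step is
represented by a simple counter.

```python
class RecomputingAnova:
    """Hypothetical sketch: per-term results are never stored, only rebuilt."""

    def __init__(self, term_ss, term_df, resid_ss, resid_df):
        self._term_ss = term_ss            # term name -> sum of squares
        self._term_df = term_df            # term name -> degrees of freedom
        self.resid_ms = resid_ss / resid_df
        self.evaluations = 0               # counts the (assumed costly) per-term passes

    def term_f(self, term):
        # Stand-in for the costly step; a real fit would do far more work here.
        self.evaluations += 1
        ms = self._term_ss[term] / self._term_df[term]
        return ms / self.resid_ms

    def main_table(self):
        # First pass: F for every term in the model.
        return {t: self.term_f(t) for t in self._term_ss}

    def repeated_table(self, repeated_terms):
        # Nothing was kept from main_table(), so F is rebuilt for these terms.
        return {t: self.term_f(t) for t in repeated_terms}


# Toy numbers, made up for illustration.
fit = RecomputingAnova({"subject": 40.0, "time": 30.0},
                       {"subject": 4, "time": 3},
                       resid_ss=24.0, resid_df=12)
fit.main_table()
fit.repeated_table(["time"])
print(fit.evaluations)  # 3: two terms in the main table, one rebuilt afterwards
```

The counter shows the extra work: the repeated term is evaluated once for
the main table and once more for the correction table.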

In contrast to -anova-, the -manova- command that was added in
Stata 8 returns detailed results for each term in the model.  It
did not begin its life under the old (limited) _result(#) way of
returning results.

Some time in the future I would like to alter -anova- to return
information (mean squares etc.) for each term in the -anova-.
After I do this, the F stats will not need to be recomputed for
the repeated measures correction table, and that table will be
presented very quickly.  Also, replay of an -anova- would then be
almost instantaneous.
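
As a rough sketch of that design, assuming per-term results are stored
once at fit time, the correction table only has to scale the degrees of
freedom by epsilon and reuse the stored F.  The names below (TermResult,
fit_terms, corrected_row) and the numbers, including the epsilon of 0.5,
are hypothetical and are not part of -anova-.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermResult:
    """Per-term results kept after the fit (hypothetical layout)."""
    df: float
    ms: float
    f: float


def fit_terms(term_ss, term_df, resid_ss, resid_df):
    # Compute each term's mean square and F once, and keep them.
    resid_ms = resid_ss / resid_df
    stored = {}
    for term, ss in term_ss.items():
        ms = ss / term_df[term]
        stored[term] = TermResult(df=term_df[term], ms=ms, f=ms / resid_ms)
    return stored


def corrected_row(result, resid_df, epsilon):
    # F is reused unchanged; only the degrees of freedom are scaled by epsilon.
    return {"f": result.f,
            "df1": epsilon * result.df,
            "df2": epsilon * resid_df}


# Toy numbers, made up for illustration.
stored = fit_terms({"time": 30.0}, {"time": 3}, resid_ss=24.0, resid_df=12)
row = corrected_row(stored["time"], resid_df=12, epsilon=0.5)
print(row)  # {'f': 5.0, 'df1': 1.5, 'df2': 6.0}
```

The corrected p-value would then come from the F distribution evaluated
at the stored F with the scaled degrees of freedom; that lookup is left
out here to keep the sketch free of external libraries.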

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC



