Re: st: Why does -anova- calculate repeated effects (F) twice?
Michael Ingre <Michael.Ingre@ipm.ki.se> asks:
> I have now made several repeated -anova- and noticed some odd
> behaviour in the procedure. When -anova- is run with the repeated
> option, it first calculates the standard anova table (including
> F), then for each effect with repeated variables it calculates
> the epsilon coefficients. But then, it seems like Stata starts
> calculating the effects (with F) once again. Is this really
> It is my understanding that once the epsilon coefficients are
> calculated you only need to run the F-test once again with the
> corrected degrees of freedom. There is no need to recalculate F.
> This is also how it is explained in the Stata manual p51.
> Given the time it takes to compute F in repeated designs
> (sometimes minutes and even hours as has been discussed lately on
> this list) and the apparently very fast calculation of epsilon
> (Stata displays it almost immediately) it seems to me that the
> speed of -anova- with repeated measures could almost be doubled
> if the second calculation of F were to be omitted.
You are correct that a speed improvement is possible here. It is
something I have thought about and will some day implement.
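To illustrate the point with a minimal sketch (the numbers below are made up for illustration and are not output from any real -anova- run; the scalar names are likewise hypothetical): once F, its degrees of freedom, and an epsilon are in hand, the corrected p-value needs only the reverse cumulative F distribution function Ftail() evaluated at the scaled degrees of freedom. F itself is not recomputed.

```stata
* Hypothetical values, assumed for illustration only
scalar Fstat = 24.76    // F statistic for the repeated term
scalar df1   = 3        // numerator degrees of freedom
scalar df2   = 12       // denominator degrees of freedom
scalar eps   = 0.6      // an epsilon coefficient (e.g. Greenhouse-Geisser)

* Uncorrected p-value
display "regular p   = " Ftail(df1, df2, Fstat)

* Corrected p-value: same F, both degrees of freedom multiplied by epsilon
display "corrected p = " Ftail(eps*df1, eps*df2, Fstat)
```

Ftail() accepts noninteger degrees of freedom, so the scaled values can be passed directly.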
The main reason why Stata's -anova- is doing some of the
computations twice has to do with the way the code was structured
long ago when results were returned as _result(#) instead of in
e() or r(). (Some list members will remember those days. How
many of you remember typing -disp_res- and then either going back
to the manuals to figure out what _result(6) was or trying to
guess it by matching the number with the numbers in the output?)
The number of items returned in _result() was limited, and so the
mean squares, F stats, etc. for each term in the ANOVA model were
not returned. (Remember that there can be hundreds of terms in
an -anova- -- think of binary right-hand-side variables with lots
of interaction terms.) The mean squares and F stats for each
term were recomputed as needed on replay or when the -test-
command was invoked for a particular term. By the time the
repeated measures part of the -anova- starts producing results,
the intermediate results for each term have already been
discarded and must be recomputed as needed.
In contrast to -anova-, the -manova- command that was added in
Stata 8 returns detailed results for each term in the model. It
did not begin its life under the old (limited) _result(#) way of
Some time in the future I would like to alter -anova- to return
information (mean squares etc.) for each term in the -anova-.
After I do this the table presented for the repeated measures
corrections will not need to recompute the F stat, and will be
presented very quickly. Also, replay of an -anova- would then be
Ken Higbee email@example.com