From
khigbee@stata.com

To
statalist@hsphsun2.harvard.edu

Subject
Re: st: Why does -anova- calculate repeated effects (F) twice?

Date
Mon, 09 Feb 2004 12:59:43 -0600

Michael Ingre <Michael.Ingre@ipm.ki.se> asks:

> I have now made several repeated -anova- and noticed some odd
> behaviour in the procedure. When -anova- is run with the repeated
> option, it first calculates the standard anova table (including
> F), then for each effect with repeated variables it calculates
> the epsilon coefficients. But then, it seems like Stata starts
> calculating the effects (with F) once again. Is this really
> necessary?
>
> It is my understanding that once the epsilon coefficients are
> calculated you only need to run the F-test once again with the
> corrected degrees of freedom. There is no need to recalculate F.
> This is also how it is explained in the Stata manual p51.
>
> Given the time it takes to compute F in repeated designs
> (sometimes minutes and even hours as has been discussed lately on
> this list) and the apparently very fast calculation of epsilon
> (Stata displays it almost immediately) it seems to me that the
> speed of -anova- with repeated measures could almost be doubled
> if the second calculation of F were to be omitted.

You are correct that a speed improvement is possible here. It is
something I have thought about and will some day implement.

The main reason why Stata's -anova- is doing some of the computations
twice has to do with the way the code was structured long ago when
results were returned as _result(#) instead of in e() or r(). (Some
list members will remember those days. How many of you remember
typing -disp_res- and then either going back to the manuals to figure
out what _result(6) was or trying to guess it by matching the number
with the numbers in the output.)

The number of items returned in _result() was limited, and so the mean
squares, F stats, etc. for each term in the ANOVA model were not
returned. (Remember that there can be hundreds of terms in an -anova-
-- think of binary right-hand-side variables with lots of interaction
terms.)
The mean squares and F stats for each term were recomputed as needed
on replay or when the -test- command was invoked for a particular
term. By the time the repeated measures part of the -anova- starts
producing results, the intermediate results for each term have already
passed away and are recomputed as needed.

In contrast to -anova-, the -manova- command that was added in Stata 8
returns detailed results for each term in the model. It did not begin
its life under the old (limited) _result(#) way of returning results.

Some time in the future I would like to alter -anova- to return
information (mean squares etc.) for each term in the -anova-. After I
do this the table presented for the repeated measures corrections will
not need to recompute the F stat, and will be presented very quickly.
Also, replay of an -anova- would then be almost instantaneous.

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

