re: st: Extremely poor performance in repeated ANOVA
Michael Ingre wrote:
Please send me the data set if it's not private, and I will run on my
Powerbook to compare times. I'm curious about this. I have a 1.25 GHz
I have tried fitting a repeated measures anova in Stata and I was
surprisingly disappointed with the performance. My dataset contains 17
subjects observed 20 times a day during three different days. It is a
two-factor repeated measures ANOVA with a total of 1020 observations.
. anova dv subject day / subject*day time / subject*time day*time
I timed it this morning and Stata/SE (8.2) took 7 minutes 30 seconds to
complete the analyses!!!!
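For anyone who wants to reproduce the timing, here is a minimal sketch. It assumes the data set with the variables dv, subject, day and time from the command above is already in memory; nothing else about the data is known here. The setting `set rmsg on` asks Stata to print the elapsed time after each command.

```stata
* Sketch only: assumes dv, subject, day and time are already loaded.
* Turn on return messages so Stata reports elapsed time per command.
set rmsg on

* The model exactly as given in the post.
anova dv subject day / subject*day time / subject*time day*time

* Switch the timing messages off again.
set rmsg off
```

The time reported is wall-clock time for the command, so it is comparable to the stopwatch figure quoted above rather than to the processor time SPSS reports.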
Well, comparison with SAS was not really fair, because SAS was using
Proc Mixed, which computes the problem differently. But it is true that
for the kind of problem above, Stata's ANOVA routines are not worth
My computer is not the fastest in the world (PowerBook G4, 800 MHz,
RAM) but SPSS ran the same model in seconds!!! (SPSS reports 2 seconds
processor time but there is some overhead). And my experience from
models in SPSS and StatView (StatView does not calculate epsilon) over
the last five years or so is that it should run in seconds rather than
minutes even if the model is considerably larger.
At first I thought this was a bug or a mistake of mine; however, I found
another thread on the list describing a similar experience that was
suggested to be neither a bug nor a mistake. David Airey describes a somewhat
larger ANOVA (38400 obs, 2 between, 4 within) that did not finish
hours but was (supposedly) run in 30 seconds in SAS.
In general, most statisticians I've consulted with will prefer a mixed
model approach as implemented in SAS Proc Mixed or S-Plus/R LME/NLME
for repeated measures designs with between subject factors also
present. If those are not available, the MANOVA approaches are
preferred, which lead more readily to valid post-hoc tests (Stata
now has MANOVA routines). Least favored seems to be univariate repeated
measures ANOVA, although it is not the least used as far as I can see!
The education of statistics users lags behind what statisticians use.
I would appreciate any comment from listers with experience in repeated
Also, with regard to post-hoc testing, although the epsilon corrected
omnibus tests may be OK using repeated measures ANOVA, this may not be
true for post-hoc tests. If you use mixed model ANOVAs a lot, complete
with post-hoc tests, you need to consider your options:
As for me, the more I use Stata, the more I like it, but the more I
mess around with statistics, the more tools I wind up exploring (Data
Desk, Stata, and R, so far).
For biologists using statistics, the main weaknesses of Stata are
currently a lack of a routine like SAS Mixed or R LME/NLME, an
exploratory graphics (vector?) engine for 3-D rotation (brushing,
linked active plots, etc.), and maybe more exploratory multivariate
techniques (so I've heard some say). You can see from this why I have
explored Data Desk and R.
And I would especially appreciate an official comment from StataCorp. Is
this the performance we should expect? Is Stata planning to improve the
performance or is anova a low priority procedure in Stata?
For me this is a bit of a setback because I have spoken
about Stata to my colleagues (for very good reasons of course) and I
would like us to switch completely to Stata and use it as our standard
But I'm quite sure my colleagues would not accept the performance I
experienced. We use ANOVA a lot and to switch to Stata for ANOVA would be
like moving 10 years back in time.