
re: st: Extremely poor performance in repeated ANOVA

From	David Airey <[email protected]>
To	[email protected]
Subject	re: st: Extremely poor performance in repeated ANOVA
Date	Tue, 3 Feb 2004 09:55:34 -0600

Michael Ingre wrote:

Dear listers

I have tried fitting a repeated measures anova in Stata and I was
surprisingly disappointed with the performance. My dataset contains 17
subjects observed 20 times a day during three different days. It is a simple
two-factor repeated measures ANOVA with a total of 1020 observations.

. anova dv subject day / subject*day time / subject*time day*time, repeated(day time)

I timed it this morning and Stata/SE (8.2) took 7 minutes 30 seconds to
complete the analyses!!!!

Please send me the data set if it's not private, and I will run it on my PowerBook to compare times. I'm curious about this. I have a 1.25 GHz PowerBook.
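
If the data cannot be shared, a stand-in with the same layout (17 subjects x 3 days x 20 times = 1020 observations) could be simulated to compare timings across machines. The following is only a sketch: the variable names are taken from the command above, the outcome is pure random noise (an assumption, so the F tests are meaningless and only the run time is of interest), and the seed is arbitrary. The -version- line is there on the assumption that the snippet might be run in a newer Stata, where -anova- uses a different interaction syntax.

```stata
* Sketch only: simulated data with the same layout as the reported design.
* dv is random noise (assumption); only the timing is informative.
version 8.2
clear
set obs 17
gen subject = _n
expand 3
bysort subject: gen day = _n
expand 20
bysort subject day: gen time = _n
count                          // should report 1020
set seed 12345
gen dv = invnorm(uniform())
set rmsg on                    // print elapsed time after each command
anova dv subject day / subject*day time / subject*time day*time, repeated(day time)
set rmsg off
```

With -set rmsg on-, Stata prints the elapsed time after each command, so the same do-file gives directly comparable figures on both machines.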

My computer is not the fastest in the world (PowerBook G4, 800 MHz, 640 MB
RAM), but SPSS ran the same model in seconds!!! (SPSS reports 2 seconds of
processor time, but there is some overhead.) And my experience from similar
models in SPSS and StatView (StatView does not calculate epsilon) over the
last five years or so is that it should run in seconds rather than
minutes, even if the model is considerably larger.

At first I thought this was a bug or a mistake of mine; however, I found
another thread on the list describing a similar experience that was not
suggested to be either a bug or a mistake. David Airey describes a somewhat
larger ANOVA (38400 obs, 2 between, 4 within) that did not finish within 8
hours but was (supposedly) run in 30 seconds in SAS.
http://www.stata.com/statalist/archive/2003-09/msg00598.html

Well, comparison with SAS was not really fair, because SAS was using Proc Mixed, which computes the problem differently. But it is true that for the kind of problem above, Stata's ANOVA routines are not worth using.

I would appreciate any comment from listers with experience in repeated
measures anova.

In general, most statisticians I've consulted with prefer a mixed model approach, as implemented in SAS Proc Mixed or S-Plus/R LME/NLME, for repeated measures designs with between-subject factors also present. If those are not available, the MANOVA approaches are preferred, which lead more readily to valid post-hoc tests (Stata now has MANOVA routines). Least favored seems to be univariate repeated measures ANOVA, although it is not the least used as far as I can see! The education of statistics users lags what statisticians use.

Also, with regard to post-hoc testing, although the epsilon corrected omnibus tests may be OK using repeated measures ANOVA, this may not be true for post-hoc tests. If you use mixed model ANOVAs a lot, complete with post-hoc tests, you need to consider your options:

http://www.ats.ucla.edu/stat/stata/faq/compare_packages.htm

As for me, the more I use Stata, the more I like it, but the more I mess around with statistics, the more tools I wind up exploring (Data Desk, Stata, and R, so far).

For biologists using statistics, the main weaknesses of Stata are currently the lack of a routine like SAS Mixed or R LME/NLME, the lack of an exploratory graphics (vector?) engine for 3-D rotation (brushing, linked active plots, etc.), and maybe the need for more exploratory multivariate techniques (so I've heard some say). You can see from this why I have explored Data Desk and R.

-Dave

And I would especially appreciate an official comment from StataCorp. Is
this the performance we should expect? Is Stata planning to improve the
performance or is anova a low priority procedure in Stata?

For me this is a bit of a setback because I have spoken well
about Stata to my colleagues (for very good reasons of course) and I would
like us to switch completely to Stata and use it as our standard package.
But I'm quite sure my colleagues would not accept the performance I have
experienced. We use ANOVA a lot and to switch to Stata for ANOVA would be
like moving 10 years back in time.

