
From: Joseph Coveney <jcoveney@bigplanet.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: Re: st: GLM and ANOVA complaints
Date: Sun, 28 Sep 2003 14:22:23 +0900

David Airey mentions that his factorial repeated-measures analysis of variance is taking more than eight hours to finish. David describes his analysis as "384 observations, 2 between subject factors, 4 within subject factors," which, if it's balanced, would be a 2 × 2 × 2 × 2 × 2 × 2 repeated-measures ANOVA, with the last four factors as repeated measurements. I know that -anova- can take a while to complete when there are numerous factors and their interactions to estimate, but eight hours seems long to me for a problem of this size.

To see how rapidly this analysis would run on my laptop (2 GHz nominal, 512 megabytes RAM, Windows XP), I created an artificial dataset that mimics David's in what I understand to be his experimental design. The do-file is attached below. For reference, the between-subject factors are named prt (pretreatment) and trt (treatment), and the within-subject factors are named alphabetically (A, B, C and D). The statistical model of the data was fully saturated, that is, with all interaction terms, and I believe, although I am not certain, that I specified it correctly.

-anova- took 25 minutes, including floppy disc access time to log the output. This is longer, of course, than the 30 seconds claimed for SAS's PROC MIXED, but not hours longer. I did not use (or need) a matrix size of 6000, but I doubt that setting the matrix size limit that large would have substantially increased the computation time.
Joseph Coveney

-------------------------------------------------------------------------------

clear
set more off
set matsize 2400
set obs 384
set seed 20030928
* First between-subject factor (pretreatment)
generate byte prt = _n > _N / 2
* Second between-subject factor (treatment)
sort prt
generate byte trt = mod(_n, 2)
* Subject identifier
sort trt prt
generate byte pid = mod(_n, 16) == 1
replace pid = sum(pid)
tabulate prt trt
* Balanced completely randomized factorial design
* First within-subjects factor
sort pid // Not really necessary
generate byte A = mod(_n, 2)
* Second within-subjects factor
sort pid A
generate byte B = mod(_n, 2)
* Third within-subjects factor
sort pid A B
generate byte C = mod(_n, 2)
* Fourth within-subjects factor
sort pid A B C
generate byte D = mod(_n, 2)
sort pid A B C D
by pid: generate float latent_variable = invnorm(uniform()) if _n == 1
by pid: replace latent_variable = latent_variable[1]
generate float dep = 0.7 * latent_variable + (1 - 0.7^2) * invnorm(uniform())
drop latent_variable
* Strictly additive (no interactions of any factors)
replace dep = dep - prt / 6 + trt / 6 - A / 6 + B / 6 - C / 6 + D / 6
capture log close
log using complicated_anova.smcl, replace
set rmsg on
anova dep prt trt prt*trt / prt*trt|pid ///
  A prt*A trt*A prt*trt*A / prt*trt*A|pid ///
  B prt*B trt*B prt*trt*B / prt*trt*B|pid ///
  A*B prt*A*B trt*A*B prt*trt*A*B / prt*trt*A*B|pid ///
  C prt*C trt*C prt*trt*C / prt*trt*C|pid ///
  A*C prt*A*C trt*A*C prt*trt*A*C / prt*trt*A*C|pid ///
  B*C prt*B*C trt*B*C prt*trt*B*C / prt*trt*B*C|pid ///
  A*B*C prt*A*B*C trt*A*B*C prt*trt*A*B*C / prt*trt*A*B*C|pid ///
  D prt*D trt*D prt*trt*D / prt*trt*D|pid ///
  A*D prt*A*D trt*A*D prt*trt*A*D / prt*trt*A*D|pid ///
  B*D prt*B*D trt*B*D prt*trt*B*D / prt*trt*B*D|pid ///
  C*D prt*C*D trt*C*D prt*trt*C*D / prt*trt*C*D|pid ///
  A*B*D prt*A*B*D trt*A*B*D prt*trt*A*B*D / prt*trt*A*B*D|pid ///
  A*C*D prt*A*C*D trt*A*C*D prt*trt*A*C*D / prt*trt*A*C*D|pid ///
  B*C*D prt*B*C*D trt*B*C*D prt*trt*B*C*D / prt*trt*B*C*D|pid ///
  A*B*C*D prt*A*B*C*D trt*A*B*C*D prt*trt*A*B*C*D
log close
help smileplot
exit

--------------------------------------------------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
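The design described in the message (384 observations, two between-subject and four within-subject two-level factors) implies 24 subjects with 16 rows each, and 6 observations in each of the 2^6 = 64 design cells. The following is a minimal sketch, not the do-file's own method, that builds the same layout with -expand- and checks the balance with -assert-. The variable names `cell` and `design_cell` are hypothetical helpers introduced here; the allocation of subjects to `prt` and `trt` is an assumption chosen only to give 6 subjects per between-subject cell.

```stata
* Sketch: build a balanced 2x2 (between) by 2x2x2x2 (within) layout and verify it
clear
set obs 24                                  // 24 subjects = 384 / 16
generate byte pid = _n
generate byte prt = pid > 12                // assumed allocation: 12 subjects per level
generate byte trt = mod(pid, 2)             // assumed allocation: alternate within prt
expand 16                                   // 16 within-subject cells per subject
bysort pid: generate byte cell = _n - 1     // hypothetical helper, 0..15
generate byte A = mod(cell, 2)
generate byte B = mod(floor(cell / 2), 2)
generate byte C = mod(floor(cell / 4), 2)
generate byte D = mod(floor(cell / 8), 2)
drop cell
* Checks: total size, one row per subject-by-within-cell, 6 rows per design cell
assert _N == 384
isid pid A B C D
egen byte design_cell = group(prt trt A B C D)
bysort design_cell: assert _N == 6
```

If any -assert- fails, Stata stops with an error, so a silent run indicates the layout is balanced as assumed. The same three checks (from `assert _N == 384` onward) can also be run on the data produced by the do-file in the message, just before its -anova- command.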

**Follow-Ups**: **st: Re: GLM and ANOVA complaints** *From:* Friedrich Huebler <huebler@rocketmail.com>

