The Stata listserver

Re: st: GLM and ANOVA complaints


From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   Re: st: GLM and ANOVA complaints
Date   Mon, 29 Sep 2003 09:01:16 +0900

David Airey posted:

-------------------------------------------------------------------------------

Since my ANOVA was not finishing anytime soon, and some interesting  
posts were being added to the thread, I broke out and have posted the  
do file and partial results below. You can see that the timing of this  
model is astonishingly slow, which makes me suspicious. The way I wrote  
the model did not leave anything to the residual, but this can be  
modified by moving the term D*T*P*I*A|S*F to the residual. In terms of  
timing to complete computations I don't think this matters. Joseph's  
model is not what I had in mind. By 4 within-subject factors I mean  
repeated measures. In his model there is no repeated() option  
indicating this.

My actual data set has instead of 2 levels of strain, 20 levels of  
strain and 38,400 observations. Such a data set does not compute in  
Stata 8/SE using ANOVA. (Previously, Ken Higbee pointed out that this  
might be another story if I use MANOVA, because it leaves the within  
factors and between factors on different sides of the model and has a  
smaller X'X requirement.) Monday I meet with a statistician who helps  
run Vanderbilt's VAMPIRE parallel system. We'll determine where SAS's  
desktop limits lie with this data set, first. I've posted here because  
it could be that something is wrong with my computer or do-file that is  
causing this molasses-like performance. I was hoping by posting this  
do-file that some users with fast PCs could run my do-file and tell me  
if the problem is my computer or the OS X version of Stata.

I apologize if what gets posted is not in monotype font. It always is  
when I send.

-Dave Airey

--------------------------------------------------------------------------------

A couple of quick observations: (1) David might have trouble with -manova- 
since each mouse has 48 observations worth of covariances to estimate and only 
eight records (animals) to do it with (and even with the full dataset of 20 
strains and 38,400 observations--or 800 animals--it would still appear 
problematic to estimate the 1100-1200 elements in the variance-covariance 
matrix), (2) this might be why the -repeated()- option is having trouble, too, 
since it's trying to estimate deviation from sphericity/compound symmetry in 
the same vein, (3) the ANOVA table was completed and, if David assumes 
sphericity (which he might have to do, given the number of records/mice 
available to actually assess the assumption), then the analysis has completed 
as far as it can--and this might be true with any software package that 
estimates epsilons.  (I believe that SAS's PROC MIXED is able to partition the 
variance-covariance matrix into blocks according to the within-subject factor 
so that fewer off-diagonal elements need to be estimated, and this might enable 
full-rank MANOVA to be done here.)
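
The counts behind observation (1) can be checked with a few lines of 
arithmetic. The sketch below is in Python only for illustration; the function 
and variable names are made up for this example, and the only inputs are the 
figures quoted above (48 observations per mouse, eight animals, 38,400 
observations in the full dataset). It assumes an unstructured 
variance-covariance matrix across the 48 repeated measurements.

```python
def unique_covariance_elements(p):
    """Distinct entries (variances plus covariances) of a symmetric p x p matrix."""
    return p * (p + 1) // 2


def off_diagonal_elements(p):
    """Distinct covariances only, i.e. entries strictly below the diagonal."""
    return p * (p - 1) // 2


obs_per_mouse = 48       # repeated measurements per animal (from the post)
mice_small = 8           # animals in the posted example (from the post)
total_obs_full = 38_400  # observations in the full dataset (from the post)

# Number of animals implied by the full dataset.
mice_full = total_obs_full // obs_per_mouse

# Parameters in an unstructured 48 x 48 variance-covariance matrix.
n_all = unique_covariance_elements(obs_per_mouse)
n_cov = off_diagonal_elements(obs_per_mouse)

# A sample covariance matrix from n animals has rank at most n - 1
# (strain structure ignored here), so it cannot be full rank when n - 1 < p.
max_rank_small = min(obs_per_mouse, mice_small - 1)

print("animals in full dataset:", mice_full)
print("variances + covariances:", n_all)
print("covariances only:", n_cov)
print("max rank with 8 animals:", max_rank_small, "of", obs_per_mouse)
```

Both counts fall in the 1100-1200 range mentioned above, and with eight animals 
the sample matrix is necessarily rank-deficient.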

Other than a typographical error in mine (a missing forward slash in one line 
of the model specification), our models are fundamentally the same:  both 
David's and mine are repeated-measures (split-plot) analyses of variance; I 
didn't include the -repeated()- option in my repeated-measures ANOVA because I 
had only two levels for each within-subject factor, rendering the option 
unnecessary.  I've never tried it, so I'm not sure whether a zero residual mean 
square will affect the -repeated()- option's operation.
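
To illustrate why two-level factors make the epsilon corrections moot: the 
correction factor for a within-subject effect lies between 1/df and 1, where df 
is the effect's numerator degrees of freedom, so an effect with one degree of 
freedom has nothing to correct. A minimal sketch in Python (the function name 
is hypothetical, and this computes only the lower bound, not the 
Greenhouse-Geisser or Huynh-Feldt estimates themselves):

```python
from fractions import Fraction
from math import prod


def epsilon_lower_bound(*levels):
    """Lower bound 1/df on epsilon for a within-subject effect.

    `levels` holds the number of levels of each within-subject factor
    involved in the effect; df is the product of (levels - 1).
    """
    if not levels or any(k < 2 for k in levels):
        raise ValueError("each factor needs at least two levels")
    df = prod(k - 1 for k in levels)
    return Fraction(1, df)


# Main effect of a two-level factor: the bound is 1, so epsilon must equal 1.
print(epsilon_lower_bound(2))

# Four-way interaction of two-level factors: still one degree of freedom.
print(epsilon_lower_bound(2, 2, 2, 2))

# A three-level factor leaves room for a correction (bound of 1/2).
print(epsilon_lower_bound(3))
```

With every within-subject factor at two levels, each within-subject main effect 
and interaction has a single degree of freedom, which is why the option adds 
nothing in that case.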

Joseph Coveney

P.S. The post appears in monotype font in my browser.


