
From: "Ploutz-Snyder, Robert (JSC-SK)[USRA]" <robert.ploutz-snyder-1@nasa.gov>

To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject: st: RE: re: RM ANOVA, was SPSS vs. Stata

Date: Mon, 2 Aug 2010 12:29:26 -0500

"Doesn't SPSS wrap GLM for its RM-ANOVA routines?"

Yes--but with repeated measures designs, SPSS (and SAS, Systat, and BMDP in the old days) use listwise elimination. Stata does not (is there an option in Stata's anova, repeated() code to do so??)

"Can you post an example of what you are talking about, re listwise elimination? I don't have SPSS."

Here's an example of how Stata fails to ignore/eliminate listwise for a fixed-factorial Repeated Measures ANOVA, compared to SPSS.

IN STATA:

    webuse t43
    anova score person drug, repeated(drug)

                      Number of obs =      20     R-squared     =  0.9244
                      Root MSE      = 3.06594     Adj R-squared =  0.8803

         Source |  Partial SS    df       MS           F     Prob > F
     -----------+----------------------------------------------------
          Model |        1379     7         197      20.96     0.0000
                |
         person |       680.8     4       170.2      18.11     0.0001
           drug |       698.2     3  232.733333      24.76     0.0000
                |
       Residual |       112.8    12         9.4
     -----------+----------------------------------------------------
          Total |      1491.8    19  78.5157895

    Between-subjects error term:  person
                         Levels:  5         (4 df)
         Lowest b.s.e. variable:  person

    Repeated variable: drug
                                 Huynh-Feldt epsilon        =  1.0789
                                 *Huynh-Feldt epsilon reset to 1.0000
                                 Greenhouse-Geisser epsilon =  0.6049
                                 Box's conservative epsilon =  0.3333

                                   ------------ Prob > F ------------
         Source |     df      F    Regular    H-F      G-G      Box
     -----------+----------------------------------------------------
           drug |      3    24.76   0.0000   0.0000   0.0006   0.0076
       Residual |     12
     ----------------------------------------------------------------

IN SPSS:

    Tests of Within-Subjects Effects
    Measure: MEASURE_1

    Source                            Type III Sum     df       Mean Square   F        Sig.
                                      of Squares
    drug         Sphericity Assumed   698.200           3       232.733       24.759   .000
                 Greenhouse-Geisser   698.200           1.815   384.763       24.759   .001
                 Huynh-Feldt          698.200           3.000   232.733       24.759   .000
                 Lower-bound          698.200           1.000   698.200       24.759   .008
    Error(drug)  Sphericity Assumed   112.800          12         9.400
                 Greenhouse-Geisser   112.800           7.258    15.540
                 Huynh-Feldt          112.800          12.000     9.400
                 Lower-bound          112.800           4.000    28.200

So Stata and SPSS agree on the Repeated Measures F-statistic on Drug--because there is no missing data in this dataset. However, if we eliminate an observation here and there for a couple of subjects, SPSS and Stata fail to agree because Stata does not eliminate or ignore cases listwise. For example

IN STATA (using same dataset, but eliminating a couple of obs):

    replace score = . in 1     /* eliminated person 1's score for drug 1 */
    replace score = . in 10    /* eliminated person 3's score for drug 2 */
    anova score person drug, repeated(drug)

                      Number of obs =      18     R-squared     =  0.9414
                      Root MSE      =  2.9068     Adj R-squared =  0.9004

         Source |  Partial SS    df       MS           F     Prob > F
     -----------+----------------------------------------------------
          Model |  1357.28267     7  193.897525      22.95     0.0000
                |
         person |  653.704895     4  163.426224      19.34     0.0001
           drug |  702.504895     3  234.168298      27.71     0.0000
                |
       Residual |  84.4951049    10  8.44951049
     -----------+----------------------------------------------------
          Total |  1441.77778    17  84.8104575

    Between-subjects error term:  person
                         Levels:  5         (4 df)
         Lowest b.s.e. variable:  person

    Repeated variable: drug
                                 Huynh-Feldt epsilon        =  0.5297
                                 Greenhouse-Geisser epsilon =  0.4228
                                 Box's conservative epsilon =  0.3333

                                   ------------ Prob > F ------------
         Source |     df      F    Regular    H-F      G-G      Box
     -----------+----------------------------------------------------
           drug |      3    27.71   0.0000   0.0019   0.0047   0.0102
       Residual |     10
     ----------------------------------------------------------------

NOTE that Stata is still using data from all subjects (levels = 5).

IN SPSS (same dataset):

    Tests of Within-Subjects Effects

    Source                            Type III Sum     df       Mean Square   F        Sig.
                                      of Squares
    drug         Sphericity Assumed   478.333           3       159.444       13.932   .004
                 Greenhouse-Geisser   478.333           1.268   377.157       13.932   .044
                 Huynh-Feldt          478.333           2.466   193.938       13.932   .008
                 Lower-bound          478.333           1.000   478.333       13.932   .065
    Error(drug)  Sphericity Assumed    68.667           6        11.444
                 Greenhouse-Geisser    68.667           2.537    27.071
                 Huynh-Feldt           68.667           4.933    13.920
                 Lower-bound           68.667           2.000    34.333

So in this admittedly simple example, SPSS reports F(3,6) = 13.932, p~.004, whereas Stata shows F(3,10) = 27.71, which is larger than in the original analysis with no missing data. Of course, with a sample size this tiny, we wouldn't trust either analysis. The point is that the prevailing wisdom for fixed-factorial repeated measures ANOVA is to use listwise elimination, and Stata doesn't do this. (And you get the same Stata results if you use the anova command without the repeated option but instead define the error terms manually--a process that is itself painful enough to avoid entirely if you have 2 or 3 factors, especially if more than one of them is repeated.)

I appreciate that it is possible to "manually" tell Stata to ignore listwise those subjects who are missing any data... However, this can get more complicated when there is more than one repeated measures factor (for example, drugs a b c, measured pre and post). And... exactly what is Stata's analysis "by default" anyway? I could not write that up as a standard repeated measures ANOVA because it isn't that.

To me, a straightforward improvement to Stata's -anova- would be to force it to ignore any subjects who are missing any repeated measures observations. That alone would be useful.

Rob

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Airey, David C
Sent: Monday, August 02, 2010 10:55 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: re: RM ANOVA, was SPSS vs. Stata

> What SPSS still maintains over Stata is better ANOVA routines,
> particularly Repeated-Measures fixed-factor designs. Stata treats RM
> designs a bit strangely, I believe because it seems to "wrap" ANOVA code
> around Regression methods. It's non-intuitive and can provide results
> that aren't typical of RM ANOVA (consider how it uses full-n for
> fixed-factor RM ANOVA without listwise elimination of subjects who are
> missing an observation). I would much prefer to see Stata invest in
> re-working their ANOVA code and analyses so that it is more consistent
> with SAS or SPSS methodologies, offers more in terms of assumption
> testing (ex. Sphericity tests), and is more intuitive.

Michael Mitchell pointed this out in his head to head to head comparison of Stata, SPSS, and SAS some years ago in a report posted at ATS UCLA. I don't know if this is true anymore with version 11.1 of xtmixed and the margins functionality.

This book shows use of xtmixed in designed experiments:

<http://www-personal.umich.edu/~bwest/almmussp.html>

BTW, you can test sphericity in Stata directly with the mvtest command or by asking for the univariate rm-anova corrections when you use the "repeated(varlist)" option to anova.

Doesn't SPSS wrap GLM for its RM-ANOVA routines?

Can you post an example of what you are talking about, re listwise elimination? I don't have SPSS.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
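The "manual" listwise elimination mentioned in the message can be sketched for the single-repeated-factor case as follows. This is only an illustration under stated assumptions, not part of the original post: the variable name nmiss is a hypothetical helper, and the sketch assumes the t43 data in long form (one row per person-by-drug combination) with the same two scores set to missing as in the example.

```stata
* Sketch: emulate SPSS-style listwise deletion before the RM ANOVA.
webuse t43, clear
replace score = . in 1     /* person 1, drug 1 */
replace score = . in 10    /* person 3, drug 2 */

* Count missing scores within each person (nmiss is a hypothetical name).
bysort person: egen nmiss = total(missing(score))

* Restrict the analysis to persons with complete data on every drug level.
anova score person drug if nmiss == 0, repeated(drug)
```

With persons 1 and 3 excluded, only three complete subjects remain, so the drug test should come out close to the F(3,6) = 13.932 that SPSS reports. With more than one repeated factor the same per-subject count still applies, provided every cell of the design is a row in the long-form data.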

**Follow-Ups**: **st: Re: RM ANOVA, was SPSS vs. Stata** *From:* "Joseph Coveney" <jcoveney@bigplanet.com>

**References**: **st: re: RM ANOVA, was SPSS vs. Stata** *From:* "Airey, David C" <david.airey@Vanderbilt.Edu>
