
From: "Joseph Coveney" <jcoveney@bigplanet.com>
To: "Statalist" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Anova and Contrasts with missing cells
Date: Fri, 31 Oct 2008 00:57:53 +0900

Thomas J. Steichen wrote (excerpted and with a few replies interwoven):

. . . both SAS and JMP interpreted a highly similar method of specifying the contrast as being what I intended. One can debate whether my intention was a reasonable intention but, assuming it was, Stata tested something else.

--------------------------------------------------------------------------------

JC: I'm not sure what your SAS contrast statement looks like, but it's possible that you were just tripped up by a difference in syntax between SAS and Stata.

--------------------------------------------------------------------------------

TJS: My question from my first post, "Which is right?", still stands. Maybe a better question is, What do the two contrasts, the one I used and the one Joe proposes, really say?

--------------------------------------------------------------------------------

JC: I have more specifics below on what the two contrasts really say. In general, I find that it helps to examine the parameterization and then define the contrast. It makes it easier to ensure that you're testing the hypothesis that you intend to. As Bill Gould mentioned in the post you cited, both Stata and SAS allow you to see their parameterizations, and so you can formulate your contrast in view of the parameterization the package is using.

--------------------------------------------------------------------------------

TJS: Interestingly, Joe uses the phrase "correct contrast", but I think he really only meant the contrast that tests what appears to be what I intended.

--------------------------------------------------------------------------------

JC: That's correct, er, right. I ought to have written more precisely; sorry. I do believe, though, that what I showed is the most reasonable contrast to make if you're interested in testing a difference between Round 1 and Round 3, and it seemed to be the one you intended.
It amounts to interpolating (predicting, as Bill Gould wrote in the posts you cite) the Size-600 value for Round 3. The contrast that you specified tests the difference between the Round 1 / Size 600 cell mean and the Round 3 / Size 800 cell mean.

--------------------------------------------------------------------------------

TJS: My actual point in quoting the above revolves around the choice of specifying the model as -anova nnn round size- versus -anova nnn round size|round- and the resulting impact on testing. As Joe says, the ANOVA estimates are the same (well, he doesn't say exactly that, but I think that is what he means by "equivalent"); however, it appears there is no way to specify the contrast based on nested-model symbolism. That is, I was unable to find a way to symbolically specify anything about -size|round- using that notation in either -test- or -lincom-. Alternatively, after the nested model (or the crossed model), one can specify:

    test _b[round[1]] = _b[round[3]] + _b[size[1]*round[3]] / 2

Or

    mat test13 = (0, 1, 0, -1, 0, 0, -.5, 0, 0, 0, 0, 0)
    test, test(test13)

And the test will be performed. The first of these clearly resorts to crossed-model notation (even in the nested setup!). My question: Is there a way to directly specify a nested term in -test- or -lincom- using nested notation (i.e., of the form: a|b)? If there is, I haven't found it.

--------------------------------------------------------------------------------

JC: You're correct about what I meant by equivalence. I'm not sure, though, that I follow what you're saying about the contrast resorting to crossed-factors notation: _b[size[1]*round[3]] _is_ a nested-factor term. Nested factors are nothing special: -anova nnn round size|round- is essentially -anova nnn round size*round-, that is, an interaction term without including the nested factor among the main effects.
And there's nothing really different between nested factors and crossed factors in using -test- or -lincom-: you can't, to my knowledge, use either a|b or a*b notation, per se, for testing simple effects or constructing custom contrasts of individual cell means.

--------------------------------------------------------------------------------

TJS: Clearly, in this model, SAS and JMP test a contrast on -round- of the form (1, 0, -1, 0, 0) differently than Stata does. That implies that the packages assume different things to make it testable. In Stata notation, SAS and JMP test contrast matrix

    (0, 1,0,-1,0,0, -.5,0,0,0,0,0)

while Stata tests

    (0, 1,0,-1,0,0, 0,0,0,0,0,0)

(i.e., exactly what I specified!). It also implies that the interpretation of those tests depends on those assumptions. I don't know what to say about that other than to be cautious!

--------------------------------------------------------------------------------

JC: Again, I don't know what you did for the contrast in SAS, but the difference might be just a syntax difference in the way that Stata and SAS have you specify the same contrast. Overall, Stata seems to me at least as easy as SAS for forming custom contrasts after ANOVA or other estimation commands: in the SAS program below, although the ESTIMATE statement itself is simple enough in appearance, the documentation for it isn't. In addition, looking at the SAS output, it appears that SAS translated the ESTIMATE statement into "twice ROUND 1 versus (the equivalent of) twice ROUND 3", which seems odd and roundabout. (It's too voluminous to post here, but I can e-mail the SAS output for the code below privately.)

I agree about having to be cautious. For example, apparently, you _must_ use the nested-factor specification in order for SAS to fit an ANOVA model properly with your dataset.
If you fail to do this, you get junk: the ANOVA table from the first PROC GLM below has 3 DF for ROUND + 1 DF for SIZE, which don't add up to the 5 DF stated for the Model.* I don't know what ROUNDs and SIZEs are, but from looking at the dataset they don't strike me as being naturally hierarchical or in a nested relationship (strictly one within the other), at least in the manner that Winer liked to illustrate,** and so I wouldn't have considered specifying SIZE(ROUND) off the top of my head. I'm guessing that SAS's so-called TYPE IV sums of squares would be no easier, at least for me, to work with here.

In general, I've found that setting up sensible contrasts of interest after fitting a cell-means model (Milliken & Johnson, cited last time) in Stata is relatively straightforward under these kinds of circumstances.

Joseph Coveney

* Is it common for SAS to do stuff like this? I don't recall ever having seen Stata blithely produce an otherwise normal-looking ANOVA table except that the degrees of freedom don't add up. Has anyone run across an example?

** B. J. Winer, D. R. Brown & K. M. Michels, _Statistical Principles in Experimental Design_, Third Edition (New York: McGraw-Hill, 1991), pp. 358-65; 456-60; 502-4.
DATA TJS (DROP = NNN_ADJM);
    INPUT ROUND SIZE NNN NNN_ADJM;
CARDS;
[dataset snipped--copy & paste from original post]
;
RUN;

PROC PRINT DATA = TJS;
RUN;

PROC GLM DATA = TJS;
    CLASS ROUND SIZE;
    MODEL NNN = ROUND SIZE / E3 SS3 SOLUTION;
RUN;

PROC GLM DATA = TJS;
    CLASS ROUND SIZE;
    MODEL NNN = ROUND SIZE(ROUND) / E3 SS3 SOLUTION;
    ESTIMATE 'ROUND 1 VERSUS ROUND 3' ROUND 1 0 -1 / E;
RUN;


clear *
set more off
input round size nnn nnn_adjm
[dataset snipped--see original post]
end
drop nnn_adjm

// Nonnested-factor specification
anova nnn round size, class(round size)
test round size, symbolic
anova , regress
lincom _b[round[1]] - _b[round[3]] - _b[size[1]] / 2

// Nested-factor specification
anova nnn round size|round, class(round size)
test round size|round, symbolic
anova , regress
lincom _b[round[1]] - _b[round[3]] - _b[size[1]*round[3]] / 2

// Alternative syntax for nested-factor specification
anova nnn round size*round, class(round size)
test round size*round, symbolic
anova , regress
lincom _b[round[1]] - _b[round[3]] - _b[size[1]*round[3]] / 2

// Cell means model
generate cell = round * 1000 + size
anova nnn cell, category(cell)
test cell, symbolic
anova , regress detail
lincom _b[cell[1]] - ( _b[cell[3]] + _b[cell[4]] ) / 2

exit
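
As a rough illustration of what the two contrasts say in cell-means terms, here is a minimal Python sketch. The cell means are made up for illustration and are not the values from the original dataset; the names `cell_means`, `contrast`, `averaged` and `single` are hypothetical, and the sketch assumes that cell[4] is the Round 3 / Size 800 cell.

```python
# Hypothetical cell means, invented purely for illustration
# (NOT the values from the dataset in the original post).
# Keys follow the cell-means model's cell[1], cell[3], cell[4].
cell_means = {"cell1": 10.0, "cell3": 6.0, "cell4": 8.0}


def contrast(weights, means):
    """Return the linear combination sum(w_k * mean_k) over the named cells."""
    return sum(w * means[name] for name, w in weights.items())


# Round 1 versus the average of the two Round 3 cells; mirrors
#   lincom _b[cell[1]] - ( _b[cell[3]] + _b[cell[4]] ) / 2
averaged = {"cell1": 1.0, "cell3": -0.5, "cell4": -0.5}

# Round 1 cell versus a single Round 3 cell
# (assumed here to be cell[4], the Round 3 / Size 800 cell).
single = {"cell1": 1.0, "cell4": -1.0}

# Both sets of weights sum to zero, so each is a proper contrast.
assert sum(averaged.values()) == 0.0
assert sum(single.values()) == 0.0

print("Round 1 vs. average of Round 3 cells:", contrast(averaged, cell_means))
print("Round 1 cell vs. one Round 3 cell:   ", contrast(single, cell_means))
```

With these made-up means the two contrasts give different numbers, which is the point: they test different hypotheses. Because each set of weights sums to zero, the value of the contrast does not depend on which cell the package drops as its reference level.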

