
Another big difference between the overparameterized and the cell
means approach is the size of the underlying design matrix (and so
of the X'X matrix).  In a cell means approach the X'X matrix is
smaller (often much smaller) and of full rank -- no columns/rows need
to be dropped.  In the overparameterized model the X'X matrix has
redundancies built in that end up getting dropped.  That is why I was
commenting on comparing the degrees of freedom with the number of
columns used in the X'X matrix for that particular example.  The
D*C*B*G|A term has 8 d.f. but used 72 columns in the X'X matrix (all
but 8 of which end up getting dropped due to collinearity with other
terms in the model).

Consider an anova with factors A (with 3 levels) and B (with 4
levels)

              |        B
              |   1   2   3   4
        ------+-----------------
            1 |   1   2   3   4
        A   2 |   5   6   7   8
            3 |   9  10  11  12

There are 12 cells in this layout.  The overparameterized model that
most people are familiar with would be run by typing (assuming y is
the dependent variable):

    . anova y A B A*B

The number of columns in X'X and the d.f. for each term would be

        term     # of cols in X'X     df
        ----------------------------------
        _cons            1             1
        A                3             2
        B                4             3
        A*B             12             6
        ----------------------------------
        total           20            12

There are 8 (= 20-12) columns/rows dropped due to collinearity.

The cell means ANOVA approach is

    . tab A B, gen(cells)
    . anova y cells, noconstant

This is just a oneway anova on the 12 cells that make up A and B.
The F tests for A, B, and A*B are not automatically provided, but can
be obtained using -test- with the -accum- option.  Individual
degree-of-freedom tests, however, are easy to think about and form.

>> With your particular case it doesn't look like you can get a
>> S|A*B term (I am assuming A is crossed with B).  You say A has 20
>> levels and B has 2 and that there are 400 animals total.  Since
>> 20*2 = 400, I guess that means you have one animal per A*B
>> combination.  So you will not be able to estimate a S|A*B term
>> separate from the A*B term.
>> Maybe you will drop the A*B term
>> (and assume that the A*B interaction is insignificant).
>
> Factor A is isogenic strain (all animals genetically the same within
> strain like twins or clones, but animals different between the 20
> strains).  Factor B is sex.  I have 10 animals per sex, both sexes per
> strain, so I should be able to get the term S|A*B, since I have 10
> animals per A*B combination.  20*2 = 40 A*B levels, I have 400 animals,
> so 10 per combination.

Oops.  In my message I said "20*2 = 400" -- duh!  You are fine -- as
you say, you have 10 animals per A*B combo -- i.e., 20*2*10 = 400.

> Factors C, D, E, F, are drug treatment, test session period, stimulus
> character 1, stimulus character 2.
>
>> I commend the idea of creating an example dataset and doing a dry
>> run of your analysis before collecting the data.  This is helpful
>> in complicated designs to help point out limitations or problems
>> you might run into.  In some cases it might set you back to
>> rethinking how you want to design your experiment.
>
> In my case I'm fairly limited in being able to obtain 10 animals per
> sex per strain.  Too expensive otherwise.  So a within subject design
> seems necessary in some fashion.  The only real concern I had,
> carry-over effects of drug level (saline<->drugA<->drugB), was not a
> problem in another paper where order was counterbalanced by animal
> and a rest period between the three drug level test days was given.
> Of course, I don't claim to know it is the best design.  But I do
> think dividing up the limited number of animals into a between group
> design will lack power.

You are probably doing very well with your design.  I was just
pointing out in general that running a proposed analysis on contrived
data can help point out unforeseen problems.  I am reminded of my job
as a graduate student providing statistical consulting for graduate
students in other scientific fields who were working on their
dissertation or thesis.
I always felt very bad telling someone that they had spent a lot of
time (and possibly money) gathering data that wouldn't answer the
research question they had posed (usually due to confounding).  If
they had popped in for a consultation (usually provided for free by
agreement between the different university departments) before
gathering their data, they would have saved themselves a lot of time
and headaches (and possibly graduated earlier).

Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
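
The column counts in the 3 x 4 example above can be checked
numerically.  What follows is only a rough sketch in Python (not
Stata), assuming one observation per cell and plain 0/1 indicator
coding for every term.  The function and variable names are made up
for illustration and are not part of any Stata command.  It builds
both design matrices and compares the number of columns with the
rank (the rank of X is the same as the rank of X'X).

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a matrix, by exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in r] for r in rows]
    n_pivots = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((i for i in range(n_pivots, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # this column adds nothing new
        m[n_pivots], m[pivot] = m[pivot], m[n_pivots]
        for i in range(n_pivots + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[n_pivots][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[n_pivots])]
        n_pivots += 1
    return n_pivots


A_LEVELS, B_LEVELS = 3, 4  # factor sizes taken from the example
cells = list(product(range(A_LEVELS), range(B_LEVELS)))  # the 12 cells


def overparameterized_row(a, b):
    """_cons, then indicators for A (3), B (4), and A*B (12)."""
    cons = [1]
    a_cols = [int(a == i) for i in range(A_LEVELS)]
    b_cols = [int(b == j) for j in range(B_LEVELS)]
    ab_cols = [int((a, b) == cell) for cell in cells]
    return cons + a_cols + b_cols + ab_cols


def cell_means_row(a, b):
    """One indicator per cell, no constant."""
    return [int((a, b) == cell) for cell in cells]


# Assumption: one observation per cell; extra replicates only repeat rows
# and so do not change the rank.
X_over = [overparameterized_row(a, b) for a, b in cells]
X_cell = [cell_means_row(a, b) for a, b in cells]

over_cols, over_rank = len(X_over[0]), rank(X_over)
cell_cols, cell_rank = len(X_cell[0]), rank(X_cell)

print("overparameterized:", over_cols, "columns, rank", over_rank,
      "->", over_cols - over_rank, "dropped")
print("cell means:       ", cell_cols, "columns, rank", cell_rank,
      "->", cell_cols - cell_rank, "dropped")
```

Under these assumptions the overparameterized matrix has 20 columns
but rank 12, so 8 columns are redundant, while the cell means matrix
has 12 columns and full rank, matching the table above.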

