
From: "K Jensen" <k.x.jensen@googlemail.com>
To: statalist <statalist@hsphsun2.harvard.edu>
Subject: st: Non-standard categorical data test - help!
Date: Wed, 10 Dec 2008 12:09:00 +0000

I have data from five different populations that look like this:

     +--------------------+
     | pop   a   b     n  |
     |--------------------|
  1. |   1   0   0   325  |
  2. |   1   0   1    77  |
  3. |   1   1   0    59  |
  4. |   1   1   1     9  |
  5. |   2   0   0   788  |
     |--------------------|
  6. |   2   0   1   262  |
  7. |   2   1   0    99  |
  8. |   2   1   1    28  |
  9. |   3   0   0   270  |
 10. |   3   0   1    91  |
     |--------------------|
 11. |   3   1   0    40  |
 12. |   3   1   1     6  |
 13. |   4   0   0   311  |
 14. |   4   0   1    84  |
 15. |   4   1   0    35  |
     |--------------------|
 16. |   4   1   1     9  |
 17. |   5   0   0   281  |
 18. |   5   0   1    85  |
 19. |   5   1   0    28  |
 20. |   5   1   1     5  |
     +--------------------+

where each population has counts of the number of observations (n) in each of the four categories created by the possible values of two factors, A and B.

I would like to test the a priori hypothesis that there are fewer observations with both A=1 and B=1 than would be expected if A and B were independent, i.e.

    H0: p(A=1,B=1)_i = p(A=1)_i * p(B=1)_i   for all i, i = 1, ..., 5

versus

    H1: p(A=1,B=1)_i < p(A=1)_i * p(B=1)_i

and to get a single p-value.

I don't think you can aggregate this into one big chi-square test with the observations:

              A=0    A=1
    B=0      1975    261
    B=1       599     57

because there is no physical reason to expect p(A) and p(B) to be the same for all i, and indeed it looks as though they are different. Also, is that a more general test of non-independence, rather than one looking specifically at a directional departure at A=1 and B=1?

I tried doing this as what I have seen described as a "Replicated G-test of independence" (I had to do this in Excel; if anyone knows how to do it in Stata, that would be really useful for future projects) and got the following results:

    Pop                  G    df        p
    1                 1.46     1   0.2271
    2                 0.53     1   0.4681
    3                 3.74     1   0.0533
    4                 0.02     1   0.9002
    5                 1.23     1   0.2679
    Total G           6.96     5   0.2233
    Pooled G          4.84     1   0.0278
    Heterogeneity G   2.12     4   0.0000

i.e. the heterogeneity G suggests that you can't pool the results like this.

Is there any way in Stata to test my hypothesis and get a single p-value?
Thank you,
Karin
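The replicated G-test described in the message can be cross-checked outside Excel. What follows is a minimal sketch in Python (standard library only), not Stata, and not necessarily the calculation the poster ran. It assumes the usual decomposition: one G statistic per population, a total G (their sum), a pooled G on the summed 2x2 table, and heterogeneity G = total G - pooled G. The names `COUNTS`, `chi2_sf`, `g_stat` and `expected_11` are hypothetical helpers introduced here for illustration.

```python
from math import erfc, exp, gamma, log, sqrt

# Counts from the post, per population, ordered as
# (a=0,b=0), (a=0,b=1), (a=1,b=0), (a=1,b=1).
COUNTS = {
    1: (325, 77, 59, 9),
    2: (788, 262, 99, 28),
    3: (270, 91, 40, 6),
    4: (311, 84, 35, 9),
    5: (281, 85, 28, 5),
}


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    y = x / 2.0
    # Start from a closed form (df = 2 or df = 1), then step up using
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, exp(-y)
    else:
        a, q = 0.5, erfc(sqrt(y))
    while a < df / 2.0:
        q += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q


def g_stat(n00, n01, n10, n11):
    """Likelihood-ratio (G) statistic for independence in a 2x2 table."""
    n = n00 + n01 + n10 + n11
    rows = (n00 + n01, n10 + n11)
    cols = (n00 + n10, n01 + n11)
    obs = ((n00, n01), (n10, n11))
    total = 0.0
    for i in range(2):
        for j in range(2):
            if obs[i][j] > 0:  # a zero cell contributes nothing
                total += obs[i][j] * log(obs[i][j] * n / (rows[i] * cols[j]))
    return 2.0 * total


def expected_11(n00, n01, n10, n11):
    """Expected count in the A=1, B=1 cell if A and B were independent."""
    n = n00 + n01 + n10 + n11
    return (n10 + n11) * (n01 + n11) / n


per_pop_g = {pop: g_stat(*c) for pop, c in COUNTS.items()}
total_g = sum(per_pop_g.values())
pooled = tuple(sum(c[k] for c in COUNTS.values()) for k in range(4))
pooled_g = g_stat(*pooled)
het_g = total_g - pooled_g
het_p = chi2_sf(het_g, len(COUNTS) - 1)
cell11 = {pop: (c[3], expected_11(*c)) for pop, c in COUNTS.items()}

for pop, g in per_pop_g.items():
    o, e = cell11[pop]
    print(f"pop {pop}: G = {g:.2f}, p = {chi2_sf(g, 1):.4f}, "
          f"A=1,B=1 observed {o} vs expected {e:.2f}")
print(f"total G = {total_g:.2f}, df = {len(COUNTS)}, "
      f"p = {chi2_sf(total_g, len(COUNTS)):.4f}")
print(f"pooled G = {pooled_g:.2f}, df = 1, p = {chi2_sf(pooled_g, 1):.4f}")
print(f"heterogeneity G = {het_g:.2f}, df = {len(COUNTS) - 1}, p = {het_p:.4f}")
```

Under these assumptions the G statistics agree with the figures quoted in the message to within rounding. The upper-tail probability for a heterogeneity G of about 2.12 on 4 degrees of freedom, however, comes out near 0.71 rather than 0.0000, so that one entry may be worth rechecking before concluding that the populations cannot be pooled. The sketch also shows the observed A=1, B=1 count falling below its independence expectation in all five populations. Note that the G statistics themselves are two-sided and do not encode the direction stated in H1.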

Follow-Ups: Re: st: Non-standard categorical data test - help! (From: Ronan Conroy <rconroy@rcsi.ie>)

