
From: "Ángel Rodríguez Laso" <angelrlaso@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: RE: Pearson chi square and Rao and Scott correction validity

Date: Fri, 7 Nov 2008 14:35:30 +0100

Reading back Korn and Graubard's (1999) Analysis of Health Surveys,
Wiley: on page 78, to test for lack of association in a 2 x J
contingency table, they recommend a logistic regression with the
presence/absence of the condition as the dependent variable and, for
I x J contingency tables, a multinomial logit regression. I find this
cumbersome. They consider chi-square statistics inappropriate in
complex surveys, but they do not discuss the Rao and Scott correction.
I will try to contact Korn or Graubard to see if they can shed more
light.

Many thanks,

Angel Rodriguez-Laso

2008/11/7 Steven Samuels <samplerx@earthlink.net>:
> I see that in my previous post I confused two issues: 1) the sample
> size requirements for validity of the survey-adjusted chi-square
> tests in Stata; 2) sample size requirements for estimates of cell
> totals or proportions, with small counts. Ángel asked about the
> first issue. Bottom line: I really don't have an answer.
>
> -Steve
>
> On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:
>
>> I've looked through Chapters 6-7 of Chambers and Skinner's book
>> Analysis of Survey Data, Wiley, 2003, but I have no definitive
>> answer. I do have some thoughts:
>>
>> * "Expected" count is not a guide in the survey setting--it is a
>>   sum of weights of sample observations in the table cell.
>>
>> * The accuracy of the second-order Rao-Scott chi-square statistic,
>>   probably the best test in -svy: tab-, is apt to depend on the
>>   number of clusters, on the crude counts, and on the distribution
>>   of the observations across clusters. The rule of thumb of 5
>>   observations (or 1) in a cell is based on theory of i.i.d.
>>   observations that does not hold in the complex survey setting.
>>
>> * With a small number of events, I ordinarily display only
>>   unweighted numbers and do not report weighted estimates or
>>   confidence intervals. When I have wanted to infer something about
>>   a proportion based on a small outcome count, I've resorted to the
>>   methods on pp. 64-68 of Korn and Graubard (1999) Analysis of
>>   Health Surveys, Wiley.
>>
>> A quick Google search turned up one survey which would not report a
>> cell with fewer than 25 observations
>> (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16)
>> and another in which the minimum cell size was 4,000!
>> (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html).
>>
>> So a guess for Ángel is that not even five observations in a table
>> cell is enough.
>>
>> -Steve
>>
>> On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:
>>
>>> There is no need to invoke belief! My -tabchi- and -tabchii-
>>> (programs) from the -tab_chi- package on SSC do indeed give
>>> warnings. (There is no Stata program called tab-chi.)
>>>
>>> But these old warnings are very conservative. Many writers now
>>> advise that chi-square works fine so long as all expected
>>> frequencies are above about 1. In any case, the point can be
>>> explored by simulations or bootstrapping. Often it is better to
>>> use Fisher's exact test.
>>>
>>> I can't advise on the main issue, which is for svy-savvy people,
>>> but in general very low expected frequencies could be problematic
>>> for any method.
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Ángel Rodríguez Laso
>>>
>>> I've been reviewing the manuals and Statalist archives and I've
>>> confirmed that Stata does not give any automatic warning message
>>> when the requirements for a valid chi-square test are not met
>>> (i.e. no more than 20% of the expected values in a table are less
>>> than 5 and none are less than 1), which I think is a nuisance. I
>>> suppose this can only be worked out by specifying the option
>>> 'expected' with tabulate and checking for oneself whether the
>>> requirements are met. I believe Cox's tab-chi package does give a
>>> warning when requirements are not met.
>>>
>>> I also wonder whether the Rao and Scott correction of the Pearson
>>> chi-square that is recommended for survey designs needs the same
>>> requirements. The problem then would be that -svy: tab- doesn't
>>> support the 'expected' option, nor is tab-chi suitable for survey
>>> analysis.
>
> Steven Samuels

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
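
The rule of thumb quoted in the thread (no more than 20% of expected counts below 5, and none below 1) can be checked by hand from the table margins. The following is only a minimal sketch in Python, not Stata syntax and not what -tabchi- does internally. It assumes an ordinary unweighted table with no empty rows or columns, so, as Steve notes, it says nothing about the survey-weighted case. The function names, thresholds as parameters, and example tables are hypothetical.

```python
from typing import List, Tuple

Table = List[List[float]]


def expected_counts(table: Table) -> Table:
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table: Table) -> float:
    """Unadjusted Pearson statistic: sum of (observed - expected)^2 / expected.

    Assumes every row and column total is positive (no zero expected counts).
    """
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )


def check_expected_rule(
    table: Table,
    low: float = 5.0,        # "less than 5" threshold from the thread
    floor: float = 1.0,      # "none less than 1" threshold from the thread
    max_share: float = 0.20  # "no more than 20%" of cells may fall below `low`
) -> Tuple[bool, float, float]:
    """Return (rule satisfied?, share of cells with expected < low, smallest expected)."""
    cells = [e for row in expected_counts(table) for e in row]
    share_low = sum(e < low for e in cells) / len(cells)
    smallest = min(cells)
    return (share_low <= max_share and smallest >= floor), share_low, smallest


if __name__ == "__main__":
    sparse = [[12, 3], [20, 1]]  # hypothetical unweighted counts
    ok, share_low, smallest = check_expected_rule(sparse)
    print("rule satisfied:", ok)
    print("share of cells with expected < 5:", share_low)
    print("smallest expected count:", round(smallest, 3))
    print("Pearson chi-square:", round(pearson_chi2(sparse), 3))
```

When the check fails on unweighted data, Nick's suggestions (Fisher's exact test, simulation, or bootstrapping) are the alternatives raised in the thread; for survey data the question of an equivalent check remains open here.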

**Follow-Ups**: **Re: st: RE: Pearson chi square and Rao and Scott correction validity** *From:* Steven Samuels <sjhsamuels@earthlink.net>

**References**: **st: RE: Pearson chi square and Rao and Scott correction validity** *From:* Steven Samuels <samplerx@earthlink.net>

