Steven Samuels <samplerx@earthlink.net>

statalist@hsphsun2.harvard.edu

st: RE: Pearson chi square and Rao and Scott correction validity

Fri, 7 Nov 2008 07:03:54 -0500

I see that in my previous post I confused two issues: (1) the sample size requirements for validity of the survey-adjusted chi-square tests in Stata; and (2) the sample size requirements for estimates of cell totals or proportions with small counts. Ángel asked about the first issue. Bottom line: I really don't have an answer.

-Steve

On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:

> I've looked through Chapters 6-7 of Chambers and Skinner's book
> Analysis of Survey Data, Wiley, 2003, but I have no definitive
> answer. I do have some thoughts:
>
> * "Expected" count is not a guide in the survey setting--it is a
>   sum of weights of sample observations in the table cell.
>
> * The accuracy of the second-order Rao-Scott chi-square statistic,
>   probably the best test in -svy: tab-, is apt to depend on the
>   number of clusters, on the crude counts, and on the distribution of
>   the observations across clusters. The rule of thumb of 5
>   observations (or 1) in a cell is based on theory for i.i.d.
>   observations that does not hold in the complex survey setting.
>
> * With a small number of events, I ordinarily display only
>   unweighted numbers and do not report weighted estimates or
>   confidence intervals. When I have wanted to infer something about a
>   proportion based on a small outcome count, I've resorted to the
>   methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health
>   Surveys, Wiley.
>
> A quick Google search turned up one survey which would not report a
> cell with fewer than 25 observations
> (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16)
> and another in which the minimum cell size was 4,000!
> (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html).
>
> So a guess for Ángel is that not even five observations in a table
> cell is enough.
>
> -Steve
>
> On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:
>
>> There is no need to invoke belief! My -tabchi- and -tabchii-
>> (programs) from the -tab_chi- package on SSC do indeed give
>> warnings. (There is no Stata program called tab-chi.)
>>
>> But these old warnings are very conservative. Many writers now
>> advise that chi-square works fine so long as all expected
>> frequencies are above about 1. In any case, the point can be
>> explored by simulations or bootstrapping. Often it is better to
>> use Fisher's exact test.
>>
>> I can't advise on the main issue, which is for svy-savvy people,
>> but in general very low expected frequencies could be problematic
>> for any method.
>>
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Ángel Rodríguez Laso
>>
>> I've been reviewing the manuals and statalist archives and I've
>> confirmed that Stata does not give any automatic warning message when
>> requirements for a valid chi-square test are not met (i.e. no more
>> than 20% of the expected values in a table are less than 5 and none
>> are less than 1), which I think is a nuisance. I suppose this can
>> only be worked out by writing the option 'expected' after tabulate and
>> checking oneself if the requirements are met. I believe Cox's tab-chi
>> package does give a warning when requirements are not met.
>>
>> I wonder also if the Rao and Scott correction of Pearson chi-square
>> that is recommended for survey designs needs the same requirements.
>> The problem then would be that -svy: tab- doesn't support the
>> 'expected' option, nor is tab-chi suitable for survey analysis.
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/

Steven Samuels
845-246-0774
18 Cantine's Island
Saugerties, NY 12477
EFax: 208-498-7441
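
For anyone wanting to try the checks mentioned in this thread, here is a minimal sketch, not a recommendation. It assumes the auto data shipped with Stata and the nhanes2b example data downloadable with -webuse-; neither dataset nor the chosen variables come from the posts above. The first part shows expected frequencies and Fisher's exact test for an ordinary table; the second shows unweighted cell counts next to the survey-adjusted Pearson test, since -svy: tabulate- has no 'expected' option.

```stata
* Ordinary (non-survey) table: show expected frequencies so the
* "all expected counts above about 1 (or 5)" rule can be checked by eye,
* and request Fisher's exact test as an alternative to the chi-square.
sysuse auto, clear
tabulate rep78 foreign, chi2 expected exact

* Survey table: the obs option displays the unweighted number of
* sample observations in each cell. The pearson option reports the
* Pearson statistic with the Rao and Scott second-order correction
* (shown as the design-based F).
* The dataset and design variables below are illustrative assumptions.
webuse nhanes2b, clear
svyset psuid [pweight=finalwgt], strata(stratid)
svy: tabulate race diabetes, obs pearson
```

The unweighted counts from obs say nothing by themselves about how the observations are spread across clusters, which, as noted above, is also likely to matter for the accuracy of the corrected test.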

