
st: RE: Pearson chi square and Rao and Scott correction validity

From   Steven Samuels <>
Subject   st: RE: Pearson chi square and Rao and Scott correction validity
Date   Fri, 7 Nov 2008 07:03:54 -0500

I see that in my previous post I confused two issues: 1) the sample
size requirements for validity of the survey-adjusted chi-square
tests in Stata; and 2) the sample size requirements for estimates of
cell totals or proportions with small counts.  Ángel asked about the
first issue. Bottom line: I really don't have an answer.


On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:

> I've looked through Chapters 6-7 of Chambers and Skinner's book
> Analysis of Survey Data, Wiley, 2003, but I have no definitive
> answer. I do have some thoughts:
> * "Expected" count is not a guide in the survey setting--it is a  
> sum of weights of sample observations in the table cell.
> * The accuracy of the second-order Rao-Scott chi-square statistic,
> probably the best test in -svy: tab-, is apt to depend on the
> number of clusters, on the crude counts, and on the distribution of
> the observations across clusters. The rule of thumb of 5
> observations (or 1) in a cell is based on the theory of i.i.d.
> observations, which does not hold in the complex survey setting.
> * With a small number of events, I ordinarily display only
> unweighted numbers and do not report weighted estimates or
> confidence intervals. When I have wanted to infer something about a
> proportion based on a small outcome count, I've resorted to the
> methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health
> Surveys, Wiley.
> A quick Google search turned up one survey which would not report a  
> cell with fewer than 25 observations ( 
> showsrvy.cfm?srvy_CatID=5&srvy_Seri=16) and another in which the  
> minimum cell size was 4,000! ( 
> cdic-mcc/17-3/a_e.html).
> So a guess for Ángel is that not even five observations in a table
> cell are enough.
> -Steve
> On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:
>> There is no need to invoke belief! My -tabchi- and -tabchii-  
>> (programs) from the -tab_chi- package on SSC do indeed give  
>> warnings. (There is no Stata program called tab-chi.)
>> But these old warnings are very conservative. Many writers now  
>> advise that chi-square works fine so long as all expected  
>> frequencies are above about 1. In any case, the point can be  
>> explored by simulations or bootstrapping. Often it is better to  
>> use Fisher's exact test.
>> I can't advise on the main issue, which is for svy-savvy people,  
>> but in general very low expected frequencies could be problematic  
>> for any method.
>> Nick
>> Ángel Rodríguez Laso
>> I've been reviewing the manuals and statalist archives and I've
>> confirmed that Stata does not give any automatic warning message when
>> requirements for a valid chi-square test are not met (i.e. no more
>> than 20% of the expected values in a table are less than 5 and none
>> are less than 1), which I think is a nuisance. I suppose this can
>> only be worked out by specifying the option 'expected' after tabulate
>> and checking oneself if the requirements are met. I believe Cox's tab-chi
>> package does give a warning when requirements are not met.
>> I also wonder if the Rao and Scott correction of the Pearson chi-square
>> that is recommended for survey designs has the same requirements.
>> The problem then would be that -svy:tab- doesn't support the
>> 'expected' option, nor is tab-chi suitable for survey analysis.
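
The rule of thumb Ángel quotes (no more than 20% of expected values below 5, none below 1) can be checked by hand from a table of counts. The following is a minimal sketch in plain Python, not Stata code; the function names `expected_counts` and `check_expected_rule` are made up for illustration. It assumes a simple random sample with unweighted counts, so, as Steven notes in his reply, it should not be read as a validity check for the Rao-Scott corrected test in a complex survey.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_expected_rule(table):
    """Apply the classical rule of thumb for the Pearson chi-square test.

    Rule (i.i.d. setting only): no more than 20% of expected counts
    are below 5, and no expected count is below 1.
    """
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    min_expected = min(cells)
    return {
        "min_expected": min_expected,
        "share_below_5": share_below_5,
        "rule_met": share_below_5 <= 0.20 and min_expected >= 1,
    }


# Hypothetical 2x2 table of unweighted counts (illustrative numbers only).
result = check_expected_rule([[12, 3], [8, 2]])
print(result)
```

In this made-up table two of the four expected counts fall below 5, so the rule is not met, even though no expected count is below 1.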

Steven Samuels
18 Cantine's Island
Saugerties, NY 12477
EFax: 208-498-7441
