Statalist



Re: st: RE: Pearson chi square and Rao and Scott correction validity


From   "Ángel Rodríguez Laso" <angelrlaso@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Pearson chi square and Rao and Scott correction validity
Date   Fri, 7 Nov 2008 14:35:30 +0100

Going back to Korn and Graubard's (1999) Analysis of Health Surveys,
Wiley, I see that on page 78 they recommend testing lack of association
in a 2 x J contingency table with a logistic regression that has the
presence/absence of the condition as the dependent variable and, for
I x J contingency tables, with a multinomial logit regression. I find
this cumbersome. They consider chi-square statistics inappropriate in
complex surveys, but they do not discuss the Rao and Scott
correction.

I will try to contact Korn or Graubard to see if they can shed more light.

Many thanks,

Angel Rodriguez-Laso


2008/11/7 Steven Samuels <samplerx@earthlink.net>:
> I see that in my previous post I confused two issues: 1) the sample
> size requirements for validity of the survey-adjusted chi square
> tests in Stata; 2) sample size requirements for estimates of cell
> totals or proportions, with small counts.  Ángel asked about the
> first issue. Bottom line: I really don't have an answer.
>
> -Steve
>
> On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:
>
>>
>> I've looked through Chapters 6-7 of Chambers and Skinner's book
>> Analysis of Survey Data, Wiley, 2003, but I have no definitive
>> answer. I do have some thoughts:
>>
>> * "Expected" count is not a guide in the survey setting--it is a
>> sum of weights of sample observations in the table cell.
>>
>> * The accuracy of the second-order Rao-Scott chi-square statistic,
>> probably the best test in -svy: tab-, is apt to depend on the
>> number of clusters, on the crude counts, and on the distribution of
>> the observations across clusters. The rule of thumb of 5
>> observations (or 1) in a cell is based on the theory of i.i.d.
>> observations, which does not hold in the complex survey setting.
>>
>> * With a small number of events, I ordinarily display only
>> unweighted numbers and do not report weighted estimates or
>> confidence intervals. When I have wanted to infer something about a
>> proportion based on a small outcome count, I've resorted to the
>> methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health
>> Surveys, Wiley.
>>
>> A quick Google search turned up one survey which would not report a
>> cell with fewer than 25 observations
>> (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16)
>> and another in which the minimum cell size was 4,000!
>> (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html).
>>
>> So a guess for Ángel is that not even five observations in a table
>> cell are enough.
>>
>> -Steve
>>
>> On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:
>>
>>> There is no need to invoke belief! My -tabchi- and -tabchii-
>>> (programs) from the -tab_chi- package on SSC do indeed give
>>> warnings. (There is no Stata program called tab-chi.)
>>>
>>> But these old warnings are very conservative. Many writers now
>>> advise that chi-square works fine so long as all expected
>>> frequencies are above about 1. In any case, the point can be
>>> explored by simulations or bootstrapping. Often it is better to
>>> use Fisher's exact test.
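
One way to explore the point by simulation, as suggested above, is
sketched here in Python rather than Stata, purely for illustration. It
assumes simple random (i.i.d.) sampling for a 2 x 2 table under
independence, so it says nothing about complex survey designs. The
function names, sample size, and probabilities are hypothetical choices,
not anything taken from the thread.

```python
import random

# 95th percentile of the chi-square distribution with 1 degree of freedom
CRITICAL_1DF_05 = 3.841458820694124


def pearson_chi2(table):
    """Pearson chi-square for a two-way table; None if any margin is zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return None  # statistic is undefined when an expected count is 0
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def rejection_rate(n, p_row, p_col, reps=5000, seed=12345):
    """Share of simulated independent 2 x 2 tables rejected at the 5% level.

    Assumption: n i.i.d. observations, row and column drawn independently.
    Tables with an empty margin are dropped from the denominator.
    """
    rng = random.Random(seed)
    rejections = 0
    valid = 0
    for _ in range(reps):
        table = [[0, 0], [0, 0]]
        for _ in range(n):
            i = 0 if rng.random() < p_row else 1
            j = 0 if rng.random() < p_col else 1
            table[i][j] += 1
        stat = pearson_chi2(table)
        if stat is None:
            continue
        valid += 1
        if stat > CRITICAL_1DF_05:
            rejections += 1
    return rejections / valid if valid else float("nan")


# Smallest expected count here is about 40 * 0.5 * 0.1 = 2
print(rejection_rate(n=40, p_row=0.5, p_col=0.1))
```

If the printed rate is close to 0.05, the nominal level holds up
reasonably well for that configuration; repeating the run with smaller
n or more extreme probabilities shows where it starts to drift.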
>>>
>>> I can't advise on the main issue, which is for svy-savvy people,
>>> but in general very low expected frequencies could be problematic
>>> for any method.
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Ángel Rodríguez Laso wrote:
>>>
>>>
>>> I've been reviewing the manuals and statalist archives and I've
>>> confirmed that Stata does not give any automatic warning message when
>>> the requirements for a valid chi-square test (i.e. no more than 20%
>>> of the expected values in a table are less than 5 and none are less
>>> than 1) are not met, which I think is a nuisance. I suppose this can
>>> only be checked by specifying the 'expected' option with tabulate and
>>> verifying for oneself whether the requirements are met. I believe
>>> Cox's tab-chi package does give a warning when requirements are not met.
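
The rule quoted in parentheses above can be checked by hand from the
expected counts. A minimal sketch in Python (not Stata), for an ordinary
unweighted table only; the function names and the example table are made
up for illustration, and nothing here applies to the survey-weighted case:

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def check_expected_rule(table, low=5.0, floor=1.0, max_share=0.20):
    """Apply the rule: at most 20% of expected counts < 5 and none < 1.

    Returns (rule_met, share_of_cells_below_low, smallest_expected_count).
    """
    cells = [e for row in expected_counts(table) for e in row]
    share_below_low = sum(e < low for e in cells) / len(cells)
    smallest = min(cells)
    rule_met = share_below_low <= max_share and smallest >= floor
    return rule_met, share_below_low, smallest


# Hypothetical 2 x 2 table of observed counts
observed = [[12, 3], [20, 1]]
rule_met, share, smallest = check_expected_rule(observed)
print(rule_met, round(share, 2), round(smallest, 2))
```

For this example table, two of the four expected counts fall below 5,
so the rule is not met even though no expected count is below 1.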
>>>
>>> I also wonder whether the Rao and Scott correction of the Pearson
>>> chi-square that is recommended for survey designs is subject to the
>>> same requirements. The problem then would be that -svy:tab- doesn't
>>> support the 'expected' option, nor is tab-chi suitable for survey
>>> analysis.
>>>
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
> Steven Samuels
> 845-246-0774
> 18 Cantine's Island
> Saugerties, NY 12477
> EFax: 208-498-7441
>



