
Re: st: RE: Pearson chi square and Rao and Scott correction validity

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: RE: Pearson chi square and Rao and Scott correction validity
Date	Thu, 6 Nov 2008 17:04:00 -0500

I've looked through Chapters 6-7 of Chambers and Skinner's book Analysis of Survey Data, Wiley, 2003, but I have no definitive answer. I do have some thoughts:

* "Expected" count is not a guide in the survey setting--it is a sum of weights of sample observations in the table cell.

* The accuracy of the second-order Rao-Scott chi-square statistic, probably the best test in -svy: tab-, is apt to depend on the number of clusters, on the crude counts, and on the distribution of the observations across clusters. The rule of thumb of 5 observations (or 1) in a cell is based on theory for i.i.d. observations that does not hold in the complex survey setting.

* With a small number of events, I ordinarily display only unweighted numbers and do not report weighted estimates or confidence intervals. When I have wanted to infer something about a proportion based on a small outcome count, I've resorted to the methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health Surveys, Wiley.

A quick Google search turned up one survey that would not report a cell with fewer than 25 observations (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16) and another in which the minimum cell size was 4,000! (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html).

So my guess for Ángel is that not even five observations in a table cell are enough.


-Steve

On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:

There is no need to invoke belief! My -tabchi- and -tabchii- programs from the -tab_chi- package on SSC do indeed give warnings. (There is no Stata program called tab-chi.)
But these old warnings are very conservative. Many writers now advise that chi-square works fine so long as all expected frequencies are above about 1. In any case, the point can be explored by simulations or bootstrapping. Often it is better to use Fisher's exact test.
I can't advise on the main issue, which is for svy-savvy people, but in general very low expected frequencies could be problematic for any method.
Nick
[email protected]
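
Nick's suggestion to explore the point by simulation can be sketched as follows. This is a minimal illustration in Python, not anything posted in the thread: it draws 2x2 tables of i.i.d. observations under independence and reports how often the Pearson statistic exceeds the nominal 5% critical value. The function names, sample size, and probabilities are assumptions chosen for illustration, and the sketch says nothing about clustered or weighted survey data.

```python
import random

CRITICAL_5PCT_DF1 = 3.841  # upper 5% point of chi-square with 1 df


def pearson_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; None if any margin is zero."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if 0 in (r1, r2, c1, c2):
        return None  # statistic is undefined when a row or column is empty
    return n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)


def rejection_rate(n, p_row, p_col, reps=2000, seed=12345):
    """Share of simulated independent 2x2 tables rejected at the nominal 5% level.

    Hypothetical helper: n i.i.d. observations per table, with row and
    column membership drawn independently with probabilities p_row and p_col.
    Tables with an empty margin are skipped.
    """
    rng = random.Random(seed)
    rejected = usable = 0
    for _ in range(reps):
        cells = [0, 0, 0, 0]
        for _ in range(n):
            i = rng.random() < p_row  # row indicator
            j = rng.random() < p_col  # column indicator
            cells[2 * i + j] += 1
        stat = pearson_2x2(*cells)
        if stat is None:
            continue
        usable += 1
        rejected += stat > CRITICAL_5PCT_DF1
    return rejected / usable if usable else float("nan")


if __name__ == "__main__":
    # Smallest expected count here is 20 * 0.1 * 0.5 = 1.
    print(rejection_rate(n=20, p_row=0.1, p_col=0.5))
```

Comparing the printed rate with 0.05 for several choices of n and p_row gives a rough feel for how far the nominal level can be trusted at a given expected count, for simple random samples only.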

Ángel Rodríguez Laso


I've been reviewing the manuals and statalist archives and I've
confirmed that Stata does not give any automatic warning message when
requirements for a valid chi-square test are not met (i.e. no more
than 20% of the expected values in a table are less than 5 and none
are less than 1), which I think is a nuisance. I suppose this can
only be worked out by specifying the 'expected' option after tabulate and
checking for oneself whether the requirements are met. I believe Cox's tab-chi
package does give a warning when requirements are not met.
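
The manual check described above amounts to computing each expected count as (row total x column total) / n and counting how many fall below the thresholds. A minimal sketch in Python, assuming a table of unweighted counts from a simple random sample; the function names are hypothetical, and the check does not carry over to weighted survey tables:

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_rule_of_thumb(table):
    """True if no expected count is below 1 and at most 20% are below 5."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_5 <= 0.20


if __name__ == "__main__":
    sparse = [[1, 9], [2, 38]]  # smallest expected count is 10 * 3 / 50 = 0.6
    print(expected_counts(sparse))
    print(meets_rule_of_thumb(sparse))
```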

I wonder also if the Rao and Scott correction of Pearson chi-square
that is recommended for survey designs needs the same requirements.
The problem then would be that -svy:tab- doesn't support the
'expected' option, nor is tab-chi suitable for survey analysis.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




