
From: Steven Samuels <sjhsamuels@earthlink.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Pearson chi square and Rao and Scott correction validity
Date: Fri, 7 Nov 2008 11:33:23 -0500

Ángel-

-Steve

On Nov 7, 2008, at 8:35 AM, Ángel Rodríguez Laso wrote:

Reading back Korn and Graubard's (1999) Analysis of Health Surveys, Wiley, on page 78 they recommend, for testing lack of association in a 2 x J contingency table, a logistic regression with the presence/absence of the condition as the dependent variable, and for I x J contingency tables, a multinomial logit regression. I find this cumbersome. They consider chi-square statistics inappropriate in complex surveys, but they do not talk about the Rao and Scott correction.

I will try to contact Korn or Graubard to see if they can add more light.

Many thanks,
Angel Rodriguez-Laso

2008/11/7 Steven Samuels <samplerx@earthlink.net>:

I see that in my previous post I confused two issues:

1) the sample size requirements for validity of the survey-adjusted chi-square tests in Stata;
2) the sample size requirements for estimates of cell totals or proportions with small counts.

Ángel asked about the first issue. Bottom line: I really don't have an answer.

-Steve

On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:

I've looked through Chapters 6-7 of Chambers and Skinner's book Analysis of Survey Data, Wiley, 2003, but I have no definitive answer. I do have some thoughts:

* "Expected" count is not a guide in the survey setting--it is a sum of weights of sample observations in the table cell.

* The accuracy of the second-order Rao-Scott chi-square statistic, probably the best test in -svy: tab-, is apt to depend on the number of clusters, on the crude counts, and on the distribution of the observations across clusters. The rule of thumb of 5 observations (or 1) in a cell is based on theory for i.i.d. observations that does not hold in the complex survey setting.

* With a small number of events, I ordinarily display only unweighted numbers and do not report weighted estimates or confidence intervals. When I have wanted to infer something about a proportion based on a small outcome count, I've resorted to the methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health Surveys, Wiley.

A quick Google search turned up one survey which would not report a cell with fewer than 25 observations (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16) and another in which the minimum cell size was 4,000! (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html). So a guess for Ángel is that not even five observations in a table cell is enough.

-Steve

On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:

There is no need to invoke belief! My -tabchi- and -tabchii- (programs) from the -tab_chi- package on SSC do indeed give warnings. (There is no Stata program called tab-chi.)

But these old warnings are very conservative. Many writers now advise that chi-square works fine so long as all expected frequencies are above about 1. In any case, the point can be explored by simulations or bootstrapping. Often it is better to use Fisher's exact test.

I can't advise on the main issue, which is for svy-savvy people, but in general very low expected frequencies could be problematic for any method.

Nick
n.j.cox@durham.ac.uk

Ángel Rodríguez Laso

I've been reviewing the manuals and statalist archives and I've confirmed that Stata does not give any automatic warning message when the requirements for a valid chi-square test are not met (i.e. no more than 20% of the expected values in a table are less than 5 and none are less than 1), which I think is a nuisance. I suppose this can only be worked out by writing the option 'expected' after tabulate and checking oneself whether the requirements are met. I believe Cox's tab-chi package does give a warning when the requirements are not met.

I wonder also if the Rao and Scott correction of the Pearson chi-square that is recommended for survey designs needs the same requirements. The problem then would be that -svy:tab- doesn't support the 'expected' option, and tab-chi is not suitable for survey analysis either.
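The rule of thumb quoted in the original question (no more than 20% of expected values below 5, none below 1) can be checked by hand from a table of observed counts. The following is a minimal sketch in Python rather than Stata; the function names `expected_counts` and `check_expected_rule` and the example table are hypothetical and are not part of Stata or of the -tab_chi- package. It assumes a simple random sample with unweighted counts. As noted in the thread, expected counts are not a reliable guide in the complex survey setting, so this does not validate a Rao-Scott corrected test.

```python
def expected_counts(table):
    """Expected cell counts under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def check_expected_rule(table, max_share_below_5=0.20, floor=1.0):
    """Check the rule of thumb quoted in the thread: at most 20% of
    expected counts below 5 and none below 1 (i.i.d. setting only)."""
    expected = expected_counts(table)
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    min_expected = min(cells)
    return {
        "expected": expected,
        "share_below_5": share_below_5,
        "min_expected": min_expected,
        "rule_met": share_below_5 <= max_share_below_5
        and min_expected >= floor,
    }


# Hypothetical 2 x 2 table of unweighted observed counts.
observed = [[1, 9], [2, 40]]
result = check_expected_rule(observed)
print(result["rule_met"], round(result["min_expected"], 3))
```

The thresholds are parameters, so the less conservative advice mentioned by Nick Cox (all expected frequencies above about 1) could be approximated by setting `max_share_below_5=1.0`.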
Steven Samuels
845-246-0774
18 Cantine's Island
Saugerties, NY 12477
EFax: 208-498-7441

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2019 StataCorp LLC