Stas Kolenikov <skolenik@gmail.com>

statalist@hsphsun2.harvard.edu

Re: st: Chi2 test on weighted data

Tue, 25 Sep 2012 11:14:18 -0500

Educating the clients is a part of an applied industry statistician's
burden. Sometimes, arguably, one of the most difficult parts: you can
do numbers as accurately as you are able to, but if the client does
not want to hear about the proper methodology, sometimes there's
little you can do to convince them.

--
-- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
-- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer

On Tue, Sep 25, 2012 at 11:07 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
>
> Annelies replied to me privately that the situation is essentially what
> Stas has surmised. She is well aware that the weighted chi-square test is
> suboptimal. But a study protocol, written by others, prescribed it. We
> can only hope that she can convince the collaborators to accept the
> better test.
>
> The major argument might be: they cannot trust a nominally "significant"
> chi-square p-value, because the better p-value will
> be larger.
>
>
> Steve
>
>
> On Sep 22, 2012, at 2:36 PM, Stas Kolenikov wrote:
>
> On Fri, Sep 21, 2012 at 3:46 PM, Steve Samuels <sjsamuels@gmail.com> wrote:
>>
>>
>> Let me make this clear: the "uncorrected" chi square is the ordinary chi
>> square statistic, but with weighted cell proportions instead of raw
>> proportions. Details are given in the manual.
>>
>> If you used the uncorrected chi square statistic produced in your
>> example, you would have P = 0.11, compared to the more accurate P =
>> 0.19. So now you have me curious: Why does this project "need" a test
>> whose p-value is known to be bad?
>
> I imagine that Annelies' client may only know the name "chi-square
> test for contingency tables".
> If they knew this as an "independence
> test", and knew that the chi-square was an asymptotic approximation,
> they probably would not have cared about what distribution it is
> related to in the software, as long as this is the standard practice
> in the field (which is indeed what Stata offers here).
>
> Annelies would need to read up on Rao-Scott corrections:
> http://www.citeulike.org/user/ctacmo/article/1036968,
> http://www.citeulike.org/user/ctacmo/article/1449501,
> http://www.citeulike.org/user/ctacmo/article/8922395. I am sure these
> references are also in the [SVY] manual.
>
>>
>>
>> Actually, -svy: tab- also shows the uncorrected, weighted, Pearson chi square statistic. It's not appropriate for doing a "chi square test", but there it is.
>>
>> Steve
>>
>>
>> The Design-based F produced by -svy tab- _is_ a corrected weighted Pearson chi square statistic. But because of the complex sampling design, the distribution of the uncorrected version is not chi square. To get a valid p-value, the chi square statistic is converted to an F statistic. For details and references, see the manual entry for "svy: tabulate twoway".
>>
>> Steve
>>
>> On Sep 20, 2012, at 8:38 AM, Dr. Annelies Blom wrote:
>>
>> Dear Steve, dear all,
>>
>> Thank you very much for your answer. I was aware of the svy commands. However,
>> the command does not support the chi2 option. When I estimate the table without
>> the chi2 option, I do get a chi2 estimate, however, according to the output this
>> estimate is "uncorrected". I assume that this means that the weight is not
>> taken into account, right?
>> Stata does calculate a "Design-based F", however, for this project I need the
>> chi2.
>>
>> The output for the tests looks as follows:
>> Pearson:
>>     Uncorrected   chi2(4)            = 7.5233
>>     Design-based  F(3.97, 18286.26)  = 1.5275    P = 0.1916
>>
>> Does anyone know whether I am just misinterpreting the output or how to get
>> Stata to deliver weighted chi2 estimates?
>>
>> Best,
>> Annelies
>>
>>
>> Date: Mon, 10 Sep 2012 17:32:27 -0400
>> From: Steve Samuels <sjsamuels@gmail.com>
>> Subject: Re: st: Chi2 test on weighted data
>>
>> Hello, Annelies. Welcome to Statalist!
>>
>> The command you are seeking is "svy tabulate", and you might have found
>> it by typing "help survey tabulate". For valid test results, you must first
>> specify the survey design with -svyset-. Typing "help survey" will introduce
>> Stata's survey capabilities. There are a number of contributed survey commands,
>> so ask if you need functionality that the built-in commands do not provide.
>>
>> Steve
>>
>>
>> On Sep 10, 2012, at 2:00 PM, Prof. Annelies Blom wrote:
>>
>> Dear all,
>>
>> I have a quick question which I just don't seem to be able to solve.
>>
>> I would like to perform a chi2 test on whether two categorical variables are
>> related. My data are survey data and contain a design weight. Thus, I gathered
>> that I should use pweights. However, I cannot find any command that lets me
>> perform a chi2 test on pweighted data.
>>
>> For example: tabulate does not allow the pweight option and, moreover, only
>> allows frequency weights in combination with the chi2 analysis.
>>
>> What am I missing?
>>
>> Best,
>> Annelies

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
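
The two p-values Steve compares in the thread (P = 0.11 for the uncorrected statistic against P = 0.1916 for the design-based F) can be checked by hand. The following is a minimal sketch in Python, not Stata, and it assumes only the numbers printed in the quoted -svy: tabulate- output. The helper name chi2_sf_even_df is hypothetical and is not part of any library; it uses the closed-form upper tail of the chi-square distribution, which holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For even df = 2m the tail has the closed form
        exp(-x/2) * sum_{k=0}^{m-1} (x/2)**k / k!
    This helper is illustrative only (hypothetical name, not a library call).
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form only holds for positive even df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


# Values copied from the output quoted in the thread.
uncorrected_chi2 = 7.5233
nominal_df = 4
design_based_p = 0.1916  # reported by Stata for the design-based F

# Treat the uncorrected statistic as if it were chi-square with 4 df,
# which is the test the thread describes as inappropriate for survey data.
naive_p = chi2_sf_even_df(uncorrected_chi2, nominal_df)

print(f"naive chi2({nominal_df}) p-value: {naive_p:.4f}")
print(f"design-based F p-value:  {design_based_p:.4f}")
```

The naive p-value comes out at roughly 0.11, in line with the figure Steve quotes, and it is smaller than the design-based 0.1916. That gap is the argument made above: a nominally "significant" uncorrected chi-square cannot be trusted because the better p-value is larger. Inside Stata the same tail probability should be available as chi2tail(4, 7.5233).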

