Statalist



Re: st: Chi square test unavailable when subpop is used in svy analysis


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Chi square test unavailable when subpop is used in svy analysis
Date   Mon, 18 Aug 2008 12:25:51 -0500

Ángel Rodríguez Laso <angelrlaso@gmail.com> is using -svy: tabulate- with a
-subpop()- option that conditions out a row of the two-way table.  This causes
a zero in the corresponding row margin, preventing the computation of certain
test statistics.  The question is how to get Stata to produce a Pearson
statistic in this case.

> I'm working with Stata 9.2.
> 
> I'm interested in obtaining the corrected chi square test for a
> distribution of two variables from a survey ('p108_n' and 'sexo'), but
> limiting the analysis to a selected group of values  of p108_n (1 to
> 5). I've used the subpop command with the following result:
> 
> 
>  svy, subpop(if p108_n<10):tab p108_n sexo , count nolabel format(%11.1f)
> (running tabulate on estimation sample)
> 
> Number of strata   =        11                  Number of obs      =     12190
> Number of PSUs     =      1266                  Population size    = 12189,962
>                                                 Subpop. no. of obs =     11733
>                                                 Subpop. size       = 11834,102
>                                                 Design df          =      1255
> 
> ----------------------------------
>           |          sexo
>    p108_n |      1       2   Total
> ----------+-----------------------
>         1 |  638,8   708,1  1346,9
>         2 |  581,3   726,2  1307,5
>         3 | 1968,3  2144,6  4112,9
>         4 | 2215,6  1926,3  4141,9
>         5 |  404,2   520,7   924,9
>        10 |    0,0     0,0     0,0
>           |
>     Total | 5808,2  6025,9  11834,1
> ----------------------------------
>   Key:  weighted counts
> 
>   Table contains a zero in the marginals.
>   Statistics cannot be computed.
> 
> 
> Is there any way to get the chi square test I need without deleting
> the p108_n==10 individuals, I mean, keeping them for the calculation
> of the standard errors?

Ángel can use the -se- option to see how changing the -subpop()- option into
an -if- condition will affect the standard error estimates (SEs).   Using the
above example, Ángel can run the following two commands and compare the
resulting SEs.

	. svy, subpop(if p108_n<10) : tab p108_n sexo , count se nolabel

	. svy if p108_n<10 : tab p108_n sexo , count se nolabel

We suspect that the reported SE values will be very similar.  In that case, we
would propose that the Pearson statistic reported by the second command is
reasonable.

In general, we strongly encourage survey data analysts to use -subpop()-
instead of -if- in order to obtain the proper subpop SEs; however, this is
one case where using -if- should result in very similar SEs without
preventing the computation of a reasonable test statistic.

	Cautionary note:  The smaller the subpop sample size relative to the
	overall sample size, the more likely the SE values will differ between
	the two methods.
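
As a rough check on the cautionary note, the subpop share of the sample can be
computed directly.  The lines below are only a sketch: they assume the survey
data are in memory, reuse the variable name from the example above, and count
unweighted observations in the whole dataset, which may differ slightly from
the estimation sample.  In the output posted above, the subpop holds 11733 of
12190 observations, about 96% of the sample.

```stata
* Sketch: unweighted share of observations inside the subpopulation.
* p108_n comes from the example above; the local macro names are illustrative.
quietly count
local n_all = r(N)
quietly count if p108_n < 10
local n_sub = r(N)
display "subpop share of sample = " %5.3f `n_sub'/`n_all'
```

A share close to 1 suggests the -if- and -subpop()- SEs are likely to be
close; the smaller the share, the more it matters to compare them with -se-.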

--Jeff
jpitblado@stata.com