
From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Chi square test unavailable when subpop is used in svy analyisis
Date: Mon, 18 Aug 2008 12:25:51 -0500

Ángel Rodríguez Laso <angelrlaso@gmail.com> is using -svy: tabulate- with a
-subpop()- option that conditions out a row of the two-way table. This causes
a zero in the corresponding row margin, preventing the computation of certain
test statistics. The question is how to get Stata to produce a Pearson
statistic in this case.

> I'm working with Stata 9.2.
>
> I'm interested in obtaining the corrected chi square test for a
> distribution of two variables from a survey ('p108_n' and 'sexo'), but
> limiting the analysis to a selected group of values of p108_n (1 to
> 5). I've used the subpop command with the following result:
>
>
> svy, subpop(if p108_n<10):tab p108_n sexo , count nolabel format(%11.1f)
> (running tabulate on estimation sample)
>
> Number of strata = 11          Number of obs      = 12190
> Number of PSUs   = 1266        Population size    = 12189,962
>                                Subpop. no. of obs = 11733
>                                Subpop. size       = 11834,102
>                                Design df          = 1255
>
> ----------------------------------
>           |          sexo
>    p108_n |      1       2   Total
> ----------+-----------------------
>         1 |  638,8   708,1  1346,9
>         2 |  581,3   726,2  1307,5
>         3 | 1968,3  2144,6  4112,9
>         4 | 2215,6  1926,3  4141,9
>         5 |  404,2   520,7   924,9
>        10 |    0,0     0,0     0,0
>           |
>     Total | 5808,2  6025,9 11834,1
> ----------------------------------
>   Key:  weighted counts
>
> Table contains a zero in the marginals.
> Statistics cannot be computed.
>
>
> Is there any way to get the chi square test I need without deleting
> the p108_n==10 individuals, I mean, keeping them for the calculation
> of the standard errors?

Ángel can use the -se- option to see how changing the -subpop()- option into
an -if- condition will affect the standard error estimates (SEs).

Using the above example, Ángel can run the following two commands and compare
the resulting SEs.

    . svy, subpop(if p108_n<10) : tab p108_n sexo , count nolabel

    . svy if p108_n<10 : tab p108_n sexo , count nolabel

We suspect that the reported SE values will be very similar. In that case, we
would propose that the Pearson statistic reported by the second command is
reasonable.

In general, we strongly encourage survey data analysts to use -subpop()-
instead of the -if- in order to obtain the proper subpop SEs; however, this is
one case where using -if- should result in essentially similar SEs without
preventing the computation of a reasonable test statistic.

Cautionary note: The smaller the subpop sample size relative to the overall
sample size, the more likely the SE values will differ between the two
methods.

--Jeff
jpitblado@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
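
The comparison suggested in the reply can be sketched as follows. This is a minimal sketch, not the exact commands from the post. It assumes the data have already been declared with -svyset-, it adds the -se- option explicitly so that standard errors are shown next to the weighted counts, and it writes the -if- restriction on the -tabulate- command itself, which is intended to be equivalent to the second command in the reply. The variable names p108_n and sexo are taken from the original question.

```stata
* Assumption: the survey design was set earlier with -svyset-.

* 1) Subpopulation analysis: every observation stays in the estimation
*    sample, so the SEs reflect the full design. The p108_n==10 row is
*    still shown with zero counts, so the test statistics are not computed.
svy, subpop(if p108_n<10): tabulate p108_n sexo, count se nolabel format(%11.1f)

* 2) -if- restriction: observations with p108_n>=10 are dropped from the
*    estimation sample, so the empty row disappears and the default
*    Pearson statistic can be reported.
svy: tabulate p108_n sexo if p108_n<10, count se nolabel format(%11.1f)
```

If the standard errors in the two tables are close, the reply's reasoning is that the test statistic from the second command is a reasonable one to report. If they differ noticeably, the -subpop()- results remain the ones to trust for the SEs.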

