
From: "Austin Nichols" <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: testing equality of means for survey data
Date: Fri, 23 Nov 2007 18:23:33 -0500

DEEPANKAR BASU <basu.15@osu.edu>:

Quite wrong, I'm afraid: your variance calculation is incorrect, and it looks like your subpop option does not restrict to boys (or girls), but to children living in households with any boys (or any girls). Try instead:

    svyreg alive female
    svymean alive, by(female)
    test [alive]0=[alive]1

in Stata 8, or in Stata 10:

    svy: reg alive female
    svy: mean alive, over(female)
    test [alive]0=[alive]1

for a t-test of the difference in mean "alive" siblings (the coefficient on female in the regressions is the difference between boys and girls, so the p-value in that row of output offers a test of no difference). Since the dependent variable is a count, a better specification is

    svypoisson alive female

though the tests should produce near-identical results in practice.

On 11/23/07, DEEPANKAR BASU <basu.15@osu.edu> wrote:
> I am working with Stata 8. I am working with survey data (DHS data) and am studying the fertility behaviour of families. I have complete birth history data for each family in the sample. I wish to test the following hypothesis: girls have, on average, a larger number of siblings.
>
> This is how I proceed. I calculate the number of boys and girls in each family (*nboy* and *ngirl*); then, I do:
>
> quietly gen alive = nboy + ngirl
> quietly gen sibg = (alive - 1) if ngirl > 0
> quietly gen sibb = (alive - 1) if nboy > 0
>
> Thus, *sibg* is the number of siblings for girls and *sibb* is the number of siblings for boys. Then, I do:
>
> gen smpwt = v005/1000000
> svyset [pweight=smpwt], psu(v021) strata(v022)
>
> svymean sibg, subpop(ngirl)
> matrix t1 = e(b)
> matrix t2 = e(V)
> local t11 = e(N)
>
> svymean sibb, subpop(nboy)
> matrix t3 = e(b)
> matrix t4 = e(V)
> local t33 = e(N)
>
> gen sibeff = t1[1,1] - t3[1,1]
> local g1 = (t1[1,1] - t3[1,1])/sqrt((t2[1,1]/`t11')+(t4[1,1]/`t33'))
>
> Thus, *sibeff* gives me the difference in the average number of siblings for girls and boys, and *g1* gives me the t-statistic for testing whether *sibeff* is significantly different from zero.
>
> The t-statistic I get is much larger than I expected; it is also much smaller if I do not correct for the survey design and simply assume that I have a simple random sample. This is making me a little suspicious. My questions:
>
> 1) Am I making any mistake in my computation or reasoning?
> 2) Is there a better way to conduct this t-test?
>
> I looked at: http://www.ats.ucla.edu/STAT/stata/faq/svyttest.htm
> but did not find it useful.
>
> Thanks in advance.
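On the variance calculation flagged in the reply: after a survey mean command, e(V) already holds the estimated variance-covariance matrix of the means themselves, so dividing it by e(N) again shrinks the standard error and inflates the t-statistic. A minimal sketch of the hand computation follows. It assumes one record per child, a 0/1 indicator female as used in the reply, a design already declared with svyset, and Stata 10-era syntax in which the two columns of e(b) are ordered female==0 then female==1. That ordering is an assumption worth checking with matrix list e(b), and later Stata releases name these coefficients differently.

```stata
* Sketch only: assumes child-level data, a 0/1 variable female, and a prior svyset
svy: mean alive, over(female)

* e(b) is a 1 x 2 row vector of estimated means
matrix b = e(b)
* e(V) is the 2 x 2 design-based variance-covariance matrix of those means
matrix V = e(V)

* Difference in means (assumed column order: female==0, then female==1)
scalar diff = b[1,2] - b[1,1]

* Var(m1 - m0) = V11 + V22 - 2*V12, with no further division by e(N)
scalar se = sqrt(V[1,1] + V[2,2] - 2*V[1,2])

display "difference = " diff "   se = " se "   t = " diff/se
```

The square of this t should match the F statistic reported by test [alive]0=[alive]1 run after the same command, which is a quick way to confirm that the column order and the covariance term were handled as intended.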


