# st: testing equality of means for survey data

From: DEEPANKAR BASU
To: statalist@hsphsun2.harvard.edu
Subject: st: testing equality of means for survey data
Date: Fri, 23 Nov 2007 17:51:58 -0500

```I am using Stata 8 with survey data (DHS data) to study the fertility behaviour of families. I have complete birth-history data for each family in the sample. I wish to test the following hypothesis: girls have, on average, a larger number of siblings.

This is how I proceed. I calculate the number of boys and girls in each family (*nboy* and *ngirl*); then, I do:

quietly gen alive = nboy + ngirl
quietly gen sibg = (alive - 1) if ngirl > 0
quietly gen sibb = (alive - 1) if nboy > 0
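
* Illustration only (hypothetical families, not DHS data): a minimal sketch
* of what the three commands above produce, including the missing values.
* Run it in a separate session, since -clear- drops the data in memory.
clear
input nboy ngirl
2 1
3 0
0 2
end
gen alive = nboy + ngirl
gen sibg = (alive - 1) if ngirl > 0
gen sibb = (alive - 1) if nboy > 0
list
* Family 1: sibg = 2, sibb = 2. Family 2: sibg missing, sibb = 2.
* Family 3: sibg = 1, sibb missing.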

Thus, *sibg* is the number of siblings for girls and *sibb* is the number of siblings for boys. Then, I do:

gen smpwt = v005/1000000
svyset [pweight=smpwt], psu(v021) strata(v022)

svymean sibg, subpop(ngirl)
matrix t1 = e(b)
matrix t2 = e(V)
local t11 = e(N)

svymean sibb, subpop(nboy)
matrix t3 = e(b)
matrix t4 = e(V)
local t33 = e(N)

gen sibeff = t1[1,1] - t3[1,1]
local g1 = (t1[1,1] - t3[1,1])/sqrt((t2[1,1]/`t11')+(t4[1,1]/`t33'))
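
* A hedged sketch, not part of the computation above: e(V) after svymean is
* the estimated variance of the mean itself (the squared standard error), so
* one alternative is to use it without dividing by e(N). This assumes the two
* estimates are independent, which may not hold because families with both
* boys and girls enter both subpopulations. g2 is a hypothetical name.
local g2 = (t1[1,1] - t3[1,1])/sqrt(t2[1,1] + t4[1,1])
display "difference = " t1[1,1] - t3[1,1] "   t (e(V) used directly) = " `g2'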

Thus, *sibeff* gives me the difference in the average number of siblings for girls and boys, and *g1* gives me the t-statistic for testing whether *sibeff* is significantly different from zero.

The t-statistic I get is much larger than I expected; it is much smaller if I do not correct for the survey design and simply assume that I have a simple random sample. This is making me a little suspicious. My questions:

1) Am I making any mistake in my computation or reasoning?
2) Is there a better way to conduct this t-test?

I looked at: http://www.ats.ucla.edu/STAT/stata/faq/svyttest.htm
but did not find it useful.