# st: svy and hypergeometric distribution

From: Alexander Cavallo <[email protected]>
To: [email protected]
Subject: st: svy and hypergeometric distribution
Date: Mon, 17 Jan 2005 17:19:41 -0600

```I have some statistics questions.  I am referring to Cochran's famous book,
"Sampling Techniques".

1.  Does Stata have a routine to compute a confidence interval for a
proportion using the hypergeometric distribution?
2.  Does svyprop use the normal approximation to compute CIs for the
proportion?
3.  How can I generalize the exact hypergeometric calculation to account
for weights?
4.  How can I generalize the exact hypergeometric calculation to include
weights and strata?

Background - normal approximation for CI for proportion
To compute a confidence interval for a proportion under simple random
sampling, the normal approximation can be used.  The confidence interval
is
p +- {t * sqrt(1-f) * sqrt[p * q / (n-1)] + 1/(2*n)}
where
p is the sample proportion
q = 1 - p
t is the standard normal z-score for the desired confidence level
n is sample size
f = n / N is the sampling rate

Background - hypergeometric distribution for CI for proportion
I understand that if p is too close to 0 or 1, then I should use the
hypergeometric distribution to compute an exact confidence interval (which
may be non-symmetric).  Let H(x, n-x, A, N-A) be the hypergeometric
probability of finding x occurrences in a sample of size n drawn from a
population of size N that contains A occurrences:
H(x, n-x, A, N-A) = [A choose x] * [(N-A) choose (n-x)] / [N choose n]
The upper limit of the 95% CI for the population number of occurrences, A,
is the smallest integer Au such that
sum from j=0 to x {H(j, n-j, Au, N-Au)} <= 0.025.
The lower limit of the 95% CI is the largest integer Al such that
sum from j=x to n {H(j, n-j, Al, N-Al)} <= 0.025.

Proposal - weighted hypergeometric calculation
Here is what I propose for the svy hypergeometric calculation.  Replace x
(the number of occurrences) with its weighted version, and replace n with
the sum of the weights.  Round the new x and the new n to integers and
solve the same equations for Au and Al.  But how would I extend this to a
stratified analysis?

Thanks!

--Alex Cavallo

Navigant Consulting, Inc.

```
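
The two unweighted calculations described in the message can be sketched as follows. This is a minimal illustration in Python rather than Stata, and it is not an answer to the questions about weights or strata. The function names (`hypergeom_pmf`, `lower_tail`, `upper_tail`, `exact_limits`, `normal_approx_ci`) are hypothetical, the search over A is a brute-force scan that assumes n <= N, and t = 1.96 is assumed for a 95% interval. The limits follow the definitions exactly as stated in the message: Au is the smallest integer whose lower-tail sum is at most 0.025, and Al is the largest integer whose upper-tail sum is at most 0.025.

```python
from math import comb, sqrt


def hypergeom_pmf(j, n, A, N):
    """H(j, n-j, A, N-A): probability of j occurrences in a sample of n
    drawn from a population of N that contains A occurrences."""
    if j < 0 or j > A or n - j < 0 or n - j > N - A:
        return 0.0  # outcome is impossible for this A
    return comb(A, j) * comb(N - A, n - j) / comb(N, n)


def lower_tail(x, n, A, N):
    """Sum from j=0 to x of H(j, n-j, A, N-A)."""
    return sum(hypergeom_pmf(j, n, A, N) for j in range(0, x + 1))


def upper_tail(x, n, A, N):
    """Sum from j=x to n of H(j, n-j, A, N-A)."""
    return sum(hypergeom_pmf(j, n, A, N) for j in range(x, n + 1))


def exact_limits(x, n, N, alpha=0.05):
    """Return (Al, Au) as defined in the message.

    Either value is None when no integer in 0..N satisfies the
    condition (this happens for Au when x == n and for Al when x == 0).
    """
    half = alpha / 2
    # Au: smallest A whose lower-tail sum is <= alpha/2 (scan upward).
    Au = next((A for A in range(0, N + 1)
               if lower_tail(x, n, A, N) <= half), None)
    # Al: largest A whose upper-tail sum is <= alpha/2 (scan downward).
    Al = next((A for A in range(N, -1, -1)
               if upper_tail(x, n, A, N) <= half), None)
    return Al, Au


def normal_approx_ci(p, n, N, t=1.96):
    """Normal approximation with finite population correction and
    the 1/(2n) continuity correction, as in the background formula."""
    f = n / N  # sampling rate
    half_width = t * sqrt(1 - f) * sqrt(p * (1 - p) / (n - 1)) + 1 / (2 * n)
    return p - half_width, p + half_width
```

For example, `exact_limits(2, 10, 50)` returns limits on the population count A for x = 2 occurrences in a sample of n = 10 from N = 50; dividing each limit by N gives limits on the proportion. The scan costs on the order of N * n binomial-coefficient evaluations, so a bisection search would be a better fit for large populations.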