# Re: st: svy proportion - confidence intervals

From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy proportion - confidence intervals
Date: Wed, 05 Sep 2007 17:54:46 -0500

```
Hillel Alpert <HALPERT@hsph.harvard.edu> asks about confidence intervals from
the -svy: proportion- command:

> Could someone advise, please? The confidence intervals with svy: proportion
> (using Stata 10) do not have the heading "Binomial Wald" as do the examples
> in the Survey manual (Stata 9) Are they binomial? If not, can the binomial
> confidence intervals be generated with survey proportions?

In Stata 10, we removed the 'Binomial Wald' heading from -proportion- and
-svy: proportion- output because it was mislabeling how the confidence
intervals (CIs) were being computed.

-proportion- and -svy: proportion- compute the CIs using the following formula

phat +- t_value * sehat(phat)

where phat is the estimated proportion, t_value is the critical value from
Student's t distribution (associated with the level of confidence), and
sehat(phat) is the estimated standard error of phat.

'Binomial Wald' CIs are reported by the -ci- command, when the options
-binomial- and -wald- are both specified. These CIs are computed using the
following formula

phat +- z_value * sehat(phat)

The distinction between the above two formulas is that the second takes its
critical value from the standard normal distribution instead of Student's t.
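
As a rough illustration (the numbers here are made up, not taken from any
dataset), suppose phat = 0.20, sehat(phat) = 0.04, and there are 99 degrees of
freedom.  The two critical values for a 95% CI can be displayed with

    . display invnormal(0.975)
    . display invttail(99, 0.025)

which are roughly 1.96 and 1.98, so the two intervals would be about

    0.20 +- 1.96 * 0.04  =  (0.122, 0.278)
    0.20 +- 1.98 * 0.04  =  (0.121, 0.279)

The t-based interval is slightly wider, and the gap grows as the degrees of
freedom shrink.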

Another distinction between -svy: proportion- and -ci- is how sehat(phat) is
computed.  The standard error of a sample proportion from data collected using
a simple random sample design (with replacement or with a very small sampling
fraction), the default assumption for the -ci- command, is computed using

sehat(phat) = sqrt(phat*(1-phat)/N)

where N is the number of observations.  With a little algebra, one can show
that the usual formula for the standard error of the sample mean (of which
phat is a special case, being the sample mean of 0,1 values) results in the
same value.  Thus -ci- and -proportion- (no need for -svy:- for this type of
SRS design) compute and report the same value for the estimated standard error
of the sample proportion.
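
As a quick check of the SRS formula with made-up numbers, take phat = 0.20 and
N = 100:

    . display sqrt(0.20*0.80/100)

which works out to sqrt(0.0016) = 0.04.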

This property does not hold for other survey designs.  In those cases, one is
better off correctly -svyset-ting the data and using -svy: proportion-.  The
last section of -[SVY] variance estimation-, titled 'Confidence intervals',
briefly discusses (with references) why the -svy- prefix uses the t
distribution for computing CIs.
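
A minimal sketch of that workflow, where psuid, stratid, and sampwt are
hypothetical names for the PSU, stratum, and sampling-weight variables, and y
is a hypothetical 0/1 outcome variable:

    . svyset psuid [pweight=sampwt], strata(stratid)
    . svy: proportion y

The actual -svyset- specification depends on how the survey was designed, so
treat the above only as a template.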

Korn and Graubard (1999, pp. 64-68) propose some alternative methods for
computing CIs for the sample proportion when successes are rare.
Unfortunately, none of these methods are implemented in -svy: proportion-.

Reference:

Korn, E.L. and B.I. Graubard.  1999.  Analysis of Health Surveys.
New York: Wiley.