Re: st: svy proportion - confidence intervals

From	[email protected] (Jeff Pitblado, StataCorp LP)
To	[email protected]
Subject	Re: st: svy proportion - confidence intervals
Date	Wed, 05 Sep 2007 17:54:46 -0500

Hillel Alpert <[email protected]> asks about confidence intervals from
the -svy: proportion- command:

> Could someone advise, please? The confidence intervals with svy: proportion
> (using Stata 10) do not have the heading "Binomial Wald" as do the examples
> in the Survey manual (Stata 9) Are they binomial? If not, can the binomial
> confidence intervals be generated with survey proportions?

In Stata 10, we removed the 'Binomial Wald' heading from -proportion- and
-svy: proportion- output because it was mislabeling how the confidence
intervals (CIs) were being computed.

-proportion- and -svy: proportion- compute the CIs using the following formula

	phat +- t_value * sehat(phat)

where phat is the estimated proportion, t_value is the critical value
(associated with the level of confidence), and sehat(phat) is the estimated
standard error of phat.

'Binomial Wald' CIs are reported by the -ci- command, when the options
-binomial- and -wald- are both specified. These CIs are computed using the
following formula 

	phat +- z_value * sehat(phat)

The only difference between the two formulas above is the critical value:
-ci- takes it from the standard normal distribution, whereas -proportion- and
-svy: proportion- take it from Student's t distribution.
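
To make the difference concrete, here is a minimal numeric sketch in Python
(standard library only) that applies both formulas to the same phat and
sehat(phat).  The values of phat, se, and df are hypothetical; in particular,
the degrees of freedom depend on the data and design and are simply assumed
here.  The t critical value is obtained by numerical integration and
bisection, which is just a convenient way to avoid a lookup table and is not
meant to mirror how Stata computes it.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    scale = math.exp(log_c) / math.sqrt(df * math.pi)
    return scale * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF of Student's t for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(level, df):
    """Two-sided t critical value, found by bracketing and bisection."""
    p = 0.5 + level / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # widen the bracket until it contains p
        lo, hi = hi, hi * 2
    for _ in range(60):           # bisect down to floating-point precision
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z_critical(level):
    """Two-sided standard normal critical value."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def interval(phat, se, crit):
    """phat +- crit * se, the form shared by both formulas."""
    return (phat - crit * se, phat + crit * se)


# Hypothetical inputs, chosen only for illustration.
phat, se, df = 0.30, 0.05, 10

t_ci = interval(phat, se, t_critical(0.95, df))   # t-based interval
z_ci = interval(phat, se, z_critical(0.95))       # normal-based interval

print("t-based CI:", t_ci)
print("z-based CI:", z_ci)
```

With the same standard error, the t-based interval is always the wider of the
two, and the gap shrinks as the degrees of freedom grow.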

Another distinction between -svy: proportion- and -ci- is how sehat(phat) is
computed.  The standard error of a sample proportion from data collected using
a simple random sample design (with replacement or with a very small sampling
fraction), the default assumption for the -ci- command, is computed using

	sehat(phat)	= sqrt(phat(1-phat)/N)

where N is the number of observations.  With a little algebra, one can show
that the usual formula for the standard error of the sample mean (of which
phat is a special case, being the sample mean of 0,1 values) results in the
same value.  Thus -ci- and -proportion- (no need for -svy:- for this type of
SRS design) compute and report the same value for the estimated standard error
of the sample proportion.
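
The algebra can be checked numerically with a short Python sketch (standard
library only).  The 0/1 data below are made up for illustration.  The sketch
assumes the variance in the sample-mean formula is taken with divisor N, in
which case the two standard errors coincide; it also prints the value obtained
with divisor N-1, which is larger by a factor of sqrt(N/(N-1)).  Which divisor
a given command uses is not something this sketch establishes.

```python
import math
import statistics

# Hypothetical 0/1 sample: 30 successes out of 100 observations.
x = [1.0] * 30 + [0.0] * 70
n = len(x)
phat = statistics.mean(x)

# Binomial form of the standard error.
se_binomial = math.sqrt(phat * (1 - phat) / n)

# Standard error of the sample mean, variance taken with divisor N.
se_mean_div_n = math.sqrt(statistics.pvariance(x) / n)

# Same formula, variance taken with divisor N-1.
se_mean_div_n_minus_1 = math.sqrt(statistics.variance(x) / n)

print("binomial form:        ", se_binomial)
print("mean form, divisor N: ", se_mean_div_n)
print("mean form, divisor N-1:", se_mean_div_n_minus_1)
```

The key step is that for 0/1 values x^2 = x, so the mean of x^2 equals phat
and the variance with divisor N reduces to phat - phat^2 = phat(1-phat).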

This property does not hold for other survey designs.  In that case, one is
better off correctly -svyset-ting their data and using -svy: proportion-.  The
last section of -[SVY] variance estimation-, titled 'Confidence intervals',
briefly discusses (with references) why the -svy- prefix uses the t
distribution for computing CIs.

Korn and Graubard (1999, pp. 64-68) propose some alternative methods for
computing CIs for the sample proportion when successes are rare.
Unfortunately, none of these methods are implemented in -svy: proportion-.

Reference:

Korn, E.L. and B.I. Graubard.  1999.  Analysis of Health Surveys.
	New York: Wiley.

--Jeff Pitblado
  [email protected]