Statalist The Stata Listserver



Re: st: Variance of ratio estimator


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Variance of ratio estimator
Date   Mon, 30 Oct 2006 10:21:19 -0600

Balduccio Attacchini asked for the formula Stata's -svy: ratio- command uses
for estimating the variance for a single stage design with a finite population
correction:

> I'd like to know the formula adopted by Stata in calculating the
> variance estimator for ratio estimator in stratified single-stage design 
> and with a finite population correction.
> In particular, when I set
> svyset _n, strata(..) fpc(..)
> and then I run
> svy: ratio y_total/x_total
> The command returns a value labelled 'Linearized Std. Err.'
> The formula shown on page 263 of Survey Data, Release 9 (Stata) does not
> answer my question.
> The manual does not cover the case of a stratified single-stage
> design with a finite population correction (even the link to page 259 is
> unclear, since it refers only to the variance estimation of a total).
> So what is the formula behind the Linearized Std. Err. returned by the
> following commands?
> svyset _n, strata(..) fpc(..)
> svy: ratio y_total/x_total

The formula Stata uses to estimate the variance of a total estimated from a
single stage design (including stratification and an FPC) is given by (1) on
page 259 of '[SVY] variance estimation'.

This formula is the foundation for estimating the variance for all other point
estimates via linearization.

In the case of the ratio estimator, the score variable, also known as the
'linearized variable' according to Deville (1999) and Demnati and Rao (2004),
is plugged into (1) in place of the 'y' values, as described under the heading
'Ratios and other functions of survey data' on page 262.

The estimator for the population ratio is

	Rhat = Yhat/Xhat

where Yhat is the population total estimator for 'Y' and Xhat is similarly
defined for 'X'.  The score variable for Rhat is

	z_j = (y_j - Rhat * x_j)/Xhat

These z_j values are used to compute the weighted PSU totals (y_hi) and
stratum means (ybar_h) used in (1) on page 259 of '[SVY] variance estimation'.
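
For readers without the manual at hand, the substitution can be written out
as below.  This is a sketch that assumes (1) has the usual form of the
stratified single-stage variance estimator for a total; it is not copied from
the manual.  The symbols L (number of strata), n_h (sampled PSUs in stratum
h), N_h (population PSUs in stratum h), f_h (sampling rate) and w_hij
(sampling weight) are notation assumed here, and z_hi and zbar_h play the
roles of y_hi and ybar_h.

```latex
\widehat{V}(\widehat{R})
  = \sum_{h=1}^{L} (1 - f_h)\,\frac{n_h}{n_h - 1}
    \sum_{i=1}^{n_h} \left( z_{hi} - \bar{z}_h \right)^2 ,
\qquad
z_{hi} = \sum_{j} w_{hij}\, z_{hij} ,
\quad
\bar{z}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} z_{hi} ,
\quad
f_h = \frac{n_h}{N_h} ,
```

with each z_hij computed from the score variable formula for Rhat given
above.  The 'Linearized Std. Err.' would then be the square root of this
quantity.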

Balduccio's -svyset- indicates a special case of the single stage design; the
'_n' specifies that the PSUs are the individual observations instead of
clusters.   Thus the weighted PSU totals to be plugged into (1) are simply the
individually weighted values of the score variable.
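
The calculation for this special case can be sketched numerically.  The
Python function below is an illustration only, not Stata's implementation.
It assumes that (1) has the usual stratified single-stage form
sum_h (1 - f_h) * n_h/(n_h - 1) * sum_i (y_hi - ybar_h)^2, that every
observation is its own PSU, and that the fpc() variable holds stratum
population sizes N_h rather than sampling rates.  The function name and
arguments are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def ratio_linearized_se(y, x, w, stratum, pop_size):
    """Return (Rhat, SE) for a stratified design with one observation per PSU.

    y, x     : observed values
    w        : sampling weights
    stratum  : stratum identifier for each observation
    pop_size : dict mapping stratum id to N_h (assumed to be a population
               count, not a sampling rate)
    """
    # Weighted total estimators and the ratio estimator Rhat = Yhat/Xhat.
    yhat = sum(wj * yj for wj, yj in zip(w, y))
    xhat = sum(wj * xj for wj, xj in zip(w, x))
    rhat = yhat / xhat

    # Weighted score values w_j * z_j, grouped by stratum.  With one
    # observation per PSU these are the "weighted PSU totals".
    by_stratum = defaultdict(list)
    for yj, xj, wj, h in zip(y, x, w, stratum):
        z = (yj - rhat * xj) / xhat
        by_stratum[h].append(wj * z)

    variance = 0.0
    for h, totals in by_stratum.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than 2 PSUs")
        mean_h = sum(totals) / n_h                    # stratum mean of PSU totals
        ss_h = sum((t - mean_h) ** 2 for t in totals)
        f_h = n_h / pop_size[h]                       # sampling rate for the FPC
        variance += (1.0 - f_h) * n_h / (n_h - 1) * ss_h

    return rhat, sqrt(variance)


# Toy data: one stratum, 2 of 4 units sampled, each with weight 2.
rhat, se = ratio_linearized_se(
    y=[2.0, 4.0], x=[1.0, 1.0], w=[2.0, 2.0],
    stratum=["a", "a"], pop_size={"a": 4},
)
```

With x fixed at 1 the ratio is just the weighted mean, so in this toy case
the result can be checked against the textbook simple random sampling
formula (1 - f) * s^2 / n = 0.5 * 2 / 2 = 0.5, giving a standard error of
sqrt(0.5).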

References:

Demnati, A.  and J. N. K. Rao. 2004.  Linearization variance estimators for
	survey data.  Survey Methodology 30: 17--26.

Deville, J.-C. 1999.  Variance estimation for complex statistics and
	estimators:  Linearization and residual techniques.  Survey
	Methodology 25: 193--203.

--Jeff
jpitblado@stata.com