# Re: st: Variance of ratio estimator

From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Variance of ratio estimator
Date: Mon, 30 Oct 2006 10:21:19 -0600

```Balduccio Attacchini asked for the formula Stata's -svy: ratio- command uses
for estimating the variance for a single stage design with a finite population
correction:

> I'd like to know the formula adopted by Stata in calculating the
> variance estimator for ratio estimator in stratified single-stage design
> and with a finite population correction.
> In particular, when I set
> svyset _n, strata(..) fpc(..)
> and then I run
> svy: ratio y_total/x_total
> The command returns a value labelled 'Linearized Std. Err.'
> The formula shown on page 263 of the Stata Survey Data manual (Release 9)
> does not answer my question.
> The manual does not cover the case of a stratified single-stage
> design with a finite population correction (even the link to page 259 is
> unclear, since it refers only to the variance estimation of a total).
> So what is the formula behind the Linearized Std. Err. returned by the
> following commands?
> svyset _n, strata(..) fpc(..)
> svy: ratio y_total/x_total

The formula Stata uses to estimate the variance of a total estimated from a
single stage design (including stratification and an FPC) is given by (1) on
page 259 of '[SVY] variance estimation'.

This formula is the foundation for estimating the variance for all other point
estimates via linearization.

In the case of the ratio estimator, the score variable (called the
'linearized variable' by Deville (1999) and Demnati and Rao (2004))
is plugged into (1) in place of the 'y' values, as described under the heading
'Ratios and other functions of survey data' on page 262.

The estimator for the population ratio is

Rhat = Yhat/Xhat

where Yhat is the population total estimator for 'Y' and Xhat is similarly
defined for 'X'.  The score variable for Rhat is

z_j = (y_j - Rhat * x_j)/Xhat

These z_j values are used to compute the weighted PSU totals (y_hi) and
stratum means (ybar_h) used in (1) on page 259 of '[SVY] variance estimation'.
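
As a sketch of what this substitution amounts to, assuming (1) is the usual
stratified single-stage variance formula with an FPC (the symbols n_h, N_h,
f_h, z_hi, and zbar_h are labels chosen here for illustration and may differ
from the manual's notation), the variance estimator for Rhat can be written as

Vhat(Rhat) = sum_h (1 - f_h) * n_h/(n_h - 1) * sum_i (z_hi - zbar_h)^2

where h indexes the strata, i indexes the sampled PSUs within stratum h, n_h is
the number of sampled PSUs in stratum h, f_h = n_h/N_h is the stratum sampling
rate supplied through the FPC, z_hi is the weighted total of the z_j values in
PSU i of stratum h (taking the place of y_hi), and zbar_h is the mean of the
z_hi within stratum h (taking the place of ybar_h).  The 'Linearized Std. Err.'
would then be the square root of Vhat(Rhat).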

Balduccio's -svyset- indicates a special case of the single stage design; the
'_n' specifies that the PSUs are the individual observations instead of
clusters.   Thus the weighted PSU totals to be plugged into (1) are simply the
individually weighted values of the score variable.
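
The steps above can be checked by hand along the following lines.  This is
only a sketch, not the code -svy: ratio- runs internally.  It assumes
hypothetical variables pw (sampling weight), strat (stratum identifier), and
Nh (stratum population size, as given to fpc()), no missing values in y_total
or x_total, and at least two observations in every stratum.

    * pw, strat, and Nh are hypothetical names, not taken from the question
    svyset _n [pweight=pw], strata(strat) fpc(Nh)
    svy: ratio y_total/x_total

    * Rhat = Yhat/Xhat from the weighted totals
    generate double wy = pw*y_total
    generate double wx = pw*x_total
    quietly summarize wy
    scalar Yhat = r(sum)
    quietly summarize wx
    scalar Xhat = r(sum)
    scalar Rhat = Yhat/Xhat

    * individually weighted score values (each observation is its own PSU)
    generate double z = pw*(y_total - Rhat*x_total)/Xhat

    * stratum means and sample sizes, then each observation's term in (1)
    bysort strat: egen double zbar = mean(z)
    bysort strat: generate long nh = _N
    generate double v = (1 - nh/Nh)*nh/(nh - 1)*(z - zbar)^2
    quietly summarize v
    display "Linearized Std. Err. = " sqrt(r(sum))

If the assumptions hold, the displayed value should agree with the standard
error reported by -svy: ratio-.  If no weights were -svyset-, setting pw to 1
for every observation gives the unweighted case.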

References:

Demnati, A.  and J. N. K. Rao. 2004.  Linearization variance estimators for
survey data.  Survey Methodology 30: 17--26.

Deville, J.-C. 1999.  Variance estimation for complex statistics and
estimators:  Linearization and residual techniques.  Survey
Methodology 25: 193--203.

--Jeff