# Re: st: implementation of variance formula

 Wed, 11 Mar 2009

I can calculate the probability that each network (= cluster) is included in the sample. I can also calculate, for each pair of selected clusters, the probability that both are included in the sample. My problem is that these probabilities have to be stored somewhere. Should that be a matrix? I have not yet worked with matrices to calculate variances. The version of the H-T estimator I need is not implemented in -svy-. I wrote an ado-file for the sampling design I need and implemented the H-T estimator for the mean, but not for the variance.
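On the storage question, one possible layout is sketched below in Python, purely as a language-neutral illustration of the index arithmetic. Because the joint probability for clusters i and j is the same as for j and i, only the m(m-1)/2 entries above the diagonal need to be kept, and they fit in a flat list. The names `pair_index`, `pack_joint`, and the callback `joint_prob` are hypothetical; `joint_prob` stands in for whatever design-specific calculation produces the pairwise probability.

```python
def pair_index(i, j, m):
    """Position of the unordered pair (i, j), i != j, in a flat list that
    stores the strict upper triangle of an m x m matrix row by row.
    Clusters are numbered 0 .. m-1."""
    if i == j:
        raise ValueError("no joint probability is stored for i == j")
    if i > j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
    # Rows 0 .. i-1 hold i*m - i*(i+1)/2 entries in total
    return i * m - i * (i + 1) // 2 + (j - i - 1)


def pack_joint(m, joint_prob):
    """Build the flat list by calling joint_prob(i, j) once per pair i < j.
    joint_prob is a hypothetical, design-specific function."""
    return [joint_prob(i, j) for i in range(m) for j in range(i + 1, m)]
```

With this layout, `packed[pair_index(i, j, m)]` returns the stored joint probability for any two distinct clusters, in either order, using roughly half the memory of a full m x m matrix.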
Steven Samuels wrote:
You don't say which version of the H-T estimator you want; there are many versions. The estimates themselves depend on knowing, for each cluster, the probability that it would be included in the sample. This quantity must be supplied with the data set. It might or might not be a simple function of the cluster "size" measure.
The formulas for the variance of the classical H-T estimators are also functions of the probability that each pair of selected clusters would be included in the sample; if there are m clusters in a stratum, there are m(m-1)/2 of these probabilities. Were they supplied with the data set? Even if you have them, you would still have to write your own (probably Mata) code to use them.
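To make the role of the pairwise probabilities concrete, here is a minimal sketch in Python of one classical form, the Horvitz-Thompson variance estimator for an estimated total within a single stratum. It is an illustration under stated assumptions, not necessarily the version wanted here: the function names `ht_total` and `ht_variance` are hypothetical, the joint probabilities are assumed to be available as a symmetric m x m nested list, and all of them are assumed to be strictly positive.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate of the population total."""
    return sum(yi / p for yi, p in zip(y, pi))


def ht_variance(y, pi, joint):
    """Horvitz-Thompson variance estimator for the estimated total.

    y     : observed totals for the m sampled clusters
    pi    : first-order inclusion probabilities, one per cluster
    joint : m x m symmetric nested list; joint[i][j] is the probability
            that clusters i and j are both sampled (diagonal is ignored)
    """
    m = len(y)
    # Single-cluster terms: (1 - pi_i) / pi_i^2 * y_i^2
    v = sum((1 - pi[i]) / pi[i] ** 2 * y[i] ** 2 for i in range(m))
    # Pair terms: loop over the m(m-1)/2 unordered pairs, counting each twice
    for i in range(m):
        for j in range(i + 1, m):
            pij = joint[i][j]
            v += 2 * (pij - pi[i] * pi[j]) / (pij * pi[i] * pi[j]) * y[i] * y[j]
    return v
```

As a sanity check, under simple random sampling of n = 2 clusters from N = 4 (so every pi is 1/2 and every joint probability is 1/6), the result for y = (1, 3) agrees with the textbook value N^2 (1 - n/N) s^2 / n = 8.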
There is an alternative. Stata's survey commands produce modified H-T estimates. You can obtain appropriate standard errors if you -svyset- your data according to the design.
Mar 10, 2009

```Dear statalisters,

I have to implement a formula for the variance of a modified Horvitz-Thompson estimator. My dataset is very large, so I cannot create a lot of new variables in order to do that. Should I use Mata? Are there any examples of implementing variance formulas in Stata?
