
Re: st: implementation of variance formula

From	Inna Becher <[email protected]>
To	[email protected]
Subject	Re: st: implementation of variance formula
Date	Fri, 13 Mar 2009 16:55:49 +0100

Dear Stas,

the variance formula I use is

var = sum over k=1..K sum over m=1..K of y[k]*y[m]*(p[km]-p[k]*p[m])/(p[k]*p[m]*p[km]),

where y[k] is the variable of interest in network k, y[m] is the same for network m, the summation is over the distinct networks represented in the sample, p[m] is the probability that network m is included in the sample, and p[km] is the pair probability of selection.

I'm not very familiar with Mata yet, but my instinct says the variance computation is best done in Mata...

My dataset is population data, so I do not really have to simulate data, but I'm using -simulate- to draw 1000 replications of a sample and store the means and variances (the variances mentioned above are not yet implemented...).
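The double sum above can be written out directly. What follows is a minimal Python sketch of the arithmetic only; it is not Mata code and not the poster's implementation. The function name `ht_variance`, the `pair_prob` callback, and the convention that the diagonal term uses p[kk] = p[k] are assumptions made for illustration.

```python
def ht_variance(y, p, pair_prob):
    """Double-sum variance estimator over the distinct sampled networks.

    y[k]            : value of the variable of interest for network k
    p[k]            : inclusion probability of network k
    pair_prob(k, m) : joint inclusion probability of networks k and m (k != m)

    Assumption: for k == m the joint probability is taken to be p[k].
    """
    K = len(y)
    total = 0.0
    for k in range(K):
        for m in range(K):
            p_km = p[k] if k == m else pair_prob(k, m)
            total += y[k] * y[m] * (p_km - p[k] * p[m]) / (p[k] * p[m] * p_km)
    return total
```

As a sanity check under simple random sampling of 2 out of 5 clusters (p[k] = 0.4, p[km] = 0.1) with y = [3, 7], the sketch returns 60 up to rounding, which agrees with the textbook N^2 (1 - n/N) s^2 / n for the estimated total.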


Stas Kolenikov wrote:

Tell us more about the problem. As far as I know, sampling networks is
a heck of a mess. To simulate anything, you would need to have almost
perfect understanding of how your network was formed. Any simulation is
only as good as the model used to create the data for that
simulation. And survey bootstrap is a moderately crazy topic. No
textbook covers it sufficiently well, unfortunately. Certainly not
Efron's book; there is a chapter in the Shao & Tu (1995) Springer book,
but it only covers material up to the late 1980s. The newer (and important!)
methods are only out there in the papers.

The Yates-Grundy-Sen variance estimator for the Horvitz-Thompson estimator is

sum over j<k in the sample of (p[j]*p[k] - p[j,k])/p[j,k] * (y[j]/p[j] - y[k]/p[k])^2

If you can write Mata functions to compute the unit and pair
probabilities of selection, you can have pretty compact code for
your variance estimator. You won't have to store the huge matrices of
pairwise selection probabilities, which likely have a well-structured form
if you are talking about cluster sampling.
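The idea of computing probabilities on demand instead of storing a pairwise matrix can be sketched as follows. This is an illustrative Python sketch, not Mata and not code from the thread; `syg_variance`, `unit_prob`, and `pair_prob` are hypothetical names. It uses the sample-based form of the estimator, in which each pair is weighted by 1/p[j,k], and that form is usually stated for fixed-size designs, so whether it applies to a given network sampling design is an assumption to check.

```python
def syg_variance(y, unit_prob, pair_prob):
    """Yates-Grundy-Sen style estimator over pairs j < k of sampled units.

    y               : values for the sampled units
    unit_prob(j)    : inclusion probability of unit j (computed on demand)
    pair_prob(j, k) : joint inclusion probability of units j and k

    No pairwise matrix is stored; each probability is evaluated when needed.
    """
    n = len(y)
    total = 0.0
    for j in range(n):
        for k in range(j + 1, n):
            pj, pk = unit_prob(j), unit_prob(k)
            pjk = pair_prob(j, k)
            total += (pj * pk - pjk) / pjk * (y[j] / pj - y[k] / pk) ** 2
    return total
```

With y = [3, 7], unit probability 0.4 and pair probability 0.1 (simple random sampling of 2 out of 5 clusters), the single pair contributes 0.6 * (7.5 - 17.5)^2, so the sketch returns 60 up to rounding.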

On Wed, Mar 11, 2009 at 3:49 AM, Inna Becher
<[email protected]> wrote:

I can calculate the probability that each network (=cluster) is included
in the sample. I can also calculate the probability that each pair of
selected clusters is included in the sample.
My problem is that these probabilities have to be stored somewhere. Should
it be a matrix? I have not yet worked with matrices to calculate variances.
The version of the H-T estimator I need is not implemented in -svy-.
I wrote an ado-file for the sampling design that I need and implemented the
H-T estimator for the mean, but not for the variance.
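One possible way to hold the probabilities is a symmetric K x K matrix with the unit probabilities on the diagonal and the pair probabilities off the diagonal. The Python sketch below illustrates that layout only. It assumes adaptive cluster sampling with an initial simple random sample of n units drawn without replacement from N units (the thread does not name the design, so this is an assumption), and the function name `acs_inclusion_matrix` and its arguments are hypothetical.

```python
from math import comb

def acs_inclusion_matrix(sizes, N, n):
    """Matrix of inclusion probabilities for K sampled networks.

    sizes[k] : number of population units in network k
    N        : number of units in the population
    n        : size of the initial simple random sample (without replacement)

    Assumed design: a network is included when at least one of its units
    falls in the initial sample. Requires sizes[k] + sizes[m] <= N.
    """
    total = comb(N, n)
    K = len(sizes)
    P = [[0.0] * K for _ in range(K)]
    for k in range(K):
        miss_k = comb(N - sizes[k], n)      # samples that miss network k
        P[k][k] = 1.0 - miss_k / total      # unit probability on the diagonal
        for m in range(k + 1, K):
            miss_m = comb(N - sizes[m], n)
            miss_both = comb(N - sizes[k] - sizes[m], n)
            # P(both included) = 1 - P(miss k or miss m)
            P[k][m] = P[m][k] = 1.0 - (miss_k + miss_m - miss_both) / total
    return P
```

For example, with N = 10, n = 2 and two networks of sizes 1 and 3, the diagonal entries are 0.2 and 24/45 and the off-diagonal entry is 3/45, the chance that the two initial units hit one network each.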
