Statalist The Stata Listserver



Re: st: svy jackknife problems


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: svy jackknife problems
Date   Mon, 13 Mar 2006 15:34:53 -0600

Jan Teorell <jan.teorell@pol.gu.se> is having trouble reproducing hand-coded
jackknife variance estimates using -svy jackknife-:

> I'm having trouble with the svy jackknife command. I had earlier implemented
> a crude jackknife estimator myself, tailored for my particular complex
> survey design including both stratification and multistage cluster sampling.
> With Stata 9 I presumably should not need to use this homemade program
> anymore, since svy jackknife should do the job for me. However, the results
> from my estimator and svy jackknife differ for reasons I am not quite clear
> about.

> To take this down to a more concrete level, I tested the two commands on a
> small subsample of my survey, using only 2 strata with 2 PSUs each. I then
> ran a simple regression, with the following estimates per replication (where
> b(h,j)=the regression coefficient received when excluding PSU j from stratum
> h):

> b(1,1)=.4230769
> b(1,2)=.5417409 
> b(2,1)=.5537783
> b(2,2)=.4259508 

> The estimate from the entire sample is: b=.4866513

> Plugging these estimates into the formula for the mse estimator (Survey
> Data Manual, p. 266) yields:

> 1/2*[(.4230769-.4866513)^2+(.5417409-.4866513)^2] +
> 1/2*[(.5537783-.4866513)^2+(.4259508-.4866513)^2]

> which is approximately equal to .0076336. The square root of this, that is,
> the estimate of the standard error is: .0873703.

> Incidentally, this is what my homemade jackknife estimator arrives at.
> However, svy jackknife reaches a somewhat different conclusion: se =
> .1070064

> This is so despite the fact that the same estimated b(h,j)-coefficients go
> into both procedures (I have checked this by running jackknife noisily).
> There also appears to be nothing wrong with the weights: the "sum of wgt
> is..." yields identical results.

> So what could be wrong? What could explain the difference?
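
For reference, the hand calculation quoted above can be checked with a short
script. This is only a sketch in Python, not Stata code: the function name
`jackknife_mse_variance` is hypothetical, and it assumes the 1/2 factor in
Jan's formula is the usual delete-one stratified jackknife multiplier
(n_h - 1)/n_h with n_h = 2 PSUs per stratum. The coefficients are the ones
given in the post.

```python
import math

# Replicate coefficients b(h,j) from the post: key = stratum h,
# list position = omitted PSU j.
replicates = {
    1: [0.4230769, 0.5417409],
    2: [0.5537783, 0.4259508],
}
b_full = 0.4866513  # estimate from the entire sample


def jackknife_mse_variance(replicates, b_full):
    """MSE-form jackknife variance: deviations are taken from the
    full-sample estimate, not from the mean of the replicates.

    Assumption: the per-stratum multiplier is (n_h - 1) / n_h, which
    reduces to the 1/2 used in the post when each stratum has 2 PSUs.
    """
    total = 0.0
    for coefs in replicates.values():
        n_h = len(coefs)
        total += (n_h - 1) / n_h * sum((b - b_full) ** 2 for b in coefs)
    return total


variance = jackknife_mse_variance(replicates, b_full)
se = math.sqrt(variance)
print(f"variance = {variance:.7f}, se = {se:.7f}")
```

The script should agree with Jan's hand-computed values to the precision
shown in the post; it says nothing about how -svy jackknife- arrives at its
own figure.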

Although I can't be sure without seeing how Jan -svyset- the data, it seems
the culprit here is -svyset-ting -iweight-s.

As far as -svy- is concerned, -iweight-s and -pweight-s are the same except
that -iweight-s are allowed to be negative.  However, while trying to
reproduce Jan's results, we found that -svyset-ting -iweight-s and using the
-mse- option with -svy jackknife- can result in different variance/standard
error estimates than -svyset-ting -pweight-s.  This should be fixed in the
next ado-file update.

Until then Jan should be able to reproduce the correct results by -svyset-ting
-pweight-s and re-running -svy jackknife-.

--Jeff
jpitblado@stata.com


