Thanks Jeff, this was helpful. The results I reported were estimated with no pweight (or iweight) defined in svyset. When I svyset using the appropriate pweight, svy jackknife with the mse option yields exactly the same result as my own estimator. And if I svyset with a new pweight variable defined as 1 for all cases, I replicate the correct result from the example in my earlier email (se=.08737).
So, as you say, there seems to be some problem with the mse estimator when no pweight is defined that needs to be fixed.
While fixing this problem, couldn't you also implement a svy jackknife estimator that works with the correlation command? I know the obstacle right now is that corr doesn't allow pweights (or iweights), but couldn't that be fixed in the first place? It is very annoying that you can't weight survey data appropriately when computing correlation coefficients without resorting to the cumbersome fweights. I guess no well-defined linearized variance estimator exists for the correlation coefficient (at least I have never seen one), so this is an area where jackknife (or, if the design allows for it, BRR) would really be a necessary tool!
All the best,
Jan
________________________________
From: owner-statalist@hsphsun2.harvard.edu on behalf of Jeff Pitblado, StataCorp LP
Sent: Mon 2006-03-13 22:34
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: svy jackknife problems
Jan Teorell <jan.teorell@pol.gu.se> is having trouble reproducing hand-coded
jackknife variance estimates using -svy jackknife-:
> I'm having trouble with the svy jackknife command. I had earlier implemented
> a crude jackknife estimator myself, tailored for my particular complex
> survey design including both stratification and multistage cluster sampling.
> With Stata 9 I presumably should not need to use this homemade program anymore,
> since svy jackknife should do the job for me. However, the results from my
> estimator and svy jackknife differ for reasons I am not quite clear about.
> To take this down to a more concrete level, I tested the two commands on a
> small subsample of my survey, using only 2 strata with 2 PSUs each. I then
> ran a simple regression, with the following estimates per replication (where
> b(h,j)=the regression coefficient obtained when excluding PSU j from stratum
> h):
> b(1,1)=.4230769
> b(1,2)=.5417409
> b(2,1)=.5537783
> b(2,2)=.4259508
> The estimate from the entire sample is: b=.4866513
> Plugging these estimates into the formula for the mse estimator (Survey
> Data Manual, p. 266) yields:
> 1/2*[(.4230769-.4866513)^2+(.5417409-.4866513)^2] +
> 1/2*[(.5537783-.4866513)^2+(.4259508-.4866513)^2]
> which is approximately equal to .0076336. The square root of this, that is,
> the estimate of the standard error, is: .0873703.
> Incidentally, this is what my homemade jackknife estimator arrives at.
> However, svy jackknife reaches a somewhat different conclusion: se =
> .1070064
> This is so despite the fact that the same estimated b(h,j)-coefficients go
> into both procedures (I have checked this by running jackknife noisily).
> There also appears to be nothing wrong with the weights: the "sum of wgt
> is..." yields exactly the same result in both.
> So what could be wrong? What could explain the difference?
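
The hand calculation in the quoted message can be checked with a short script. This is only a sketch in Python, not Stata code: the function and variable names are hypothetical, and the per-stratum multiplier is assumed to be (n_h - 1)/n_h, which matches the 1/2 used above for strata with 2 PSUs each. The coefficients are copied from the message.

```python
import math

# Full-sample coefficient and delete-one-PSU replicates, copied from the
# quoted message. Keys are strata; values are b(h,1), b(h,2).
b_full = 0.4866513
replicates = {
    1: [0.4230769, 0.5417409],
    2: [0.5537783, 0.4259508],
}


def jackknife_mse_variance(b_full, replicates):
    """Sum over strata of multiplier * squared deviations from b_full.

    Deviations are taken from the full-sample estimate (the "mse" form),
    not from the mean of the replicates.
    """
    total = 0.0
    for b_h in replicates.values():
        n_h = len(b_h)
        multiplier = (n_h - 1) / n_h  # assumption: equals 1/2 with 2 PSUs
        total += multiplier * sum((b - b_full) ** 2 for b in b_h)
    return total


variance = jackknife_mse_variance(b_full, replicates)
se = math.sqrt(variance)
print(f"variance = {variance:.7f}, se = {se:.7f}")
```

Run as written, this should agree with the variance (.0076336) and standard error (.0873703) reported in the message to within rounding.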
Although I can't be sure without seeing how Jan -svyset- the data, it seems
the culprit here is -svyset- -iweight-s.
As far as -svy- is concerned, -iweight-s and -pweight-s are the same except
that -iweight-s are allowed to be negative. However, looking into
reproducing Jan's results, we found that -svyset-ting -iweight-s and using the
-mse- option with -svy jackknife- can result in different variance/standard
error estimates than -svyset-ting -pweight-s. This should be fixed in the
next ado-file update.
Until then Jan should be able to reproduce the correct results by -svyset-ting
-pweight-s and re-running -svy jackknife-.
--Jeff
jpitblado@stata.com