



Re: st: JK replicate weights w/o pweight


From   Stas Kolenikov <skolenik@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: JK replicate weights w/o pweight
Date   Sun, 30 Oct 2011 23:02:38 -0500

It is approximately kosher.

The jackknife weights are generally computed as follows:
(i) original weight, most of the time;
(ii) zero, in one and only one replicate;
(iii) somewhat larger than the original weight, in a few other replicates.
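To see these three cases at work, here is a minimal sketch on a made-up design: 5 strata with 2 PSUs each, one observation per PSU, and delete-one-PSU replicates. The names w0 and rw1-rw10 and the weight values are invented for illustration and do not come from the agency file; the factor of 2 is what the usual n_h/(n_h-1) adjustment gives with 2 PSUs per stratum.

```stata
* Toy design (all values hypothetical): 5 strata x 2 PSUs, one obs per PSU
clear
set obs 10
generate byte psu     = _n
generate byte stratum = ceil(_n/2)
generate double w0    = 100 + 10*_n      // made-up original weight

* Replicate r drops PSU r and inflates the other PSU in the same stratum
forvalues r = 1/10 {
    local h = ceil(`r'/2)
    generate double rw`r' = w0                            // case (i)
    replace rw`r' = 0    if psu == `r'                    // case (ii)
    replace rw`r' = 2*w0 if stratum == `h' & psu != `r'   // case (iii)
}

* Each row now has eight copies of w0, one zero, and one doubled weight
egen double rmean = rowmean(rw*)
egen double rmed  = rowmedian(rw*)
assert reldif(rmean, w0) < 1e-12
assert reldif(rmed,  w0) < 1e-12
list psu w0 rw1 rw2 rw3 rmean rmed in 1/4
```

In each row the single zero and the single doubled weight cancel in the sum, which is why the row mean comes back to w0 in this toy design.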

The row average of the replicate weights recovers the original weight,
given the arithmetic of how the weights are calculated: the inflated
weights in (iii) make up for the zero in (ii). You can also recover
the original weights as the row median:

* recover the original weight two ways and check that they agree
egen jkwgt1 = rowmedian( fwt* )
egen jkwgt2 = rowmean( fwt* )
assert reldif( jkwgt1, jkwgt2) < 10*c(epsfloat)
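Putting this to use for the question at hand, here is a minimal sketch of the full workflow. It runs on a toy stand-in for the agency file (10 replicate weights rather than 100, a made-up binary x1, and a made-up design), and the name jkwgt is my own. jkrweight() also takes suboptions such as multiplier(); whether one is needed depends on how the agency built the replicates, which I do not know, so check their documentation.

```stata
* Toy stand-in for the agency file (hypothetical): 5 strata x 2 PSUs,
* 4 respondents per PSU, replicate weights fwt1-fwt10, no overall weight
clear
set seed 12345
set obs 40
generate byte psu     = ceil(_n/4)
generate byte stratum = ceil(psu/2)
generate double w0    = 50 + 5*psu
generate byte x1      = runiform() < 0.4
forvalues r = 1/10 {
    generate double fwt`r' = w0
    replace fwt`r' = 0    if psu == `r'
    replace fwt`r' = 2*w0 if stratum == ceil(`r'/2) & psu != `r'
}
drop w0 psu stratum                  // mimic the redacted public file

* Recover the overall weight, then declare it together with the replicates
egen double jkwgt = rowmedian(fwt*)
svyset [pweight = jkwgt], jkrweight(fwt*) vce(jackknife)
svy: tabulate x1
```

With the pweight declared, svy: tabulate reports weighted proportions rather than the unweighted frequencies.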

If you have Fay-corrected jackknife weights, you won't see zeroes, and
taking the mean may or may not be right, but the median computation
will certainly be.

I am somewhat concerned, though, that whoever created the data set
referred to the weights as fweights. Outside of the PROC SURVEY*
procedures, SAS/STAT does not care and always treats weights as
frequency weights, which provides a constant source of entertainment
in the form of chi-square tests of the order 1e+8. Stata users are
trained early to know that frequency weights and probability weights
are totally different animals (http://stata.com/help.cgi?weights).
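A quick way to see the difference, as a sketch on Stata's shipped auto data rather than on the agency file: treat rep78 (an arbitrary integer variable, used here as a stand-in weight purely for illustration) first as a frequency weight and then as a probability weight. The point estimates agree, while the reported number of observations and, in general, the standard errors do not.

```stata
sysuse auto, clear

* rep78 is only a stand-in weight; observations with missing rep78 are dropped
mean price [fweight = rep78]
scalar b_f  = _b[price]
scalar se_f = _se[price]
scalar n_f  = e(N)

mean price [pweight = rep78]
scalar b_p  = _b[price]
scalar se_p = _se[price]
scalar n_p  = e(N)

display "fweight: mean = " b_f "  se = " se_f "  N = " n_f
display "pweight: mean = " b_p "  se = " se_p "  N = " n_p
```

With fweights each row is counted as that many identical observations, which is where inflated sample sizes and enormous test statistics come from.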

On Sun, Oct 30, 2011 at 1:29 PM, Ryan Edwards <REdwards@gc.cuny.edu> wrote:
> I've got a public dataset issued by a government agency that contains 100 jackknife replicate weights (fwt1-fwt100) but not an overall probability weight, presumably redacted for privacy concerns.  SAS code from the agency seems to indicate that SAS doesn't care if the probability weight is missing:
>
> proc surveyfreq data=pub;
>  repweight fwt1--fwt100;
>  tables x1 x2 / alpha=0.1;
> run;
>
> But in Stata, while svyset will allow me to skip the probability weight, it will not produce properly-weighted survey statistics:
>
> . svyset , jkrweight(fwt*)
> . svy: tab x1
>
> The output is just the unweighted frequencies.
>
> Any ideas? In the Stata 12 manuals, I don't see any examples that match mine, i.e. without an overall probability weight.  I have tested a completely ad hoc workaround, generating the average of all the 100 replicate weights and using that as the overall probability weight, which seems to produce a match to the SAS output.  But I have little idea of whether that's kosher and would guess it's not.  I'd love not to have to learn some SAS;  I'm a Mac user and would have to horse around with a different platform!
>
> Ryan Edwards
> Associate Professor of Economics
> Queens College and the Graduate Center
> City University of New York
> redwards@qc.cuny.edu
> http://qcpages.qc.cuny.edu/~redwards/
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

