Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: Need help on variance estimation using replication methods while incorporating raking

From   Stas Kolenikov <>
Subject   Re: st: Need help on variance estimation using replication methods while incorporating raking
Date   Tue, 22 Nov 2011 19:22:53 -0600

On Tue, Nov 22, 2011 at 4:53 PM, Bilal Khan <> wrote:
> I have multistage complex survey data, which I raked using some auxiliary variables like previous votes in different elections (highly correlated with the output variables). Now I want to find 95 percent confidence intervals for my estimates, and I believe I can use the svy option in Stata to calculate them through Taylor series linearization. However, this would not incorporate raking into the design, which perhaps may lower the sampling error. I can trim weights as well, but I would lose precision of estimates.
> So I plan to use replication methods, which I believe can accommodate raking or post-stratification. I do have the commands to do so in Stata, but I am not sure how to create replicate weights before using replication methods for survey variance estimates. Can anyone suggest an easy way to create replicate weights, especially bootstrap ones, and incorporate them in variance estimation? Also, raking tends to decrease sampling variance. Would this be reflected in the variance estimation using replication methods like the bootstrap or jackknife?

As mentioned already by Steve, you would have to apply the same raking
procedure, with the same marginals and all, to each bootstrap
replicate. My -bsweights- procedure allows for that.
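
To make the "rake every replicate" idea concrete, here is a minimal,
language-agnostic sketch in Python (not Stata, and not the actual
-bsweights- code). It assumes a Rao-Wu style rescaled bootstrap that
draws n_h - 1 PSUs with replacement within each stratum, and a plain
iterative proportional fitting loop for the raking. The toy data, the
field names (stratum, psu, w, sex, vote, y), the control totals, and
all function names are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def rake(weights, rows, margins, iters=50, tol=1e-10):
    """Iterative proportional fitting: rescale the weights until the weighted
    total of every category matches its control total in `margins`."""
    w = list(weights)
    for _ in range(iters):
        max_change = 0.0
        for var, targets in margins.items():
            totals = defaultdict(float)
            for wi, row in zip(w, rows):
                totals[row[var]] += wi
            for i, row in enumerate(rows):
                t = totals[row[var]]
                if t > 0:  # a category absent from a replicate cannot be matched
                    factor = targets[row[var]] / t
                    max_change = max(max_change, abs(factor - 1.0))
                    w[i] *= factor
        if max_change < tol:
            break
    return w


def rescaled_bootstrap_weights(rows, rng):
    """One replicate: draw n_h - 1 PSUs with replacement in each stratum and
    rescale the base weights (assumes at least 2 PSUs per stratum)."""
    psus = defaultdict(set)
    for row in rows:
        psus[row["stratum"]].add(row["psu"])
    hits, scale = Counter(), {}
    for h, ids in psus.items():
        ids = sorted(ids)
        n_h = len(ids)
        hits.update((h, p) for p in rng.choices(ids, k=n_h - 1))
        scale[h] = n_h / (n_h - 1)
    return [r["w"] * scale[r["stratum"]] * hits[(r["stratum"], r["psu"])]
            for r in rows]


def wmean(w, rows, key="y"):
    """Weighted mean of rows[key]."""
    return sum(wi * r[key] for wi, r in zip(w, rows)) / sum(w)


def raked_bootstrap(rows, margins, reps=200, seed=12345):
    """Rake the full sample, then rake every bootstrap replicate to the SAME
    margins; the spread of the replicate estimates gives the standard error."""
    rng = random.Random(seed)
    full = wmean(rake([r["w"] for r in rows], rows, margins), rows)
    thetas = [wmean(rake(rescaled_bootstrap_weights(rows, rng), rows, margins), rows)
              for _ in range(reps)]
    var = sum((t - full) ** 2 for t in thetas) / reps
    return full, var ** 0.5


# Toy data: 2 strata x 4 PSUs x 5 units, equal base weights (illustrative only).
rows = []
for k in range(40):
    vote = "A" if k % 3 == 0 else "B"
    rows.append({
        "stratum": k // 20,
        "psu": k // 5,
        "w": 25.0,
        "sex": "f" if k % 2 == 0 else "m",
        "vote": vote,
        "y": int((vote == "A") != (k % 5 == 0)),
    })

# Hypothetical control totals; both margins sum to the same population size.
margins = {"sex": {"f": 520.0, "m": 480.0}, "vote": {"A": 550.0, "B": 450.0}}

full, se = raked_bootstrap(rows, margins)
print("raked estimate:", full, "bootstrap SE:", se)
```

The point of the design is that the raking step sits inside the
replication loop, so whatever variance reduction the raking buys shows
up in the spread of the replicate estimates.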

There's also the -calibest- command out there. I was not happy with it
and wrote my own version for my own calibration estimation work. (I
don't remember exactly what the issues were; it might have been that it
was not really returning its results the way an -eclass- command
should, so I wrote a bunch of tweaky -ereturn- and -estimates repost-
calls to achieve that.) This approach is equivalent to linearization
variance estimation for calibrated estimators, and it is much faster
than the bootstrap.

You might also be able to utilize the post-stratification features of
Stata if your marginals are categorical variables, so the sample can
be broken into a finite number of post-strata. See [SVY]
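
As a rough illustration of what post-stratification does to the
weights (again a Python sketch, not Stata output): within each
post-stratum, the weights are scaled so that they sum to the known
population total. In Stata itself the relevant -svyset- options are
poststrata() and postweight(). The function name, groups, and totals
below are invented for the example.

```python
from collections import defaultdict


def poststratify(weights, groups, totals):
    """Scale weights within each post-stratum so they sum to its known total."""
    sums = defaultdict(float)
    for w, g in zip(weights, groups):
        sums[g] += w
    return [w * totals[g] / sums[g] for w, g in zip(weights, groups)]


# Hypothetical sample: 2 units in post-stratum "A", 4 units in "B".
weights = [10.0] * 6
groups = ["A", "A", "B", "B", "B", "B"]
totals = {"A": 30.0, "B": 30.0}  # assumed known population totals

adjusted = poststratify(weights, groups, totals)
print(adjusted)
```

Raking to a single categorical margin reduces to exactly this one-step
adjustment.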

To Steve: my understanding is that calibration, such as raking, does
reduce variances for a well-designed survey with no non-response, when
all of the survey error is sampling error. Calibration then reduces
the sampling error of descriptive summaries of the variables
correlated with the calibration variables, but increases the sampling
error of, say, regression or logistic regression estimates, as it
increases the variability of the weights. Of course, if you have 70%
non-response, then to hell with the variances; let's just concentrate
on getting any useful information that would be approximately right to
begin with.
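
One common rough gauge of how much the variability of the weights can
cost is Kish's approximate design effect, 1 + CV^2 of the weights,
which equals n * sum(w^2) / (sum(w))^2. The Python sketch below
computes it; this is only a heuristic and it ignores any gain from
calibration variables that are correlated with the outcome. The weight
vectors are invented.

```python
def kish_deff(weights):
    """Kish's approximate design effect due to unequal weighting:
    n * sum(w^2) / (sum(w))^2, which equals 1 + CV^2 of the weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


base = [25.0] * 8                                              # equal weights
calibrated = [26.0, 24.0, 31.0, 19.0, 26.0, 24.0, 31.0, 19.0]  # same total, more spread

print(kish_deff(base), kish_deff(calibrated))
```

Equal weights give a value of 1; any spread in the weights pushes it
above 1.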

Stas Kolenikov, also found at
Small print: I use this email account for mailing lists only.
