The Stata listserver

st: Re: Bootstrap and pweights

From   "R.E. De Hoyos" <>
To   <>
Subject   st: Re: Bootstrap and pweights
Date   Tue, 25 Oct 2005 12:48:27 +0100


A good reference on the subject is:

Deaton, A. (1997) "The analysis of household surveys: a microeconometric approach to development policy", World Bank

It uses Stata (version 4.0) in its applications and the do-files are available online:

As I understand it, the difficulty with bootstrapping weighted data (pweights) lies in preserving the total of the population from which the data come. With survey data, each observation's -pweight- is the number of population units it represents, so summing the -pweights- over the sample gives the total population. The resampled data generated by -bootstrap- or -bsample- generally violate this property.
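
To make the point concrete, here is a minimal language-neutral sketch (in Python rather than Stata, purely to show the arithmetic). The strata labels, weights and sample sizes are made up for illustration, and `naive_resample` is a hypothetical helper, not a Stata command. It draws whole-sample resamples with replacement and records the weight total of each one.

```python
import random

# Hypothetical toy survey: each row is (stratum, pweight).
# Three observations each represent 10 population units,
# two observations each represent 50 population units.
sample = [("A", 10.0)] * 3 + [("B", 50.0)] * 2


def weight_total(rows):
    """Sum of the pweights, i.e. the population size the rows represent."""
    return sum(weight for _, weight in rows)


def naive_resample(rows, rng):
    """Draw len(rows) rows with replacement, ignoring the survey design."""
    return [rng.choice(rows) for _ in rows]


rng = random.Random(12345)  # fixed seed so the run is reproducible

population_total = weight_total(sample)  # 3*10 + 2*50 = 130
bootstrap_totals = [
    weight_total(naive_resample(sample, rng)) for _ in range(200)
]

print("original weight total:", population_total)
print("distinct resample totals:", sorted(set(bootstrap_totals)))
```

Because the number of draws from each weight group varies from one resample to the next, the weight totals scatter around the original total instead of reproducing it.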

One suggestion is to expand the data using the -pweights- and then carry out the resampling; however, if the survey design is complex (which is very likely), you won't be able to reproduce the population with your bootstrapped sample. A second possible solution is to resample within each value of -pweight- (under a complex survey design these usually correspond to the PSU, the strata, or a combination of both). In this case you will be able to reproduce the total population.
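
The second approach can be sketched as follows, again in Python only to show the arithmetic and not as a description of what any Stata command does internally. The sketch assumes that the -pweight- is constant within each stratum, which is an assumption of this toy example; the data and the helper `stratified_resample` are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy survey: each row is (stratum, pweight).
# Assumption: the pweight is constant within each stratum.
sample = [("A", 10.0)] * 3 + [("B", 50.0)] * 2


def weight_total(rows):
    """Sum of the pweights, i.e. the population size the rows represent."""
    return sum(weight for _, weight in rows)


def stratified_resample(rows, rng):
    """Resample with replacement separately within each stratum,
    keeping each stratum's sample size fixed."""
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[0]].append(row)
    resampled = []
    for stratum_rows in by_stratum.values():
        resampled.extend(rng.choice(stratum_rows) for _ in stratum_rows)
    return resampled


rng = random.Random(12345)  # fixed seed so the run is reproducible

population_total = weight_total(sample)  # 3*10 + 2*50 = 130
stratified_totals = [
    weight_total(stratified_resample(sample, rng)) for _ in range(200)
]

print("original weight total:", population_total)
print("distinct resample totals:", sorted(set(stratified_totals)))
```

Since every stratum keeps its sample size and every row in a stratum carries the same weight, each resample has the same weight total as the original data. If the weights vary within a stratum, this exact equality no longer holds.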

See -svybsamp- and -svybsamp2-.

I hope this helps,


----- Original Message -----
From: "Jann Ben" <>
To: <>
Sent: Tuesday, October 25, 2005 11:36 AM
Subject: st: Bootstrap and pweights

Hi, the issue of using bootstrap techniques with weighted
data has been discussed several times on Statalist. For
example, see

The basic message is that bootstrap cannot be used with
probability weights (pweights) in Stata. This is bad news
and I think it would be very valuable to change that.

Do I understand correctly that the only problem with using
pweights is that weighted sampling is not implemented in
Stata's -bsample- command? In other words: is it true that
the bootstrap technique produces accurate results for
weighted data if the weights are accounted for via a
weighted sampling design?

Furthermore: Is there a fundamental difficulty with
weighted sampling? (I.e. why is weighted sampling not
implemented in -bsample-?) How could weighted sampling be
implemented in -bsample-? (References?)

Any help will be greatly appreciated!


PS: Am I the only one who feels bothered by the no-pweights
restriction of the bootstrap command? Is it true that
-svy jackknife- could be an alternative?
