The Stata listserver

Re: st: Re: svyset for stratified probability proportional to size design

From   Nick Winter
Subject   Re: st: Re: svyset for stratified probability proportional to size design
Date   Thu, 11 Aug 2005 13:05:07 -0400

You only -svyset- one weight variable--which corresponds to the ultimate probability of selection of each observation. With multiple stages of sampling, these weights will be a function of the probabilities of selection at each stage, possibly with additional corrections for non-response, etc. So the sample might be self-weighting at the first stage, but perhaps not at lower stages, resulting in non-constant weights for the ultimate observations.
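
As a rough illustration of how such a weight could be built for the design in the question below, here is a minimal sketch. All variable names other than weight are hypothetical and do not come from the original post, and the block and household counts are assumed to be available. It takes the first-stage "probability" to be the expected number of selections under PPS with replacement, and ignores any non-response adjustment.

```stata
* Hypothetical variables (assumptions, not from the original post):
*   n_psu      number of PSU draws made in the stratum
*   pop_psu    population of the sampled PSU (census tract)
*   pop_strat  population of the stratum
*   nblocks    number of blocks in the PSU
*   nhh        number of households in the block

* Stage 1: PPS with replacement, expected number of selections
generate double p1 = n_psu * pop_psu / pop_strat
* Stage 2: 3 blocks drawn at random within each PSU
generate double p2 = 3 / nblocks
* Stage 3: 2 households drawn at random within each block
generate double p3 = 2 / nhh

* Overall weight is the inverse of the product of the stage probabilities
generate double weight = 1 / (p1 * p2 * p3)
```

Because p2 and p3 vary with the number of blocks and households, the resulting weight will generally not be constant even though the first stage is PPS.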

So you would do something like:

. svyset psu1 [pw=weight] , str(strat1) || psu2 , fpc(fpc2) || _n , fpc(fpc3)

Note, however, that if the first stage is actually sampled with replacement, then the lower-level without-replacement sampling is ignored in the variance calculation. This will be conservative--your reported standard errors will be somewhat larger than they would otherwise be.
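
For a with-replacement first stage like the one in the question below, the note above suggests that a first-stage-only declaration should give the same standard errors. A minimal sketch, reusing the variable names from the command above:

```stata
* First stage sampled with replacement: no fpc() at stage 1,
* so only the first-stage design information enters the variance.
svyset psu1 [pw=weight], strata(strat1)

* Display the survey settings Stata has recorded
svyset
```

The later stages still matter for constructing the weight, but they would not need to be declared for variance estimation in this case.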

--Nick Winter

At 02:37 AM 8/11/2005, you wrote:

We run on Stata 9, updated all July 5.

We have a complex survey design data set with the first stage stratified
into two parts.  The primary sampling units (census tracts) have been drawn
with replacement from each of the two strata with probability proportional
to population size (PPS) in each of the strata.

We have the total population, the population of each stratum, and the
population of each PSU.  There are two further stages in the sampling
design, random sampling of 3 blocks within each PSU without replacement and
random sampling of 2 households within each block without replacement.

Our difficulty lies in specifying svyset for this design,
particularly selecting (or not) the pw weights for the first level.  On the
one hand, we read from texts that the glory of probability proportional to
population sample is that it doesn't need weights.  On the other hand, we
see from svyset examples at the UCLA site, that considerable effort has
gone into calculating the values for the first stage weights with PPS
design,  though without access to the text, we are unclear how to apply their

Could anyone suggest how to at least set up svyset for the first stage of the
above described design?

 Many thanks,
Steve Rothenberg
Instituto Nacional de Salud Pública
Cuernavaca, México

Nicholas J. G. Winter 607.255.8819 t
Assistant Professor 607.255.4530 f
Department of Government
Cornell University
308 White Hall
Ithaca, NY 14853-4601
