Re: st: Pooling DHS surveys: svyset command?

From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Pooling DHS surveys: svyset command?
Date   Tue, 24 Jul 2012 10:38:26 -0400


If the information you provide is correct, your -svyset- statement is OK. However, your description of the 2000 sample appears to be inaccurate, as 11 districts were designated for oversampling. These do not appear to be "regions", but rather sub-strata of the regions to which they belong. So I doubt that "v024" designates regions in 2000, but I leave it to you to check.
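
One way to make that check, sketched under the assumption that the 2000 recode file is in memory, is to look at the variable label, value labels, and category counts of v024. If it lists more categories than the regions you expect, it is probably not a pure region variable.

```stata
* Assumes the 2000 DHS recode file is already loaded.
* Show the variable label and the name of any attached value label.
describe v024

* Count observations in each category of v024, including missing values.
tabulate v024, missing
```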

[email protected]

On Jul 24, 2012, at 9:13 AM, <[email protected]> <[email protected]> wrote:

Dear Stata list serve members,

I am pooling three years of Malawi DHS data (2000, 2004 and 2010). How
would you recommend that I take into account the survey design given psu
and strata variables and adjustment weights? 

The sampling designs differ across surveys and seem to be as follows:
- DHS 2010: stratified by district then urban/rural. Clusters were
selected using probability-proportionate to size (PPS) (frame was the
2008 census enumeration units) 
- DHS 2004: stratified by region then urban/rural. Clusters were then
selected using PPS (frame was the 1998 census enumeration units)
- DHS 2000: stratified by region then urban/rural. Clusters were
selected through systematic sampling (frame was the 1998 census enumeration
units)

Instructions on the DHS website say to do the following:
- to generate weight: generate weight = v005/1000000 
- to make unique strata values, depending on the sampling design (in this
case already done for 2010; for 2004 and 2000, v024 and v025 represent the
region and urban/rural variables): egen strata = group(v024 v025), label
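
As a rough sketch, the two instructions above could be applied to a single survey file as follows. The file name is hypothetical, and the example assumes a survey (such as 2004) in which v024 and v025 really do define the design strata.

```stata
* Hypothetical file name; substitute the actual 2004 recode file.
use "malawi_dhs_2004.dta", clear

* v005 stores the sample weight with six implied decimal places.
generate double weight = v005/1000000

* One stratum per combination of v024 (region) and v025 (urban/rural).
egen strata = group(v024 v025), label

* Inspect the resulting strata.
tabulate strata
```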

The DHS website, however, does not appear to indicate how to take the
survey design into account when surveys are pooled.

On the Stata ListServe, the following recommendations are provided on a
related, but non-DHS, question:

If your surveys were stratified, to begin with, then it would become: 
svyset psuXyear [pw=weight in each wave], strata(waveXoriginal_strata)

where -X- stands for interaction along the lines of:
egen psuXyear = group(psu year)

The svyset command I use is as follows:
svyset psuXyear [pw=weight], strata(strataXyear) singleunit(scaled)
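
For completeness, here is a sketch of how the pieces might fit together on the pooled file. It assumes the appended data already contain a survey-year variable named year, a within-survey cluster identifier named psu, and the per-survey strata and weight variables; those names are assumptions, not DHS variable names.

```stata
* Assumes a pooled file with -year- (2000, 2004, 2010), a within-survey
* cluster identifier -psu-, and per-survey -strata- and -weight-.

* Make PSU and stratum identifiers unique across surveys.
egen psuXyear    = group(psu year)
egen strataXyear = group(strata year)

* Declare the pooled design.
svyset psuXyear [pw=weight], strata(strataXyear) singleunit(scaled)

* List any strata that contain only one sampling unit.
svydescribe, single
```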

For pooling DHS surveys, does this svyset command look appropriate?

Best regards,
