


Re: st: svyset with DHS subsample

From   Steven Samuels
Subject   Re: st: svyset with DHS subsample
Date   Tue, 17 May 2011 18:37:41 -0400


I've never analyzed the DV data, and I don't have the DHS manual that describes these variables. But on the face of it, the domestic violence weights must be normalized as well. That their mean is not exactly 1,000,000 is probably rounding error.

If you generate the normalized weights as doubles, their sum (egen double totw = total(wt)) should add up to something you recognize, probably the sample size for the DV subsample.
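The check described above can be sketched as follows. This is only an illustration on a small made-up dataset standing in for the DHS file: d005 mimics the raw DV weight, the name dvwt is an assumption, and the divisor of 1,000,000 is carried over from the v005 instruction rather than confirmed for d005.

```stata
* Toy data standing in for the DHS women's file (values are made up).
* The missing value mimics a woman not selected for the DV module.
clear
input double d005
 500000
1500000
1000000
.
end

* Store the rescaled weight as a double to avoid float rounding.
* ASSUMPTION: d005 uses the same 1,000,000 scaling as v005.
gen double dvwt = d005/1000000

* Sum of the normalized weights; missing values are ignored.
egen double totw = total(dvwt)

* Number of women with a DV weight, for comparison with the sum.
count if !missing(dvwt)
display "sum of weights = " totw[1] "   n with DV weight = " r(N)
```

On the real file, comparing the displayed sum with the number of women who have a DV weight should show whether the weights were normalized to the DV subsample size.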

And, no, your -svyset- should not include further subsampling and stratification stages.
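A minimal sketch of such a single-stage declaration, again on made-up data. The variable names psu, strata and d005 follow the original post; dvwt is a hypothetical name for the rescaled DV weight.

```stata
* Toy data: 2 strata, 2 PSUs per stratum, 2 women per PSU (made up).
clear
input psu strata double d005
1 1  500000
1 1 1500000
2 1 1000000
2 1  800000
3 2 1200000
3 2  900000
4 2 1100000
4 2 1000000
end

* Rescaled DV weight, stored as a double (hypothetical name).
gen double dvwt = d005/1000000

* One stage only: PSUs within strata, with the DV weight as pweight.
* No "|| _n" stage and no fpc() for the within-household selection.
svyset psu [pweight=dvwt], strata(strata)

* Summarize the declared design (strata, PSUs, observations).
svydescribe
```

The within-household selection is not declared as a stage here; the sketch relies on the DV weight to carry that selection probability, which is an assumption about how d005 was constructed.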

FYI: other DV researchers reported that:

"To permit enough variability within communities in responses to the IPV questions, this analysis only included clusters where 10 or more women were administered the domestic violence module. A small number of clusters (and thus individual-level observations from those clusters) were dropped because too few women were asked about IPV."

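The cluster rule quoted above could be sketched like this on made-up data. It assumes that a non-missing d005 marks a woman who was administered the DV module, which the source does not confirm; n_dv and keepclus are hypothetical names.

```stata
* Toy data: cluster 1 has 12 women, cluster 2 has 13 (made up).
clear
set obs 25
gen psu = cond(_n <= 12, 1, 2)
gen double d005 = 1000000

* Cluster 1: 10 of 12 women have a DV weight.
replace d005 = . in 11/12
* Cluster 2: 8 of 13 women have a DV weight.
replace d005 = . if psu == 2 & _n > 20

* Count women with a DV weight in each cluster.
* ASSUMPTION: non-missing d005 means the DV module was administered.
bysort psu: egen n_dv = count(d005)

* Flag clusters that meet the 10-or-more rule.
gen byte keepclus = n_dv >= 10
tabulate psu keepclus
```

The sketch flags clusters rather than dropping them, so the flag can be used either to drop observations as the quoted authors did or as the variable in -svy, subpop()-.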

Steven J. Samuels
Consulting Statistician
18 Cantine's Island
Saugerties, NY 12477 USA
Voice: 845-246-0774
Fax:   206-202-4783

On May 17, 2011, at 1:47 PM, Sara Head wrote:

Hi there,

I am setting up Demographic and Health Survey data (from Bangladesh
2007, women's survey) for analysis in Stata 11.1.

The survey is based on a two-stage stratified sample of households.
Additionally, households were preselected for domestic violence
questions (outcome variables in my analysis; if there was more than
one eligible female per household, a respondent was randomly selected
through a simple selection procedure based on the Kish Grid).

I've written the svyset commands as:

gen psu = v021
gen strata = v023
gen sampwt = (v005/1000000) //per DHS instruction//
gen dvsampwt = d005 //no DHS instruction to adjust//

svyset psu [pweight=dvsampwt], strata(strata)

where:
summ psu strata sampwt dvsampwt

    Variable |       Obs        Mean    Std. Dev.       Min        Max
         psu |     10146    180.7909    104.1327          1        361
      strata |     10146    10.66499     6.26573          1         22
      sampwt |     10146    1.004513    .5912652     .13565   3.592687
    dvsampwt |      4195    996578.1    764690.7     110423   1.08e+07

I am unsure if this code is correct.
1) Since this is a two-stage stratified sample with further selection
for violence questions, it seems the svyset command should be more
along the lines of : svyset su1 [pweight=pw], strata(strata) || _n,
fpc(fpc2) ?
2) I used dvsampwt instead of the sampwt variable; I can't find
information in the survey report / recode map about how the dv weight
was calculated. I'd like to assume it took the larger sampling design
into account.

Any thoughts greatly appreciated,
