Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: svyset with DHS subsample

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: st: svyset with DHS subsample
Date	Wed, 18 May 2011 10:29:57 -0400

--

Sara, I downloaded the 2007 Bangladesh data set and examined the d005 weight. It must indeed be normalized in the same way as the original weights (divide by 1,000,000). If you generate the normalized weights as double-precision variables, they sum exactly to the sample size for the DV module.
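
The steps discussed in this thread can be sketched as below. This is only a minimal illustration, assuming the DHS women's recode file is already in memory with the standard variables d005, v021 and v023; the name dvwt is made up for this example and does not come from the DHS documentation.

```stata
* Assumption: DHS women's recode data are loaded; dvwt is a hypothetical name.

* Normalize the domestic violence weight the same way as v005,
* storing it as a double to avoid float rounding.
gen double dvwt = d005/1000000

* Compare the sum of the normalized weights with the number of
* women who have a DV weight.
quietly summarize dvwt
display "sum of normalized DV weights: " %12.6f r(sum)
display "women with a DV weight:       " r(N)

* Single-stage declaration, with no further subsampling stage.
svyset v021 [pweight=dvwt], strata(v023)
```

Women who were not selected for the DV module have a missing d005, so dvwt is missing for them as well and they do not enter the sum.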


Steve
[email protected]


On May 18, 2011, at 9:44 AM, Sara Head wrote:

Steve, thanks very much for the response and added info from the RHJ
article -- all very helpful.

I considered the totw, as you mentioned, and saw that it was close but
still somewhat less than the sample size for the DV subsample. I think
I'll write to DHS just for final clarification regarding the DV weight
(there is limited info about the DV weight available in both the
survey report and recode map).

Thanks again - I really appreciate it,
Sara



On Tue, May 17, 2011 at 6:37 PM, Steven Samuels <[email protected]> wrote:
> Sara
> 
> I've never analyzed the DV data and I don't have the DHS manual that describes these variables. But on the face of it, the domestic violence weights must be normalized as well. That their mean is not exactly 1,000,000 is probably rounding error.
> 
> Generate the normalized weights as doubles and their sum (egen double totw = total(wt)) will add up to something you recognize, probably the sample size for the DV subsample.
> 
> And, no, your -svyset- should not include further subsampling and stratification stages.
> 
> FYI: other DV researchers (http://www.reproductive-health-journal.com/content/7/1/11) reported that:
> 
> "To permit enough variability within communities in responses to the IPV questions, this analysis only included clusters where 10 or more women were administered the domestic violence module. A small number of clusters (and thus individual-level observations from those clusters) were dropped because too few women were asked about IPV."
> 
> 
> Steve
> 
> Steven J. Samuels
> Consulting Statistician
> 18 Cantine's Island
> Saugerties, NY 12477 USA
> Voice: 845-246-0774
> Fax:   206-202-4783
> [email protected]
> 
> 
> 
> 
> 
> On May 17, 2011, at 1:47 PM, Sara Head wrote:
> 
> Hi there,
> 
> I am setting up Demographic and Health Survey data (from Bangladesh
> 2007, women's survey) for analysis in Stata 11.1.
> 
> The survey is based on a two-stage stratified sample of households.
> Additionally, households were preselected for domestic violence
> questions (outcome variables in my analysis; if there was more than
> one eligible female per household, a respondent was randomly selected
> through a simple selection procedure based on the Kish Grid).
> 
> I've written the svyset commands as:
> 
> gen psu = v021
> gen strata = v023
> gen sampwt = (v005/1000000) //per DHS instruction//
> gen dvsampwt = d005 //no DHS instruction to adjust//
> 
> svyset psu [pweight=dvsampwt], strata(strata)
> 
> where :
> summ psu strata sampwt dvsampwt
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>          psu |     10146    180.7909    104.1327          1        361
>       strata |     10146    10.66499     6.26573          1         22
>       sampwt |     10146    1.004513    .5912652     .13565   3.592687
>     dvsampwt |      4195    996578.1    764690.7     110423   1.08e+07
> 
> 
> 
> I am unsure if this code is correct.
> 1) Since this is a two-stage stratified sample with further selection
> for violence questions, it seems the svyset command should be more
> along the lines of : svyset su1 [pweight=pw], strata(strata) || _n,
> fpc(fpc2) ?
> 2) I used dvsampwt instead of the sampwt variable; I can't find
> information in the survey report / recode map about how the dv weight
> was calculated. I'd like to assume it took the larger sampling design
> into account.
> 
> Any thoughts greatly appreciated,
> Sara
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



-- 
Sara Head, MPH
PhD Candidate, Rollins School of Public Health
Emory University, Atlanta, Georgia
[email protected], 502-553-9159

