Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: svyset with DHS subsample
From: Sara Head <[email protected]>
To: [email protected]
Subject: Re: st: svyset with DHS subsample
Date: Wed, 18 May 2011 12:49:51 -0400
Steve, thanks very much for your follow-up (I definitely would not
have been able to figure this out on my own) -- I really appreciate
it.
Sara
On Wed, May 18, 2011 at 10:29 AM, Steven Samuels <[email protected]> wrote:
>
> Sara, I downloaded the 2007 Bangladesh data set and examined the d005 weight. Indeed, it must be normalized just as the original weights were (divide by 1,000,000). If you store the results as double-precision variables, the normalized weights sum exactly to the sample size for the DV module.
>
>
> Steve
> [email protected]
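
A minimal sketch of the setup this reply points to, assuming the Bangladesh 2007 women's recode file is already in memory. The names psu, strata, and dvsampwt come from Sara's original post further down; storing the weight as a double follows Steve's advice. It is an illustration only, not code posted in the thread.

```stata
* Assumption: the Bangladesh 2007 DHS women's recode file is in memory.
* psu, strata, and dvsampwt are the names used in the original post.
generate psu    = v021
generate strata = v023

* Normalize the DV weight the same way as v005, stored in double precision.
generate double dvsampwt = d005/1000000

* Single-stage declaration: PSUs within strata, with the DV weight.
svyset psu [pweight=dvsampwt], strata(strata)
```

Typing svyset with no arguments afterwards redisplays the current settings, which is a quick way to confirm the declaration took effect.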
>
>
> On May 18, 2011, at 9:44 AM, Sara Head wrote:
>
> Steve, thanks very much for the response and added info from the RHJ
> article -- all very helpful.
>
> I considered the totw, as you mentioned, and saw that it was close but
> still somewhat less than the sample size for the DV subsample. I think
> I'll write to DHS just for final clarification regarding the DV weight
> (there is limited info about the DV weight available in both the
> survey report and recode map).
>
> Thanks again - I really appreciate it,
> Sara
>
>
>
> On Tue, May 17, 2011 at 6:37 PM, Steven Samuels <[email protected]> wrote:
> > Sara
> >
> > I've never analyzed the DV data and I don't have the DHS manual that describes these variables. But on the face of it, the domestic violence weights must be normalized as well. That they don't sum to 1,000,000 is probably rounding error.
> >
> > Generate the normalized weights as doubles and their sum (egen double totw = total(wt)) will add up to something you recognize, probably the sample size for the DV subsample.
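
The check described above could be sketched as follows. It assumes the women's recode file is in memory and that d005 is missing for women who were not given the DV module (consistent with the 4195 observations reported for dvsampwt in the original post). The names wt and totw follow Steve's example.

```stata
* Assumption: DHS women's recode file in memory; d005 is missing for
* women outside the DV module.
generate double wt = d005/1000000

* egen's total() treats missing values as zero, so totw is the sum of
* the normalized weights over DV respondents only.
egen double totw = total(wt)

* Compare the weight total with the number of DV respondents.
count if !missing(d005)
display %12.4f totw[1] "  vs  " r(N)
```

If the two numbers agree, the weights are normalized to the DV subsample size, as Steve later confirmed for this data set.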
> >
> > And, no, your -svyset- should not include further subsampling and stratification stages.
> >
> > FYI: other DV researchers (http://www.reproductive-health-journal.com/content/7/1/11) reported that:
> >
> > "To permit enough variability within communities in responses to the IPV questions, this analysis only included clusters where 10 or more women were administered the domestic violence module. A small number of clusters (and thus individual-level observations from those clusters) were dropped because too few women were asked about IPV."
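
For readers who want to see how many clusters would meet the 10-women threshold quoted above, here is a hedged sketch. It assumes v021 identifies the cluster and that a nonmissing d005 marks a woman who received the DV module. n_dv and atleast10 are hypothetical names, and this is not the cited authors' code.

```stata
* Assumption: DHS women's recode file in memory; v021 is the cluster ID
* and d005 is nonmissing only for women given the DV module.
* n_dv and atleast10 are hypothetical variable names.
bysort v021: egen n_dv = total(!missing(d005))

* Flag clusters with 10 or more DV respondents.
generate byte atleast10 = n_dv >= 10

* How many DV respondents fall in flagged vs. unflagged clusters?
tabulate atleast10 if !missing(d005)
```

The sketch only flags clusters; whether and how to restrict the analysis is a separate decision.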
> >
> >
> > Steve
> >
> > Steven J. Samuels
> > Consulting Statistician
> > 18 Cantine's Island
> > Saugerties, NY 12477 USA
> > Voice: 845-246-0774
> > Fax: 206-202-4783
> > [email protected]
> >
> >
> >
> >
> >
> > On May 17, 2011, at 1:47 PM, Sara Head wrote:
> >
> > Hi there,
> >
> > I am setting up Demographic and Health Survey data (from Bangladesh
> > 2007, women's survey) for analysis in Stata 11.1.
> >
> > The survey is based on a two-stage stratified sample of households.
> > Additionally, households were preselected for domestic violence
> > questions (outcome variables in my analysis; if there was more than
> > one eligible female per household, a respondent was randomly selected
> > through a simple selection procedure based on the Kish Grid).
> >
> > I've written the svyset commands as:
> >
> > gen psu = v021
> > gen strata = v023
> > gen sampwt = (v005/1000000) //per DHS instruction//
> > gen dvsampwt = d005 //no DHS instruction to adjust//
> >
> > svyset psu [pweight=dvsampwt], strata(strata)
> >
> > where :
> > summ psu strata sampwt dvsampwt
> >
> >     Variable |       Obs        Mean    Std. Dev.       Min        Max
> > -------------+--------------------------------------------------------
> >          psu |     10146    180.7909     104.1327         1        361
> >       strata |     10146    10.66499      6.26573         1         22
> >       sampwt |     10146    1.004513     .5912652    .13565   3.592687
> >     dvsampwt |      4195    996578.1     764690.7    110423   1.08e+07
> >
> >
> >
> > I am unsure if this code is correct.
> > 1) Since this is a two-stage stratified sample with further selection
> > for violence questions, it seems the svyset command should be more
> > along the lines of : svyset su1 [pweight=pw], strata(strata) || _n,
> > fpc(fpc2) ?
> > 2) I used dvsampwt instead of the sampwt variable; I can't find
> > information in the survey report / recode map about how the dv weight
> > was calculated. I'd like to assume it took the larger sampling design
> > into account.
> >
> > Any thoughts greatly appreciated,
> > Sara
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/