
From: nikh 2000 <nikh.2000@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: DHS sampling weight command
Date: Mon, 20 Jul 2009 16:49:51 -0600

Thanks Emma, this is very helpful.

Nikh

On Mon, Jul 20, 2009 at 2:00 PM, <Emma.Slaymaker@lshtm.ac.uk> wrote:
> Dear Nikh,
>
> I wouldn't rely on the information in v022 to determine the strata.
> To find the appropriate strata for a DHS you generally have to check
> the survey report to see how the strata were defined. Most surveys
> are stratified by province (v024) and urban/rural (v025) so a
> combination of those two variables is appropriate. Others are done
> (very) differently. There is sometimes additional information in the
> dataset documentation (in the zip file) and even extra variables in
> some datasets. The information given in v022 is often a relic of the
> old DHS data processing system and so not what Stata expects strata
> to be. It depends on how old the data are.
>
> Some surveys aren't stratified, in which case you can svyset without
> strata. If you are combining data from several years you can give all
> the observations from the unstratified survey the same number so they
> form one stratum.
>
> The weight variable (v005) should be divided by 1000000 before use
> because DHS supply it multiplied up to avoid precision problems with
> different software (you'll notice the label says something about 6
> decimals).
>
> Best wishes,
> Emma
>
>
>>>> nikh 2000 <nikh.2000@gmail.com> 20/07/09 17:20 >>>
> Thanks Stas Kolenikov.
> As per Stas Kolenikov's advice I have added labels and summary
> statistics of the relevant vars.
>
> Hi, I am using the following commands to set up DHS (Demographic and
> Health Survey) data for analysis
>
> gen psu = v021
> gen strata = v022
> gen sampwt = v005/1000000 //as per DHS instruction//
> svyset psu [pw = sampwt], strata(strata)
>
> Where,
> v005 sample weight
> v021 primary sampling unit
> v022 sample stratum number
>
> . sum v005 v021 v022
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         v005 |     11440     1000000    479282.7      55728    2707592
>         v021 |     11440    223.3237    163.2414          1        550
>         v022 |     11440    89.80385    51.64129          1        177
>
>
> I have two questions:
> 1. Is this the right way to set up data ?
> 2. For the data set I am using, for one year, var V022 is missing.
>    What other var(s) can I consider to use instead of V022
>
>
> On Mon, Jul 20, 2009 at 9:52 AM, Stas Kolenikov<skolenik@gmail.com> wrote:
>> Nikh, this is not terribly informative -- give the labels of the
>> variables. (As the FAQ of the list says, don't assume that everybody
>> knows your data and your literature as well as you do.) You may not
>> like the idea of having weights like 10,000 if you are used to thinking
>> about the weight variable as something close to 1, or maybe something
>> close to 1/n. But if you want to estimate the total number of people
>> in the country that don't have access to clean water, those 10,000
>> weights are the right ones to use: the weight of 1 is going to give
>> you the total number of people in the sample that don't have access to
>> clean water, and you cannot put that sort of stuff into your country
>> report. Check DHS documentation again on the survey settings.
>>
>> To my knowledge, stratification does not change in DHS from year to
>> year, so you can keep strata ID from other years if you can match the
>> clusters. If you have any new PSUs, it may not be possible to
>> determine where they are coming from though; you could create a
>> separate stratum for all of them. Finally, you can ignore
>> stratification altogether, and lose some precision/efficiency with
>> that.
>>
>> On Mon, Jul 20, 2009 at 10:21 AM, nikh 2000<nikh.2000@gmail.com> wrote:
>>> Hi, I am using the following commands to set up DHS (Demographic and
>>> Health Survey) data for analysis
>>>
>>> gen psu = v021
>>> gen strata = v022
>>> gen sampwt = v005/1000000
>>>
>>> svyset psu [pw = sampwt], strata(strata)
>>>
>>> I have two questions:
>>>
>>> 1. Is this the right way to set up data ?
>>> 2. For the data set I am using, for one year, var V022 is missing.
>>>    What other var(s) can I consider to use instead of V022
>>
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
>> Small print: I use this email account for mailing lists only.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
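
Editor's sketch (not part of the original thread): the setup Emma describes might look like the following. It assumes the survey report confirms that the sample was stratified by region (v024) crossed with urban/rural residence (v025), which has to be checked survey by survey. The names sampwt and strata follow Nikh's post; using egen's group() function to build the stratum identifier is one possible choice, not something prescribed in the thread.

```stata
* Sketch only: assumes a DHS recode file is already in memory and that
* the survey report confirms stratification by v024 x v025.

* Rescale the weight: DHS stores v005 multiplied by 1,000,000
generate double sampwt = v005/1000000

* One stratum per region-by-residence combination (assumed design)
egen strata = group(v024 v025)

* Declare the design: v021 is the primary sampling unit
svyset v021 [pweight = sampwt], strata(strata)

* Inspect the number of strata and of PSUs per stratum
svydescribe
```

For a survey that the report describes as unstratified, the same svyset line without the strata() option applies, as Emma notes.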

**References**:

- **st: DHS sampling weight command** *From:* nikh 2000 <nikh.2000@gmail.com>
- **Re: st: DHS sampling weight command** *From:* Stas Kolenikov <skolenik@gmail.com>
- **Re: st: DHS sampling weight command** *From:* nikh 2000 <nikh.2000@gmail.com>
- **Re: st: DHS sampling weight command** *From:* <Emma.Slaymaker@lshtm.ac.uk>

