Thanks, Stas Kolenikov. As per Stas Kolenikov's advice, I have added labels and summary statistics for the relevant vars.

Hi, I am using the following commands to set up DHS (Demographic and Health Survey) data for analysis:

gen psu = v021
gen strata = v022
gen sampwt = v005/1000000 //as per DHS instruction//

svyset psu [pw = sampwt], strata(strata)

where
v005   sample weight
v021   primary sampling unit
v022   sample stratum number

. sum v005 v021 v022

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        v005 |     11440     1000000    479282.7      55728    2707592
        v021 |     11440    223.3237    163.2414          1        550
        v022 |     11440    89.80385    51.64129          1        177

I have two questions:

1. Is this the right way to set up the data?
2. For the data set I am using, var v022 is missing for one year.
What other var(s) can I consider using instead of v022?

On Mon, Jul 20, 2009 at 9:52 AM, Stas Kolenikov<skolenik@gmail.com> wrote:
> Nikh, this is not terribly informative -- give the labels of the
> variables. (As the FAQ of the list says, don't assume that everybody
> knows your data and your literature as well as you do.) You may not
> like the idea of having weights like 10,000 if you are used to think
> about the weight variable as something close to 1, or maybe something
> close to 1/n. But if you want to estimate the total number of people
> in the country that don't have access to clean water, those 10,000
> weights are the right ones to use: the weight of 1 is going to give
> you the total number of people in the sample that don't have access to
> clean water, and you cannot put that sort of stuff into your country
> report. Check DHS documentation again on the survey settings.
>
> To my knowledge, stratification does not change in DHS from year to
> year, so you can keep strata ID from other years if you can match the
> clusters.
> If you have any new PSUs, it may not be possible to
> determine where they are coming from though; you could create a
> separate stratum for all of them. Finally, you can ignore
> stratification whatsoever, and lose some precision/efficiency with
> that.
>
> On Mon, Jul 20, 2009 at 10:21 AM, nikh 2000<nikh.2000@gmail.com> wrote:
>> Hi, I am using the following commands to set up DHS (Demographic and
>> Health Survey data) data for analysis
>>
>> gen psu = v021
>> gen strata = v022
>> gen sampwt = v005/1000000
>>
>> svyset psu [pw = sampwt], strata(strata)
>>
>> I have two questions:
>>
>> 1. Is this the right way to set up data ?
>> 2. For the data set I am using, for one year, var V022 is missing.
>> What other var(s) can I consider to use instead of V022
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/

