
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Analysis on subset of Demographic and Health Survey (DHS) data


From   Friedrich Huebler <fhuebler@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Analysis on subset of Demographic and Health Survey (DHS) data
Date   Sun, 27 Oct 2013 23:09:33 -0400

French,

All DHS datasets should contain the same information, independent of
the file format (flat, hierarchical, etc.). Quote from the MEASURE DHS
website: "Most researchers use the flat file designed for the
statistical software that they intend to use for analysis."

https://www.measuredhs.com/data/File-Formats.cfm

Friedrich
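
One quick way to look at the 8,023 versus 8,081 question is to compare the unweighted record count with the weighted count in the same file. The lines below are only a sketch: the thread does not say which of these figures the report tabulates, and treating b5 as the standard DHS "child is alive" flag is an assumption that should be checked against the recode documentation.

```stata
* Sketch only; the file path is the one used in the thread.
* In Stata 11 or older, run -set mem 450m- first, as in the original do-file.
use TZKR62DT/TZKR62FL.DTA, clear

* v005 is the sample weight, stored with six implied decimals
gen double weight = v005/1000000

* Unweighted number of child records in the file
count

* Weighted number of cases (sum of the rescaled weights)
quietly summarize weight
display "Weighted N = " r(sum)

* Assumption: b5 == 1 marks children who were alive at the interview
count if b5 == 1
```

If none of these counts matches the report, the table on page 14 may be built from a different recode file or a different population definition.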

On Sun, Oct 27, 2013 at 2:00 PM, french Smith <french.c.smith@gmail.com> wrote:
> Dear Friedrich,
>
> Apologies for not having noted my Stata version, and for other
> omissions of important info.
>
> My Stata output is below.
>
> I use version 10.
>
> I expected 8,081 cases because page 14 of the report says there were
> 8,081 children aged 0-4.
>
> Thank you for reminding me that Stata is case sensitive - I'm happy
> that there is an easy fix to that issue.
>
> When you analyse DHS, do you analyse the same file type I was using? I
> saw that in one of the posts you note you prefer using the flat file.
>
> Best,
>
> French
>
>
> . set mem 450m
> (460800k)
>
> .
> . set matsize 800
>
> .
> . use TZKR62DT/TZKR62FL.DTA
>
> . *v022 is sample stratum number
>
> . gen stratid = v022
>
> .
> . *v021 is PSU
>
> . gen psu = v021
>
> .
> . *v005 is sample weight
>
> . gen weight = v005/1000000
>
> .
> . svyset psu [pw=weight], strata(stratid)
>
>       pweight: weight
>           VCE: linearized
>   Single unit: missing
>      Strata 1: stratid
>          SU 1: psu
>         FPC 1: <zero>
>
> . *Just to compare totals. Should get 8,081 for Tanzania DHS 2010 not 8,023!
>
> . tab stratid
>
>     stratid |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |         30        0.37        0.37
>           2 |         83        1.03        1.41
>           3 |         38        0.47        1.88
>           4 |         41        0.51        2.39
>           5 |         63        0.79        3.18
>           6 |         58        0.72        3.90
>           7 |        194        2.42        6.32
>           8 |         27        0.34        6.66
>           9 |         45        0.56        7.22
>          10 |         39        0.49        7.70
>          11 |         40        0.50        8.20
>          12 |         46        0.57        8.77
>          13 |         57        0.71        9.49
>          14 |         39        0.49        9.97
>          15 |         44        0.55       10.52
>          16 |         40        0.50       11.02
>          17 |         36        0.45       11.47
>          18 |         16        0.20       11.67
>          19 |         83        1.03       12.70
>          20 |         54        0.67       13.37
>          21 |         27        0.34       13.71
>          23 |         15        0.19       13.90
>          24 |        278        3.47       17.36
>          25 |         34        0.42       17.79
>          26 |         84        1.05       18.83
>          27 |        266        3.32       22.15
>          28 |        176        2.19       24.34
>          29 |        112        1.40       25.74
>          30 |        181        2.26       27.99
>          31 |        188        2.34       30.34
>          32 |        192        2.39       32.73
>          33 |         13        0.16       32.89
>          34 |        200        2.49       35.39
>          35 |        154        1.92       37.31
>          36 |        213        2.65       39.96
>          37 |        189        2.36       42.32
>          38 |        233        2.90       45.22
>          39 |        312        3.89       49.11
>          40 |        445        5.55       54.66
>          41 |        310        3.86       58.52
>          42 |        292        3.64       62.16
>          43 |        488        6.08       68.24
>          44 |        300        3.74       71.98
>          45 |        363        4.52       76.51
>          46 |        398        4.96       81.47
>          47 |        264        3.29       84.76
>          48 |        327        4.08       88.83
>          49 |        239        2.98       91.81
>          50 |         56        0.70       92.51
>          51 |        331        4.13       96.63
>          52 |        270        3.37      100.00
> ------------+-----------------------------------
>       Total |      8,023      100.00
>
> .
> . *keep if complete response
>
> . keep if v015==1
> (0 observations deleted)
>
> . *keep if child lives with respondent
>
> . keep if b9==0
> (824 observations deleted)
>
>
> On Sun, Oct 27, 2013 at 9:48 AM, Friedrich Huebler <fhuebler@gmail.com> wrote:
>> French,
>>
>> I have to retract my comment about the filename. The -use- command is
>> valid if you have a folder TZKR62DT that contains the file
>> TZKR62FL.DTA.
>>
>> Sorry for my mistake,
>>
>> Friedrich
>>
>> On Sun, Oct 27, 2013 at 9:22 AM, Friedrich Huebler <fhuebler@gmail.com> wrote:
>>> French,
>>>
>>> Please have a look at the Statalist FAQ (link at the bottom of this
>>> message). Some excerpts:
>>>
>>> "The current version of Stata is 13.0. Please specify if you are using
>>> an earlier version"
>>>
>>> The fact that you use -set mem- indicates that you have Stata 11 or older.
>>>
>>> "Say exactly what you typed and exactly what Stata typed (or did) in response."
>>>
>>> I assume that you did not type "use TZKR62DT/TZKR62FL.DTA" because
>>> that is not a valid filename in most operating systems. You also don't
>>> explain why the command "keep if B9==0" is not recognized because you
>>> omitted the output from Stata.
>>>
>>> Now to your questions. The data for children under 5 from the Tanzania
>>> DHS 2010 has only 8023 observations. It is not clear why you expect
>>> 8081 observations. The survey report, which you cite as reference, has
>>> 480 pages.
>>>
>>> The command -keep if B9==0- will yield an error message because the
>>> variable B9 doesn't exist in the data. Stata is case-sensitive and the
>>> correct variable name is b9.
>>>
>>> Friedrich
>>>
>>> On Sat, Oct 26, 2013 at 12:53 PM, french Smith <french.c.smith@gmail.com> wrote:
>>>> Dear STATA crowd,
>>>>
>>>> I wish to analyze the Demographic and Health Surveys (DHS) data on
>>>> children under five.
>>>>
>>>> I have set up my analysis as follows, using the Tanzania DHS 2010 as a
>>>> starting point. But I have two questions:
>>>>
>>>> 1. Why do I have 8,023 cases not the 8,081 that the report (the report
>>>> and dataset are available at
>>>> http://www.measuredhs.com/what-we-do/survey/survey-display-345.cfm)
>>>> says exist?
>>>>
>>>> 2. How do I restrict the analysis to V465, which was a question asked
>>>> only of respondents with a youngest child under five living with them?
>>>>
>>>> Maybe questions, sorry, but my STATA mind needs jogging!
>>>>
>>>> Thanks!
>>>>
>>>> French
>>>>
>>>>
>>>> set mem 450m
>>>>
>>>> set matsize 800
>>>>
>>>> use TZKR62DT/TZKR62FL.DTA
>>>>
>>>> *v022 is sample stratum number
>>>> gen stratid = v022
>>>>
>>>> *v021 is PSU
>>>> gen psu = v021
>>>>
>>>> *v005 is sample weight
>>>> gen weight = v005/1000000
>>>>
>>>> svyset psu [pw=weight], strata(stratid)
>>>>
>>>> *Just to compare totals. Should get 8,081 for Tanzania DHS 2010 not 8,023!
>>>> tab stratid
>>>>
>>>> *keep if complete response
>>>> keep if v015==1
>>>>
>>>> *keep if child lives with respondent. But this command isn’t recognized!
>>>> keep if B9==0
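
The case-sensitivity fix and the restriction asked about in question 2 can be sketched together as follows. This is a minimal sketch, not the method used in the thread: it assumes the variable in the file is named v465 in lowercase, and it replaces -keep- with the -subpop()- option of -svy-, so that all observations stay in the data for variance estimation. The indicator name withmother is hypothetical.

```stata
* Sketch only; the file path is the one used in the thread.
use TZKR62DT/TZKR62FL.DTA, clear

* Stata is case-sensitive: confirm that the lowercase name exists
capture confirm variable b9
if _rc != 0 {
    display as error "b9 not found; check names with -describe- or -lookfor-"
    exit 111
}

* Survey design as in the original do-file
gen double weight = v005/1000000
svyset v021 [pw=weight], strata(v022)

* Flag children living with the respondent instead of dropping the others
* (hypothetical indicator name; missing b9 is coded 0 here)
gen byte withmother = (b9 == 0)

* Assumption: v465 is the lowercase name of the variable cited as V465
svy, subpop(withmother): tabulate v465
```

Flagging the subpopulation rather than dropping observations keeps every stratum and PSU available when Stata computes linearized standard errors.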

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

