
st: DHS Ghana variable construction question

From   Tharshini Thangavelu
Subject   st: DHS Ghana variable construction question
Date   Sat, 25 Jul 2009 15:47:55 +0200 (CEST)


I have a few questions about the DHS dataset.

Question 1.
Responding first to Friedrich Huebler's answer of 2009-06-22.
I did as you suggested in order to get the ages of the mother and the father,
but the resulting variables look wrong to me. Some of the values are 2, which is
impossible: no mother or father can be 2 years old. The data also contain
variables for the partner's age (v730) and the mother's age (v447a). I don't
understand why the two variables created with the following commands:

by hhid: gen mage = hv105[hv112]
by hhid: gen fage = hv105[hv114]

don't give the same values in years as v730 and v447a. Shouldn't they normally
agree?
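
One possible explanation, offered only as a sketch: under "by hhid:", the subscript in hv105[hv112] counts rows within the household, so it returns the mother's age only when the mother sits in row number hv112 of her household. If adults have been dropped from the file (for example, after a merge that keeps only children), row 1 may be a child, and the result is a child's age. The toy data below are made up for illustration and are not DHS records; the variable name "aligned" is hypothetical.

```stata
* Toy household-member data (made-up values, not DHS records).
clear
input str4 hhid hvidx hv105 hv112 hv114
"A" 1 30 0 0
"A" 2 35 0 0
"A" 3  2 1 2
"A" 4  4 1 2
end

sort hhid hvidx
* With every member present, row position equals line number.
by hhid: assert hvidx == _n
by hhid: gen mage = hv105[hv112]
by hhid: gen fage = hv105[hv114]
list hhid hvidx hv105 mage fage, sepby(hhid)

* Keep only the young children, as after a merge that drops adults.
drop mage fage
keep if hv105 < 5
sort hhid hvidx
* Row 1 is now line 3, so hv105[hv112] picks up a child's age.
by hhid: gen byte aligned = hvidx == _n
by hhid: gen mage = hv105[hv112]
list hhid hvidx hv105 hv112 mage aligned, sepby(hhid)
```

If "aligned" is 0 for any row in the real data, the subscripts cannot be trusted, and it may be safer to create mage and fage in the complete household member file before merging with the child data.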

Question 2. - Disposable household income variable
I would like to create a variable for disposable household income. This variable
doesn't exist in the DHS data, so I would like to use a proxy variable that is
available. The suggested proxy variables are:

. wealth index hv270 (coded 1-5, where 1 is the poorest and 5 is the richest)
. respondent currently working v714 (the respondents are all women, so this is
  unlikely to be a good proxy for household income)

Other potential proxy variables are:
. partner's occupation v704 v705
. respondent's occupation v716 v717

I don't know which of these variables would represent disposable household
income efficiently and consistently.

Question 3. - Number of siblings

I would like to create, if possible, the number of siblings for children under 5
(the group my dependent variable covers).

How can I create this variable? I have looked for a variable with the number of
siblings and did not find one. Looking at the data closely, the variables that
seem relevant are:
. Number of household members h009
. Number of children under five years h014

My reasoning for constructing the number of siblings is as follows.

Ex: in row 5, h009 is 5 and h014 is 2. Thus, this particular household has 5
members, 2 of whom are children under five. If both parents live in the
household, one member is left. Who is this person? Is it a sibling, as I am
assuming, or another family member such as a relative? Furthermore, I don't
know whether both parents are alive and in the household. To check this, I use
the following method:

sort hhid hvidx

by hhid: gen mother = hv010[hv112]
by hhid: gen father = hv011[hv114]

hv010 and hv011 are the numbers of eligible women and eligible men in the
household. hv112 and hv114 are the mother's and father's line numbers,
respectively. However, there are two other variables, sh11 and sh13, which
also give the mother's and father's line numbers. Does it matter which ones I use?
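
Whether the two pairs of line-number variables agree can be checked directly. A minimal sketch on made-up values (not DHS records), assuming sh11 and sh13 are coded the same way as hv112 and hv114:

```stata
* Toy data (made-up values); sh11 and sh13 stand in for the
* survey-specific line-number variables.
clear
input hv112 sh11 hv114 sh13
2 2 1 1
2 2 0 0
0 0 3 3
end

* compare reports how often the two variables agree or differ.
compare hv112 sh11
compare hv114 sh13

* Count disagreements directly; zero would suggest either can be used.
count if hv112 != sh11
count if hv114 != sh13
```

In the real data, any nonzero count would be worth inspecting with list before choosing one pair over the other.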

Somehow, this doesn't give me the desired results. Instead, I tried combining it
with the variables

. mother alive sh10
. father alive sh12

which I simply inspect with the edit command. In the end, how can I verify
whether the remaining member is actually a sibling? That is the variable I am
looking for.
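
One way to approach the sibling count, sketched on made-up values (not DHS records): treat children who report the same mother's line number within a household as siblings. This assumes that hv112 == 0 means the mother is not listed in the household; the names nsib and nsib_u5 are hypothetical.

```stata
* Toy data (made-up values). hv112 == 0 is assumed to mean that the
* mother is not listed in the household.
clear
input str4 hhid hvidx hv105 hv112
"A" 1 30 0
"A" 2 35 0
"A" 3  2 1
"A" 4  4 1
"A" 5  9 1
"B" 1 28 0
"B" 2  1 1
"B" 3 60 0
end

* Children who report the same mother's line number within a household
* are treated as siblings; each child has (group size - 1) siblings.
bysort hhid hv112: gen nsib = _N - 1 if hv112 > 0 & !missing(hv112)

* The same count restricted to siblings under five.
bysort hhid hv112: egen nsib_u5 = total(hv105 < 5) if hv112 > 0 & !missing(hv112)
replace nsib_u5 = nsib_u5 - 1 if hv105 < 5 & !missing(nsib_u5)

list hhid hvidx hv105 hv112 nsib nsib_u5, sepby(hhid)
```

This counts only co-resident children linked to the same mother in the household listing. Siblings living elsewhere, children whose mother is not in the household, and children too old to have a recorded mother's line number are not counted.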

So, if someone can enlighten me on these three questions, I would be happy.

Best regards

On 2009-06-22, at 16:04, Friedrich Huebler wrote:
> Tharshini,
> In your message of June 11 you described how you are merging the
> household member data with the anthropometric data. When I reproduce
> your approach the data from all household members, adults and
> children, appear in the merged file.
> You write now that "only information for children is kept" but without
> seeing the Stata commands you used it is difficult to understand what
> you are doing differently.
> The household member file lists the line number of the mother and
> father of each child up to 14 years of age in the variables hv112 and
> hv114. You can use this line number to create variables with the
> parents' age, level of education, etc. As an example, assume you want
> to identify the parents' ages. First, sort the data by household ID
> and household member line number.
> . sort hhid hvidx
> Next, generate variables with the mother's and father's age.
> . by hhid: gen mage = hv105[hv112]
> . by hhid: gen fage = hv105[hv114]
> Two FAQs on the Stata website provide information related to your problem.
> "How do I create a variable recording whether any members of a group
> (or all members of a group) possess some characteristic?"
> "How do I create variables summarizing for each individual properties
> of the other members of a group?"
> Friedrich
> On Sun, Jun 21, 2009 at 6:43 AM, Tharshini
> Thangavelu<> wrote:
>> Hi all,
>> I have a specific question concerning the Ghana DHS survey. I am trying to
>> clean the dataset so that I can analyse it using 2SLS and IV methods. My
>> dependent variables will be the anthropometric variables, controlling for
>> socioeconomic factors, mother's education, father's education, etc.
>> Basically my question is: how does parental (father's and mother's) education
>> improve child health in Ghana?
>> I know how to merge, but it is still confusing! I am following the instructions
>> given by DHS: I first merge the anthropometric data set with the household
>> member data. Then I merge the resulting file with the individual data
>> and finally merge it with the child recode data file.
>> In the first merge (anthropometric and household member data), only information
>> for children is kept, which I think is correct.
>> Ex. HV105 "the age of household member" gives the age of all
>> household members. After merging the anthropometric data with the household
>> member data, as intuition suggests, HV105 no longer contains ages above 5 years.
>> To answer my research question, I need the mothers' and fathers' age, education
>> level, height and weight, and other variables. So what I basically want is
>> to link each child to their parents.
>> Following the merging instructions given by DHS, I should now merge the
>> resulting file with the women's file. But this doesn't give me quite the
>> result that I am expecting. That is, I need the women's age, height and weight. I
>> know for sure that these variables exist, but after merging they are gone.
>> If someone has worked with DHS data sets, perhaps they can give me some advice.
>> I do a one-to-many merge; perhaps I should use a many-to-one merge, but that
>> doesn't seem reasonable.
>> Regards
>> Tharshini
>> --
>> Tharshini THANGAVELU
>> Forskarbacken 8 / 101
>> 114 16 Stockholm
>> Sweden
>> Phone +46 (0)735 53 43 90
>> E-mail

