Re: st: RE: DHS data
Friedrich Huebler <firstname.lastname@example.org>
Thu, 5 Nov 2009 12:15:37 -0500
The command used by Bola to generate a household ID (gen hhid1 = v001
+ v002) does not create unique identifiers. Take the following two
observations:
1. v001 = 1, v002 = 11
2. v001 = 11, v002 = 1
v001 + v002 is 12 for both observations. -concat- also does not create
unique identifiers with the given data. With -concat-, the new
variable contains the value 111 for both observations.
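
The collision can be reproduced with a small sketch. The two observations
are the ones listed above; the variable names sum_id, cat_id and sep_id
are made up for illustration. The last line uses the punct() option of
-egen, concat()- to put a separator between the two parts, which is one
possible way to keep the string identifier unique if a single ID variable
is really wanted.

```stata
* Toy data: the two observations from the example above
clear
input v001 v002
1 11
11 1
end

* Sum of cluster and household number: 12 for both observations
gen sum_id = v001 + v002

* Plain concatenation: "111" for both observations
egen cat_id = concat(v001 v002)

* Concatenation with a separator: "1_11" and "11_1"
egen sep_id = concat(v001 v002), punct("_")

list
```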
The datasets for households and children from DHS surveys contain the
same identifying information, so it is not necessary to create new
variables. In the household recode file, the cluster number is in
variable hv001 and the household number in variable hv002. In the
children's recode file, the cluster number is in variable v001 and the
household number in variable v002.
To merge the data, rename variables v001 and v002 in the children's
file to hv001 and hv002, sort both datasets by hv001 and hv002 and
save them. You can then open the household file and merge it with the
children's file with this command:
. merge 1:m hv001 hv002 using aokr51fl.dta
If you have a version of Stata older than 11, merge the files with
. merge hv001 hv002 using aokr51fl.dta
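
Put together, the steps described above could look roughly like the
following sketch for Stata 11. The household file name aohr51fl.dta and
the name of the renamed copy, aokr51fl_renamed.dta, are assumptions made
for this illustration; substitute your own file names. The sketch saves
the renamed children's file under a new name so that the original file is
not overwritten, and it adds an -isid- check that stops with an error if
hv001 and hv002 do not uniquely identify households in the master data.

```stata
* Children's recode file: rename the keys to match the household file
use aokr51fl.dta, clear
rename v001 hv001
rename v002 hv002
sort hv001 hv002
save aokr51fl_renamed.dta, replace   // hypothetical file name

* Household recode file (master data); file name is an assumption
use aohr51fl.dta, clear
isid hv001 hv002                     // error if the keys are not unique
sort hv001 hv002

* One household can match many children
merge 1:m hv001 hv002 using aokr51fl_renamed.dta

* Check how many observations matched
tabulate _merge
```

Households without children in the children's file show up with _merge
equal to 1 and can be kept or dropped depending on the analysis.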
On Thu, Nov 5, 2009 at 9:49 AM, Nick Cox <email@example.com> wrote:
> My guess is that you are losing uniqueness in your concatenation of
> See for example the thread starting
> An alternative approach is to work with the string identifier
> egen hhid1 = concat(v001 v002)
> Bolanle Bukoye
> I am writing to inquire if you could help me with some issues that I am
> having with merging the dhs data. I am using the data from Angola and
> Tanzania for my thesis. In merging the child and household data for
> Angola, I keep getting error messages on stata.
> I used the household dataset as the base dataset and cluster number and
> household number as the key variables as suggested on the DHS website.
> My stata code is
> . gen hhid1 = v001 + v002
> . sort hhid1
> . merge unique using "C:\Users\bola\Documents\Fall
> 2009\Thesis\AOKR51FL.DTA", uniqmaster
> While the household data set is open, I use the merge drop down menu-->
> select one to many key variables, choose child data set and select the
> hhid1 variable that I just created.
> Here is the error message that I received on stata : "variable hhid1
> does not uniquely identify observations in the master data"