Statalist

Re: st: Working with DHS for Ghana Year 2003


From   Friedrich Huebler <fhuebler@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Working with DHS for Ghana Year 2003
Date   Thu, 11 Jun 2009 13:48:28 -0400

Tharshini,

You renamed the household ID and line number variables to "clusternr"
and "linenr" in both datasets. This is correct. It is also correct to
sort both datasets by clusternr and linenr so that they can be merged,
using clusternr and linenr as the variables that identify matching
observations. You do not have to delete missing values.
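
A minimal sketch of the full sequence, assuming the file and variable names
from your message and the merge syntax of Stata 10 and earlier. The file name
height_ids is hypothetical. Note that the sorted copy of the height and weight
data has to be saved before the second file is loaded.

```stata
* Sketch only: dataset and variable names are taken from the quoted message;
* height_ids is a hypothetical name for the saved, sorted copy.
use height, clear
rename hwhhid clusternr
rename hwline linenr
sort clusternr linenr
save height_ids, replace

use housemem, clear
rename hhid clusternr
rename hvidx linenr
sort clusternr linenr

* Old-style merge: both datasets must be sorted by the match variables.
merge clusternr linenr using height_ids

* _merge: 1 = only in housemem, 2 = only in height_ids, 3 = matched in both.
tabulate _merge
```

In Stata 11 and later the merge line would instead be written as
"merge 1:1 clusternr linenr using height_ids", provided the two ID variables
uniquely identify observations in both files.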

Friedrich

On Thu, Jun 11, 2009 at 12:47 PM, Tharshini Thangavelu
<thth4658@student.su.se> wrote:
> Dear Stata users,
>
> I am using the Ghana 2003 DHS and want to merge the data files. I have
> selected a number of variables with the software program SELECT and now want
> to merge the files. According to the DHS this is possible, but following the
> instructions given by the DHS doesn't work. I have searched for merging in Stata
> and it seems very intuitive, but I need ID variables that are identical in all
> the files, which is not the case. In the files that I use they are named
> differently.
>
> For example, I want to merge the height and weight file with the household
> member recode file. I should use HV001 (cluster nr) and HWLINE (line nr) from
> the height and weight file with HHID (cluster nr) and HVIDX (line nr) from the
> members recode file to merge with the household member data.
>
> But there are two ID variables in each file, and Stata cannot match them when I
> type the following commands:
>
> use height
> su
> sort hwhhid hwline
> clear
>
> use housemem
> su
> sort hhid hvidx
> merge hhid hvidx using height
>
> This doesn't work simply because the id variables are not the same. So I have
> instead renamed the four id variables and then merged them together.
>
> use height
> rename hwhhid clusternr
> rename hwline linenr
>
> use housemem
> rename hhid clusternr
> rename hvidx linenr
>
> Is this a correct way to do it? And is it correct to sort by the two ID variables
> and merge the data files on the two ID variables? Do I need to delete all the
> missing values before merging?
>
>
> Thanks
> / Tharshini
> --
> Tharshini THANGAVELU
> Forskarbacken 8 / 101
> 114 16 Stockholm
> Sweden
> Phone +46 (0)735 53 43 90
> E-mail thth4658@student.su.se


