Statalist



Re: st: DHS Ghana variable construction question


From   Tharshini Thangavelu <thth4658@student.su.se>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: DHS Ghana variable construction question
Date   Tue, 28 Jul 2009 08:54:42 +0200 (CEST)

Hi Friedrich,

When I downloaded the dataset for Ghana 2003, the height and weight file came
with a doc file that describes how to proceed when merging and which
identifying variables to choose in each file. I followed this doc file and
merged the files in the following way:

1.) The height and weight file for children up to 5 years old.
rename HWHHID caseid
rename HWLINE linenr
sort caseid linenr
save weight, replace
clear exit

2.) The household member data include too many variables to load directly into
Stata, so I used the program "select" to pick my variables of interest. Then I
loaded the result into Stata:

use hmr1
rename hhid caseid
rename hvidx linenr
sort caseid linenr
save hmr1, replace

3.) These two files were then merged (master data = hmr1):

merge caseid linenr using weight

tab _merge

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |     22,673       85.23       85.23
          3 |      3,928       14.77      100.00
------------+-----------------------------------
      Total |     26,601      100.00

. keep if _merge ==3
(22673 observations deleted)

. drop _merge

Error message: linenr was byte now int
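
A minimal sketch of how the identifying variables could be inspected before the
merge in step 3, assuming the files weight.dta and hmr1.dta saved in steps 1 and
2 are in the working directory. The message about linenr most likely means that
Stata widened the storage type of linenr from byte to int so that it can hold
the values from the using file; reading it as a note rather than a failed merge
is an assumption here.

```stata
* Sketch only: compare the storage types of the key variables in both files.
use weight, clear
describe caseid linenr

use hmr1, clear
describe caseid linenr

* isid exits with an error if caseid and linenr do not uniquely
* identify the observations in the data currently in memory (hmr1).
isid caseid linenr
```

If describe shows linenr stored as byte in one file and as int in the other, the
message above would be consistent with a harmless change of storage type.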

My own conclusion: _merge == 3 gives 3,928 observations, which is exactly the
number of observations in the weight file, so I concluded that the merge was
done correctly. I also tried the inverse case, i.e. having hmr as my master data.

4.) I then merged the resulting file with the individual recode file
(= women's file), using cluster number (clnr, hv001), household number
(hhnr, hv002) and mother's line number (lnr, hc60) as identifying variables.

In the resulting file, I again renamed the identifying variables:
rename HV001 clnr
rename HV002 hhnr
rename hc60  lnr
sort clnr hhnr lnr
save thesis
clear exit

5.) In the individual recode file, just as in the household member recode file,
I used the program "select" to choose the variables, and I renamed the following
identifying variables: cluster number (clnr, v001), household number (hhnr, v002)
and respondent's line number (lnr, v003).

use ir1
rename V001 clnr
rename V002 hhnr
rename V003 lnr
sort clnr hhnr lnr
save ir1, replace

6.) Now I merge ir1.dta with thesis.dta:

merge clnr hhnr lnr using thesis
tab _merge

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        526        7.48        7.48
          2 |      3,100       44.11       51.59
          3 |      3,402       48.41      100.00
------------+-----------------------------------
      Total |      7,028      100.00

. keep if _merge == 3
(3626 observations deleted)

. drop _merge

Error message: variables clnr hhnr lnr do not uniquely identify observations in
the master data. I hope this will help to solve the problem. 
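
A minimal sketch of how the uniqueness of the merge keys could be checked in
both files, assuming ir1.dta and thesis.dta as saved in steps 4 and 5. It is
only an expectation, not something verified here, that thesis.dta holds one
record per child, in which case several children of the same mother would share
the same values of clnr, hhnr and lnr.

```stata
* Sketch only: report how often each key combination occurs in each file.
use ir1, clear
duplicates report clnr hhnr lnr

use thesis, clear
duplicates report clnr hhnr lnr

* Tag repeated key combinations in thesis (dup = number of extra copies)
* and list those that fall among the first 20 observations.
duplicates tag clnr hhnr lnr, generate(dup)
list clnr hhnr lnr dup if dup > 0 in 1/20
```

The variable name dup is hypothetical and introduced only for this illustration.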

/ Tharshini




On 2009-07-28, at 06:28, Friedrich Huebler wrote:
> Tharshini,
>
> On June 11 you wrote that you wanted to merge the household member
> file with the height and weight file. In response to your message you
> received advice on how you can merge the data. The table in your
> message of today makes clear that you did not merge the files
> correctly because you only have persons up to 5 years of age. If you
> want more help with this and the other problems you described you have
> to show us your code, as explained in the Statalist FAQ.
>
> http://www.stata.com/support/faqs/res/statalist.html#advice
>
> Friedrich
>
> On Mon, Jul 27, 2009 at 9:42 AM, Tharshini
> Thangavelu<thth4658@student.su.se> wrote:
>>
>> . tab hv105
>>   Age of |
>>  household |
>>    members |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>          0 |        772       22.69       22.69
>>          1 |        706       20.75       43.45
>>          2 |        655       19.25       62.70
>>          3 |        689       20.25       82.95
>>          4 |        553       16.26       99.21
>>          5 |         27        0.79      100.00
>> ------------+-----------------------------------
>>      Total |      3,402      100.00
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

-- 
Tharshini THANGAVELU
Forskarbacken 8 / 101
114 16 Stockholm
Sweden
Phone +46 (0)735 53 43 90
E-mail thth4658@student.su.se



