
Re: st: Merging datasets with non-unique identifiers


From   Teresio Poggio <terlist@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Merging datasets with non-unique identifiers
Date   Wed, 23 Nov 2011 11:14:30 +0100

Dear Mary Ann,

The files have a hierarchical data structure: each individual in files
4 & 5 is linked to only one household, but several individuals in
these files may belong to the same household.

If you want to do your analysis at the individual level (children),
you can attach the household level variables to the individual level
records with a many-to-one merge (merge m:1) instead of a one-to-one
merge (merge 1:1), using the household id as the key variable. See
help merge for details.
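
A minimal sketch of the many-to-one merge, assuming Stata 11 or later
(for the merge m:1 syntax) and two hypothetical file names: file5.dta
for the child level data and household.dta for the household level
files you have already merged. hhid and childno are the variables
from your listing.

    * file5 and household are hypothetical file names
    use file5, clear
    * check that hhid and childno together identify each child record
    isid hhid childno
    * many child records (master) are matched to one household (using)
    merge m:1 hhid using household
    * inspect how many records matched
    tabulate _merge

This only works if hhid is unique in the household file; merge m:1
stops with an error otherwise.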

If you want to do your analysis at the household level, you first need
to summarize the individual level information at the household level
using collapse (see help collapse for details) with a statistic that
suits your purpose (count if you need a *number of children* variable;
max if you need an *age of the eldest child* variable, etc.).
You can then merge the resulting data file one-to-one with the
household level ones.
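
A minimal sketch of the collapse-then-merge route, under the same
assumptions as before: file5.dta and household.dta are hypothetical
file names, and nchildren is a variable name I made up for
illustration.

    * file5 and household are hypothetical file names
    use file5, clear
    * one record per household: count of non-missing child numbers
    collapse (count) nchildren = childno, by(hhid)
    * hhid is now unique on both sides, so a one-to-one merge is valid
    merge 1:1 hhid using household
    tabulate _merge

If the file held an age variable (say age, which is not in your
listing), adding (max) maxage = age to the collapse call would give
the age of the eldest child. Note that collapse replaces the data in
memory, and that households with no records in file5 will come out
with _merge==2 and a missing nchildren.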

hth
Best regards

Teresio


On Wed, Nov 23, 2011 at 10:46 AM, Mary Ann Cruz Bautista
<maryann.bautista@duke-nus.edu.sg> wrote:
> Dear all,
>
> I need to merge 6 datasets (from a survey which did not provide a coding manual):
>
> File1 Household level
> File2 Household level
> File3 Household level
> File4 Individual
> File5 Individual
>
> I only managed to merge files 1-4 using Household ID.
>
> File5 contains data with household ID and child number (this section of the questionnaire was answered by an individual from a selected household, but the responses do not refer to the respondent but to the child being reported about). How can I merge these datasets?
>
> File5 has non-unique identifiers with several observations having the same household ID.
>
>     +----------------------------------------------+
>     | hhid childno   q5210   q5211   q5212   q5213 |
>     |----------------------------------------------|
>  1. |    1       1       .       .       .       . |
>  2. |    1       2       .       .       .       . |
>  3. |    1       3       .       .       .       . |
>  4. |    1       4       .       .       .       . |
>  5. |    1       5       .       .       .       . |
>     |----------------------------------------------|
>  6. |    1       6       .       .       .       . |
>  7. |    1       7     Yes      No      No      No |
>  8. |    1       8     Yes      No      No      No |
>  9. |    1       9       .       .       .       . |
> 10. |    2       1       .       .       .       . |
>     |----------------------------------------------|
> 11. |    2       2       .       .       .       . |
> 12. |    2       3       .       .       .       . |
> 13. |    2       4       .       .       .       . |
> 14. |    2       5      No       .      No      No |
> 15. |    2       6       .       .       .       . |
>     |----------------------------------------------|
>
> Merging File5 with the other files prompted that Household ID is not a unique identifier.
>
>                [variable hhid does not uniquely identify observations in the master data]
>
> I'm an inexperienced Stata user and I'd be glad to get some help from more experienced people here. Thank you for accommodating my question.
>
> Best,
> Mary
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
____________________________________________________
dr. Teresio Poggio
LaboR - Dipartimento di Sociologia e ricerca sociale
Università degli studi di Trento
Via Verdi, 26
38100 Trento, Italy
Tel   +39 0461/881406
fax:  +39 0461/881348


