Re: st: Merging datasets with non-unique identifiers


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Merging datasets with non-unique identifiers
Date   Wed, 23 Nov 2011 10:05:42 +0000

I don't understand what your precise question is.

1. Are you saying that you don't understand what -merge- is telling
you? What it says seems clear and correct enough.

2. Merging of File4 worked and of File5 didn't work, but you don't say
what difference there is between their structures. File4 is just
described as "individual".

3. Contrary to advice on this list, you are not giving any of the
Stata commands that you used. In particular, what was the -merge-
command that failed for File5?

4. In total, what kind of structure do you expect for the final data
file? What would define a typical observation? -merge- is a marvellous
command but it can't combine all kinds of data file unless they can be
mapped to a common structure.
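
For what it is worth, the error message quoted below is what -merge- reports when the key is declared unique in the master data but is not. If each observation in File5 is a child identified by hhid and childno together, and a household-level file such as File1 holds one observation per household, then a many-to-one merge may be what is wanted. A minimal sketch under those assumptions (the file names and structure are guesses from the description, and the -merge m:1- syntax requires Stata 11 or later):

```stata
* Sketch only: file names and key variables are assumed from the post
use File5, clear

* Check that household ID plus child number identifies each observation;
* -isid- stops with an error if it does not
isid hhid childno

* Show how many observations share the same household ID
duplicates report hhid

* Many children per household in the master data,
* one household record in the using data
merge m:1 hhid using File1

* Inspect which observations matched
tabulate _merge
```

With this setup each child observation would receive a copy of its household's variables, so a typical observation in the result is a child, not a household.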

Nick

On Wed, Nov 23, 2011 at 9:46 AM, Mary Ann Cruz Bautista
<maryann.bautista@duke-nus.edu.sg> wrote:


> I need to merge 6 datasets (from a survey which did not provide a coding manual):
>
> File1 Household level
> File2 Household level
> File3 Household level
> File4 Individual
> File5 Individual
>
> I only managed to merge files 1-4 using Household ID.
>
> File5 contains data with household ID and child number (this section of the questionnaire was answered by an individual from a selected household, but the responses do not refer to the respondent but to the child being reported about). How can I merge these datasets?
>
> File5 has non-unique identifiers with several observations having the same household ID.
>
>      +----------------------------------------------+
>      | hhid childno   q5210   q5211   q5212   q5213 |
>      |----------------------------------------------|
>   1. |    1       1       .       .       .       . |
>   2. |    1       2       .       .       .       . |
>   3. |    1       3       .       .       .       . |
>   4. |    1       4       .       .       .       . |
>   5. |    1       5       .       .       .       . |
>      |----------------------------------------------|
>   6. |    1       6       .       .       .       . |
>   7. |    1       7     Yes      No      No      No |
>   8. |    1       8     Yes      No      No      No |
>   9. |    1       9       .       .       .       . |
>  10. |    2       1       .       .       .       . |
>      |----------------------------------------------|
>  11. |    2       2       .       .       .       . |
>  12. |    2       3       .       .       .       . |
>  13. |    2       4       .       .       .       . |
>  14. |    2       5      No       .      No      No |
>  15. |    2       6       .       .       .       . |
>      |----------------------------------------------|
>
> When I tried to merge File5 with the other files, Stata reported that household ID is not a unique identifier:
>
>                [variable hhid does not uniquely identify observations in the master data]
>
> I'm an inexperienced Stata user and I'd be glad to get some help from more experienced people here. Thank you for accommodating my question.
