Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Merging only the common elements, and dropping duplicates

From	"Michael N. Mitchell" <[email protected]>
To	[email protected]
Subject	Re: st: Merging only the common elements, and dropping duplicates
Date	Wed, 16 Jun 2010 00:28:46 -0700

I responded too soon!

It looks like you do not have a "1 to 1" merge on your hands, it looks more like a "manyto many" merge, where many person records can match to many account records.

A "many to many" merge can be accomplished with the -joinby- command, but I don't thinkthat will give you the results that you seek. I think that we might benefit by hearingmore about the "many" to "many" nature of this merge, and whether one of these datasetsshould have duplicates removed prior to merging, to thus create a 1:m or m:1 merge.


I hope that helps,

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com



On 2010-06-16 12.22 AM, Michael N. Mitchell wrote:

Dear George

Using Stata 11.0 and above, you can do this...

. use database1, clear
. merge 1:1 account using database2, keep(match)

and the resulting merged file will contain only matches.

I hope this helps,

Michael N. Mitchell
Data Management Using Stata - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week - http://www.MichaelNormanMitchell.com



On 2010-06-16 12.13 AM, George Chioran wrote:

Dear statalisters,

I am trying to merge 2
databases to match persons with accounts; however I discovered that
when I do
this, the final database contains a lot of duplicates.
Database 1 looks like
this. I sort it by account.

Person Account
2 311
2 311
2 311
2 311
2 413
2 413
2 413

Database 2 looks like
this. I sort it by account.

Account B_S
311 b
311 b
311 b

To merge them, I load
database 1 and merge it using database 1. But when I do this, I would
like to
have a final database that contains only the common observation of
both. But I get database 3
that looks like this.

Person Account B_S
2 311 b
2 311 b
2 311 b
2 311 b
2 413 .
2 413 .
2 413 .

Is there a way to get
just the common observations? I would like my final database to have
just:
Person Account B_S

2 311 b
2 311 b
2 311 b

Thank you,
George



*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
- st: Plotting 3 variables on the Y axis, and 1 time variable on the x axis.
  - From: George Chioran <[email protected]>

References:
- st: Making a local with no assigned value zero
  - From: George Chioran <[email protected]>
- st: RE: Making a local with no assigned value zero
  - From: "Nick Cox" <[email protected]>
- Re: st: RE: Making a local with no assigned value zero
  - From: George Chioran <[email protected]>
- RE: st: RE: Making a local with no assigned value zero
  - From: "Nick Cox" <[email protected]>
- st: Merging only the common elements, and dropping duplicates
  - From: George Chioran <[email protected]>
- Re: st: Merging only the common elements, and dropping duplicates
  - From: "Michael N. Mitchell" <[email protected]>

Prev by Date: Re: st: Merging only the common elements, and dropping duplicates
Next by Date: Re: st: use of brace inside of macro
Previous by thread: Re: st: Merging only the common elements, and dropping duplicates
Next by thread: st: Plotting 3 variables on the Y axis, and 1 time variable on the x axis.
Index(es):
- Date
- Thread