st: RE: problem with merge

From   "Eric Booth" <>
To   <>
Subject   st: RE: problem with merge
Date   Thu, 8 Apr 2010 10:10:09 -0500


First, the standard questions: What command did you try, and what version of Stata are you using?
Do you have any options after the comma (e.g. "force", "keepusing()") in the merge command?  You mention that you checked the formats and the blanks, but just to be sure, did you try "replace A = trim(A)" on the merge variable, and did you look at its storage type and format with -describe- in both datasets?  Also, make sure you are merging on just the variables that link the two datasets.
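
A minimal sketch of those checks, in case it helps. The variable name "investor" and the file names A.dta and B.dta are assumptions for illustration (they are not given in the thread), and the char(160) line is only a guess at one common source of invisible differences:

```stata
* Sketch only: "investor", A.dta and B.dta are hypothetical names.
foreach f in A B {
    use `f', clear
    describe investor                  // storage type and display format
    display "[" investor[1] "]"        // brackets make stray blanks visible
    * non-breaking spaces (character 160) survive trim(), so swap them first
    replace investor = subinstr(investor, char(160), " ", .)
    replace investor = itrim(trim(investor))   // outer blanks, then repeated inner blanks
    save `f'_clean, replace            // keep the originals untouched
}
```

If the names still fail to match after this, differences in case or punctuation (e.g. "Inc." vs "Inc") would be the next thing to look for.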

When this happens to me, I usually find that it is an error in my -merge- command or something leading up to it.  


If you've tried all this and you think it's all in good order, I would try a couple of things:

1) just -keep- your merge/match variable(s) from each dataset and try the merge again
2) try -joinby- with the "unmatched(both)" option to see if you get any of the expected matches
3) try -reclink- from SSC for a fuzzy merge and look at the records you think should have linked up to see what their matching score is--given your description, this approach might be the quickest way to diagnose your issue.
4) Finally, you mention that "Investors are stored by their names therefore those are and must be string variables", but you don't have to merge on the string variables.  You could create a unique id/key number for each investor name in the two datasets and then link on the id number instead of the string var.
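
Suggestions 1) and 2) might look roughly like the following. Again, "investor", A.dta and B.dta are assumed names, and the -merge m:1- syntax assumes Stata 11 or later (earlier versions need the older sorted-merge syntax instead):

```stata
* Sketch only: "investor", A.dta and B.dta are hypothetical names.
* 1) keep just the key from each file and merge again
use B, clear
keep investor
duplicates report investor          // B should hold one row per investor
tempfile bkeys
save `bkeys'

use A, clear
keep investor
drop if missing(investor)           // deals with unknown investors cannot match
merge m:1 investor using `bkeys'    // Stata 11+ syntax
tabulate _merge
sort investor
list investor _merge if _merge != 3 // near-identical names end up side by side

* 2) the same comparison with -joinby-
use A, clear
keep investor
drop if missing(investor)
joinby investor using `bkeys', unmatched(both)
tabulate _merge
```

Sorting on the key before listing the unmatched records puts master-only and using-only names next to each other, which usually makes a stray character or spelling difference easy to spot.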

~ Eric
Eric A. Booth
Public Policy Research Institute
Texas A&M University

-----Original Message-----
From: on behalf of Stefano Bonini
Sent: Thu 4/8/2010 9:50 AM
To: statalist
Subject: st: problem with merge
Hi folks

I have an inextricable issue with merge:
I have two datasets: "A" with a list of deals for which I know the investor (investors are duplicated because one investor can easily invest in more than one deal) and B with the full population of investors (hence unique obs) and their characteristics, such as location, assets, etc. Investors are stored by their names therefore those are and must be string variables. I also have missing obs in A (my master) because for some deals I don't  know who invested.
I am trying to merge the two sets to add the investor's characteristics (contained in B) to each deal. I confidently tried to merge them but the procedure doesn't match. Visual inspection confirms that there should be matches (I mean, _merge=3) but the command returns just _merge=1 or 2.
I checked the formats, blanks and everything I could but I just can't make it work.
Any clue on what may cause this weird behavior? 

Stefano Bonini
Visiting Associate Professor
Department of Finance
New York University
Stern School of Business
44W 4th St. New York, NY

Ph. +1 212 998 0305