st: RE: problem with merge
From: "Eric Booth" <email@example.com>
Subject: st: RE: problem with merge
Date: Thu, 8 Apr 2010 10:10:09 -0500
First the standard questions: What command did you try and what version of Stata are you using?
Do you have any options after the comma (e.g. "force", "keepusing()") in the merge command? You mention that you checked the formats and the blanks, but just to be sure, did you try "replace A = trim(A)" and look at the format of the merge variable with describe? Also, make sure you are merging on just the variables that link up the data.
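As a rough sketch of those checks, something like the following could be run before the merge. The file names deals.dta and investors.dta (standing in for datasets A and B) and the key variable investor are hypothetical, and the merge line uses Stata 11 syntax; adjust to your own names and version.

```stata
* Sketch only: deals.dta (A), investors.dta (B), and the key variable
* "investor" are hypothetical names, not taken from the original post.

* Clean the key in B and confirm it identifies one row per investor
use investors, clear
describe investor                          // storage type and display format
replace investor = itrim(trim(investor))   // drop outer blanks, collapse inner runs
isid investor                              // errors out if names repeat or are blank
save investors_clean, replace

* Clean the key in A the same way, then merge many deals to one investor
use deals, clear
describe investor
replace investor = itrim(trim(investor))
* Stata 11 syntax; in Stata 10 or earlier the rough equivalent is
*   merge investor using investors_clean, uniqusing sort
merge m:1 investor using investors_clean
tabulate _merge
```

If -describe- shows different string lengths for the key in the two files (say str40 versus str80), that by itself does not block a match, but it can hint that one file carries padding or truncated names.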
When this happens to me, I usually find that it is an error in my -merge- command or something leading up to it.
If you've tried all this and you think it's all in good order, I would try a couple of things:
1) just -keep- your merge/match variable(s) from each dataset and try the merge again
2) try -joinby- with the "unmatched(both)" option to see if you get any of the expected matches
3) try -reclink- from SSC for a fuzzy merge and look at the records you think should have linked up to see what their matching score is--given your description, this approach might be the quickest way to diagnose your issue.
4) Finally, you mention that "Investors are stored by their names therefore those are and must be string variables.", but you don't have to merge on the string variables. You could create a unique id/key number for each investor name in the two datasets and then link on the id number instead of the string var.
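Suggestions 1) and 2) might look roughly like the sketch below. Again, deals.dta, investors.dta, b_keys.dta, and the variable investor are hypothetical names, and the -merge- line assumes Stata 11 syntax.

```stata
* Sketch only: all file and variable names here are hypothetical.

* 1) keep just the key from each dataset and retry the merge
use investors, clear
keep investor
save b_keys, replace

use deals, clear
keep investor
drop if investor == ""                 // deals with no known investor
duplicates drop                        // one row per investor name in A
merge 1:1 investor using b_keys        // Stata 11 syntax
tabulate _merge

* Sorting puts near-identical names next to each other, which makes
* stray characters or spelling variants easier to spot by eye
sort investor
list investor _merge if _merge != 3

* 2) the same comparison with -joinby-
use deals, clear
keep investor
joinby investor using b_keys, unmatched(both)
tabulate _merge
```

If the stripped-down merge produces matches while the full one does not, the problem is more likely in the original -merge- command or its options than in the name strings themselves.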
Eric A. Booth
Public Policy Research Institute
Texas A&M University
From: firstname.lastname@example.org on behalf of Stefano Bonini
Sent: Thu 4/8/2010 9:50 AM
Subject: st: problem with merge
I have an inextricable issue with merge:
I have two datasets: "A" with a list of deals for which I know the investor (investors are duplicated because one investor can easily invest in more than one deal) and B with the full population of investors (hence unique obs) and their characteristics, such as location, assets, etc. Investors are stored by their names therefore those are and must be string variables. I also have missing obs in A (my master) because for some deals I don't know who invested.
I am trying to merge the two sets to add the investor's characteristics (contained in B) to each deal. I confidently tried to merge them but the procedure doesn't match. Visual inspection confirms that there should be matches (I mean, _merge=3) but the command returns just _merge=1 or 2.
I checked the formats, blanks and everything I could but I just can't make it work.
Any clue on what may cause this weird behavior?
Visiting Associate Professor
Department of Finance
New York University
Stern School of Business
44W 4th St. New York, NY
Ph. +1 212 998 0305