
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: nearmrg for strings (titles)


From   Daniel Feenberg <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: nearmrg for strings (titles)
Date   Tue, 30 Aug 2011 08:13:11 -0400 (EDT)



On Tue, 30 Aug 2011, Hoecher, Michaela (0613xxx) wrote:

Hello!

I would like to merge two datasets (variables: title, date, publisher).
The problem is that strings (book titles) that are not exactly the same should still be merged/matched.
- Does it make sense to use nearmrg for this?
- In which way are strings merged/matched?
- What would you recommend?

Some time ago I wrote a program to help a clerical worker do this rapidly. The program finds up to 5 likely matches and lets the operator select the best match. I used it once to go through a few thousand journal article matches, but it hasn't been used since. There is documentation at:

  http://www.nber.org/imatch

and I would be interested in having a few more users. It is interactive, but it isn't a GUI program - it runs from the command line and the operator makes selections with the keyboard.
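
To illustrate the general idea of proposing a handful of candidate matches for an operator to choose from, here is a minimal Python sketch. It is not the imatch program and makes no claim about how imatch scores matches; it assumes the standard-library difflib similarity ratio is an acceptable stand-in, and the function name top_candidates, the cutoff of 0.6, and the sample titles are all hypothetical.

```python
import difflib


def top_candidates(title, catalog, n=5, cutoff=0.6):
    """Return up to n titles from catalog that look most like title.

    Assumption: similarity is difflib's ratio on lowercased, stripped
    strings. This is an illustrative choice, not imatch's method.
    """
    # Map each normalized title back to its original spelling.
    normalized = {t.lower().strip(): t for t in catalog}
    hits = difflib.get_close_matches(
        title.lower().strip(), list(normalized), n=n, cutoff=cutoff
    )
    # get_close_matches returns the best match first.
    return [normalized[h] for h in hits]


# Hypothetical sample data for illustration only.
catalog = [
    "The Theory of Economic Growth",
    "Theory of Economic Development",
    "A Monetary History of the United States",
]
candidates = top_candidates("Theory of economic growth", catalog)

# An operator would then pick one of the numbered candidates by keyboard.
for i, c in enumerate(candidates, start=1):
    print(i, c)
```

In practice the operator's selection, rather than the top-ranked candidate, would decide which record pair is merged; titles with no candidate above the cutoff would be left unmatched for manual review.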

Note that most commercial code to do matching is oriented towards
address matching, and won't be particularly adept at author/title
matching.

Dan Feenberg


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

