



Re: st: fuzzy merge problem


From   "Dimitriy V. Masterov" <dvmaster@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: fuzzy merge problem
Date   Wed, 22 Sep 2010 11:53:07 -0400

On Wed, Sep 22, 2010 at 9:56 AM, Anders Alexandersson
<andersalex@gmail.com> wrote:
> For the user-written command -reclink-, it seems that the id variable
> must not be in the varlist.
> For your example, I would create an id variable in both datasets, for
> example, -gen id = _n-, and then run
> . reclink county using ".\ihs_counties.dta", idmaster(id) idusing(id) gen(match)
<snip>

I tried this and it worked like a charm!

There was one issue with counties that have names like "HILLSBOROUGH
(M SPLI, NH": the unmatched parenthesis gave the bigram part of
-reclink- some trouble. Inserting the missing closing parenthesis
fixed the problem.
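
For anyone hitting the same thing, here is a rough sketch of how the unbalanced names could be found and patched before running -reclink-. It assumes the string variable is called county and that the missing ")" belongs just before the first comma (as in "HILLSBOROUGH (M SPLI), NH"). Both are assumptions for illustration, not necessarily what was done here. The helper variables n_open and n_close are hypothetical names.

```stata
* Count opening and closing parentheses in each county name
* (assumes the string variable is named county)
gen n_open  = length(county) - length(subinstr(county, "(", "", .))
gen n_close = length(county) - length(subinstr(county, ")", "", .))

* Inspect the names whose parentheses do not balance
list county if n_open != n_close

* Assumption: exactly one ")" is missing and it belongs before the first comma
replace county = subinstr(county, ",", "),", 1) if n_open == n_close + 1

drop n_open n_close
```

It is worth checking the -list- output by hand before the -replace-, since names with a different pattern would need a different fix.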

Scott's solution seems to work very well for a small number of
counties. I was not able to implement it because of the limit on
the length of a local macro.

Many thanks to Scott and Anders for saving me many hours of nasty coding.