
Re: st: Question about match merge

From   "Sergiy Radyakin" <[email protected]>
To   [email protected]
Subject   Re: st: Question about match merge
Date   Wed, 5 Sep 2007 17:39:17 +0200

Just a word of caution for data users: many licences for commonly
used datasets explicitly forbid any attempt to identify particular
individuals, to match individuals across datasets, or to analyze such
matched data, so before you proceed, make sure that this will not land
you in legal trouble.

And after you are absolutely sure, you may want to have a look at this
(it is not a Stata module, but it might do the trick)

Best regards,
    Sergiy Radyakin

On 9/2/07, Scott Talkington <[email protected]> wrote:
> I seem to recall that there's an algorithm that is able to crosswalk
> databases by matching names combined with other secondary keys, such as
> zip code, and that the algorithm will produce a "probability of match"
> for the given ID.  I used to conduct match merges based on name and zip
> in an earlier version of Stata, but it was quite cumbersome to deal with
> misspellings, typos (common transpositions of letters or numbers, etc.),
> all caps vs lower case, prefixes and suffixes, titles, middle initial
> versus middle name, etc.  What I'd like to know is whether a more
> sophisticated match/merge based on primary and secondary keys or IDs has
> been developed, and if so, whether there is some documentation on how it
> works.  Also, would it account for very common names, such as "David
> Jones", versus less common names, like "Horace Vilochkek", or for the
> size of the database, and adjust the probability of match accordingly?
> Or is all of this just some pipe dream I happened to think up when I was
> under the influence?
> I'll also try to scrounge up something on the FAQ database, but most of
> my text documentation on Stata 9.2 is stored in boxes since I'm in the
> midst of a move, and I need at least some idea of the capability of such
> a match/merge within the week.
> Scott Talkington, PhD
> [email protected]
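
To make the idea in the question concrete, here is a minimal sketch in Python (not Stata, and not the tool referred to above) of scoring a candidate match on name plus zip code. Everything in it is an assumption made for illustration: the function names, the list of titles, and the 0.3 weight on the zip code are hypothetical, and the score is a plain string-similarity measure from the standard library, not a calibrated probability of match. It does not adjust for how common a name is or for the size of the database.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of titles and suffixes to ignore; extend as needed.
TITLES = {"mr", "mrs", "ms", "dr", "prof", "jr", "sr", "phd", "md"}


def normalize_name(name):
    """Lowercase, strip punctuation, drop titles/suffixes and middle names."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    tokens = [t for t in tokens if t not in TITLES]
    # Keep only the first and last token, so "Scott A. Talkington" and
    # "Scott Talkington" normalize to the same string.
    if len(tokens) > 2:
        tokens = [tokens[0], tokens[-1]]
    return " ".join(tokens)


def match_score(name_a, zip_a, name_b, zip_b, zip_weight=0.3):
    """Return a score in [0, 1]; higher means a more plausible match.

    The weighting is an arbitrary illustrative choice, not an estimate.
    """
    name_sim = SequenceMatcher(
        None, normalize_name(name_a), normalize_name(name_b)
    ).ratio()
    zip_sim = 1.0 if zip_a == zip_b else 0.0
    return (1 - zip_weight) * name_sim + zip_weight * zip_sim


if __name__ == "__main__":
    pairs = [
        ("Dr. SCOTT A. TALKINGTON", "20001", "Scott Talkington, PhD", "20001"),
        ("Scott Talkignton", "20001", "Scott Talkington", "20001"),
        ("David Jones", "20001", "Scott Talkington", "20001"),
    ]
    for a, za, b, zb in pairs:
        print(f"{a!r} vs {b!r}: {match_score(a, za, b, zb):.2f}")
```

A score like this only ranks candidate pairs; deciding on a cutoff, and weighting rare names more heavily than common ones, would need frequency information from the actual data.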
