The Stata listserver

Re: st: RE: probabilistic record linkage

From   Stas Kolenikov <>
Subject   Re: st: RE: probabilistic record linkage
Date   Tue, 21 Sep 2004 11:02:38 -0400

This is a hot area of survey research. It comes up when you get data
from different sources that are supposed to be on the same individuals
but, due to privacy concerns, you don't have any individual identifiers.
Moreover, identifying information such as the date of birth may have
had noise of, say, 5 or 10 days added to it. So instead of doing a plain
-merge-, you have to guess who's who. Of course, you
cannot do that at the individual level for 10000 observations, so all
that probabilistic linkage stuff is a way to merge the data sets
stochastically. I don't know that much about it, just heard a couple of

I doubt anybody has implemented this in Stata. I am sure it can be
done, as the models are not that difficult; it just depends on how
much time and programming resource are available.
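
To give a rough idea of what "Bayesian posterior odds" for a candidate
pair of records could look like, here is a minimal sketch in Python (not
Stata, and not necessarily the method in Clark 2004). It assumes the
compared fields are conditionally independent, treats two dates of birth
as agreeing when they are within 10 days of each other, and uses made-up
agreement probabilities. All names and numbers below (FIELDS,
DOB_TOLERANCE_DAYS, the example records, the prior odds) are
hypothetical and only there for illustration.

```python
from datetime import date

# Hypothetical agreement probabilities per field:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
# These values are illustrative assumptions, not estimates from data.
FIELDS = {
    "sex": (0.98, 0.50),
    "postcode": (0.90, 0.02),
    "dob": (0.95, 0.01),
}

# Assumed amount of noise in the date of birth, in days.
DOB_TOLERANCE_DAYS = 10


def agrees(field, a, b):
    """Return True if two field values count as agreeing."""
    if field == "dob":
        # Dates agree if they differ by no more than the tolerance.
        return abs((a - b).days) <= DOB_TOLERANCE_DAYS
    return a == b


def posterior_odds(rec_a, rec_b, prior_odds):
    """Posterior odds that two records refer to the same person.

    Multiplies the prior odds by one likelihood ratio per field:
    m/u when the field agrees, (1-m)/(1-u) when it disagrees.
    """
    odds = prior_odds
    for field, (m, u) in FIELDS.items():
        if agrees(field, rec_a[field], rec_b[field]):
            odds *= m / u
        else:
            odds *= (1 - m) / (1 - u)
    return odds


def posterior_probability(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)


# Made-up example records.
a = {"sex": "F", "postcode": "3012", "dob": date(1960, 5, 14)}
b = {"sex": "F", "postcode": "3012", "dob": date(1960, 5, 20)}
c = {"sex": "F", "postcode": "8001", "dob": date(1971, 1, 2)}

# Assumed prior: about 1 in 10000 candidate pairs is a true match.
prior = 1 / 10000

for name, other in (("b", b), ("c", c)):
    odds = posterior_odds(a, other, prior)
    print(name, odds, posterior_probability(odds))
```

In practice the m and u probabilities would have to be estimated from
the data rather than assumed, and one would pick thresholds on the odds
to declare a pair a link, a non-link, or a case for clerical review.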

On Tue, 21 Sep 2004 13:30:37 +0100, Nick Cox <> wrote:
> D.E. Clark 2004. Perhaps you could add further details
> for those interested.
> Nick
> Adrian Spörri-Fahrni
> > I'm involved in a project where we link different huge
> > (health) data sets.
> > I'm interested in programming this linkage process in Stata,
> > calculating Bayesian posterior odds (see Clark, D.E., 2004, as an
> > example). Does anyone here have experience with probabilistic
> > record linkage in Stata?
> > I know about the Australian Febrl project and about linkpro
> > in SAS, but I'd prefer having a Stata solution.
> > Thanks for any references or remarks

Stas Kolenikov
