Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Reclink: high matching score, but no match
From: Devra Golbe <[email protected]>
To: [email protected]
Subject: Re: st: Reclink: high matching score, but no match
Date: Fri, 24 Jan 2014 17:45:45 -0500
Dorothy and all,
My apologies for not closing this thread properly. Michael solved my
problem in private correspondence, and I failed to report back to the list.
The problem was that, over successive runs of my do-file, I had managed to
save the idusing variable to the master dataset. Once I dropped it from
the master file, reclink ran fine.
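
A minimal sketch of that fix, assuming the variable and file names from the original post further down the thread (student_name as the idusing variable, roster100f11Sep7.dta as the using file) and that the master data are already in memory. It only removes the stale idusing variable; it does not address any other variables a previous run may have left behind.

```stata
* Sketch only: names are taken from the original post in this thread.
* Assumes the master dataset is already in memory.

* Drop the idusing variable if an earlier run saved it into the master.
* -capture- suppresses the error when the variable does not exist.
capture drop student_name

* Rerun the match with the same options as in the original post.
reclink lname fname using roster100f11Sep7.dta, ///
    idmaster(idmaster) idusing(student_name) gen(link)
```

Putting the capture drop line immediately before reclink keeps the do-file safe to rerun, whether or not the master file was saved after an earlier match.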
Devra
On 1/24/2014 1:43 PM Dorothy Bridges wrote:
Hello everyone (especially Devra and Michael): Was this ever resolved?
I'm having the exact same problem. Code and (partial) output copied
below. I usually use reclink without any problems.
reclink entidad municipio using "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", ///
    idu(idu) idm(idm) gen(match) required(entidad)
listing the output:
entidad        municipio                    Umunicipio            match
ZULIA          VALMORE RODRIGUEZ            SAN FRANCISCO         0.9961
NUEVA ESPARTA  ANTOLIN DEL CAMPO            SOTILLO               0.9961
YARACUY        JOSE ANTONIO PAEZ            BRUZUAL               0.9961
ZULIA          LA CANADA DE URDANETA        FRANCISCO J PULG      0.9974
ARAGUA         MARIO BRICENO IRAGORRY       M.OCUMARE D LA COSTA  0.9977
ZULIA          ROSARIO DE PERIJA            MARA                  0.9982
ARAGUA         FRANCISCO LINARES ALCANTARA  FRANCISCO LINARES A.  0.9992
BARINAS        ALBERTO ARVELO TORREALBA     ZAMORA                0.9995
SUCRE          RIBERO                       MEJIA                 1.0000
CARABOBO       BEJUMA                       SIFONTES              1.0000
MIRANDA        URDANETA                     RIVAS DAVILA          1.0000
YARACUY        MANUEL MONGE                 INDEPENDENCIA         1.0000
DELTA AMACURO  CASACOIMA                    TINACO                1.0000
SUCRE          SUCRE                        MONTES                1.0000
NUEVA ESPARTA  GOMEZ                        GARCIA                1.0000
ARAGUA         JOSE ANGEL LAMAS             JOSE ANGEL LAMAS      1.0000
BARINAS        CRUZ PAREDES                 BARINAS               1.0000
MIRANDA        SUCRE                        RANGEL                1.0000
MIRANDA        SIMON BOLIVAR                PUEBLO LLANO          1.0000
ZULIA          PAEZ                         MACHIQUES DE P        1.0000
On Wed, Dec 28, 2011 at 10:04 AM, Devra Golbe <[email protected]> wrote:
Michael,
student_name is non-numeric. After some additional data cleaning, which
reduced the set that needed a fuzzy match, reclink succeeded with
student_name as the idusing variable, so my original problem is solved.
But working with a smaller data set, I have an example where the non-numeric
identifier and a numeric identifier fail, but a different numeric identifier
succeeds. I'll send those data and the do-file to you off-list.
Thanks and happy new year.
Devra
On 12/28/2011 11:49 AM Michael Blasnik wrote:
It looks like this is a bug. Is student_name numeric? If not, you
may want to try encoding it and running reclink again. If that isn't
the problem, it might be best if you send me either the data or a trace
log off-list so I can try to figure it out, but I may not get a chance
to look at it until after the holidays.
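
A rough sketch of the encoding suggestion, assuming the using file from the original post. The names student_id and roster_encoded.dta are hypothetical and chosen only for illustration. Because encode assigns the same code to identical strings, the sketch also checks that the new variable uniquely identifies records before it is used as an identifier.

```stata
* Sketch only: student_id and roster_encoded.dta are hypothetical names.
use roster100f11Sep7.dta, clear

* Build a numeric, value-labelled version of the string identifier.
encode student_name, generate(student_id)

* Stop with an error if student_id does not uniquely identify observations
* (duplicate names would receive the same code).
isid student_id

* Save under a new name so the original roster file is left untouched.
save roster_encoded.dta, replace
```

The reclink call would then point at roster_encoded.dta and pass idusing(student_id) in place of idusing(student_name).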
Michael
On Sat, Dec 24, 2011 at 4:10 PM, Devra Golbe <[email protected]> wrote:
I am using Michael Blasnik's reclink (from SSC) to match records. I get
extremely high matching scores, and yet the records do not match. Can
anyone help? My code and relevant output are pasted below.
Thanks and happy holidays,
Devra
******
. sort lname fname
. gen idmaster=_n
. tempfile ps1a
. save `ps1a', replace
. clear
. use roster100f11Sep7.dta
. sort lname fname
. save, replace
. clear
. use `ps1a'
. reclink lname fname using roster100f11Sep7.dta, ///
      idmaster(idmaster) idusing(student_name) gen(link)
0 perfect matches found
Added: student_name= identifier from roster100f11Sep7.dta  link = matching score
Observations: Master N = 26  roster100f11Sep7.dta N= 182
  Unique Master Cases: matched = 0 (exact = 0), unmatched = 26
. list link _merge in 1/5, clean
         link   _merge
  1.   0.9933        1
  2.   0.9933        1
  3.        .        1
  4.   0.6420        1
  5.   0.9988        1
_______
Devra Golbe
Professor of Economics
Hunter College, CUNY
NY, NY
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/