Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Dorothy Bridges <dbstata@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: Reclink: high matching score, but no match |
Date | Fri, 24 Jan 2014 10:43:51 -0800 |
Hello everyone (especially Devra and Michael):

Was this ever resolved? I'm having the exact same problem. Code and (partial) output copied below. I usually use reclink without any problems.

    reclink entidad municipio using "ESDATA/dta/MunicipioLevel/ESDATAMun_Jan2014.dta", ///
        idu(idu) idm(idm) gen(match) required(entidad)

listing the output:

    entidad        municipio                    Umunicipio            match
    ZULIA          VALMORE RODRIGUEZ            SAN FRANCISCO         0.9961
    NUEVA ESPARTA  ANTOLIN DEL CAMPO            SOTILLO               0.9961
    YARACUY        JOSE ANTONIO PAEZ            BRUZUAL               0.9961
    ZULIA          LA CANADA DE URDANETA        FRANCISCO J PULG      0.9974
    ARAGUA         MARIO BRICENO IRAGORRY       M.OCUMARE D LA COSTA  0.9977
    ZULIA          ROSARIO DE PERIJA            MARA                  0.9982
    ARAGUA         FRANCISCO LINARES ALCANTARA  FRANCISCO LINARES A.  0.9992
    BARINAS        ALBERTO ARVELO TORREALBA     ZAMORA                0.9995
    SUCRE          RIBERO                       MEJIA                 1.0000
    CARABOBO       BEJUMA                       SIFONTES              1.0000
    MIRANDA        URDANETA                     RIVAS DAVILA          1.0000
    YARACUY        MANUEL MONGE                 INDEPENDENCIA         1.0000
    DELTA AMACURO  CASACOIMA                    TINACO                1.0000
    SUCRE          SUCRE                        MONTES                1.0000
    NUEVA ESPARTA  GOMEZ                        GARCIA                1.0000
    ARAGUA         JOSE ANGEL LAMAS             JOSE ANGEL LAMAS      1.0000
    BARINAS        CRUZ PAREDES                 BARINAS               1.0000
    MIRANDA        SUCRE                        RANGEL                1.0000
    MIRANDA        SIMON BOLIVAR                PUEBLO LLANO          1.0000
    ZULIA          PAEZ                         MACHIQUES DE P        1.0000

On Wed, Dec 28, 2011 at 10:04 AM, Devra Golbe <dgolbe@gmail.com> wrote:
> Michael,
>
> student_name is non-numeric. After some additional data cleaning and the
> resulting reduction of the set that needed a fuzzy match reclink succeeded
> with student_name as the idusing variable, so my original problem is solved.
>
> But working with a smaller data set, I have an example where the non-numeric
> identifier and a numeric identifier fail, but a different numeric identifier
> succeeds. I'll send those data and the do-file to you off-list.
>
> Thanks and happy new year.
>
> Devra
>
>
> On 12/28/2011 11:49 AM Michael Blasnik wrote:
>>
>> It looks like this is a bug -- is student_name numeric? If not, you
>> may want to try encoding it and trying again. If that isn't the
>> problem, it might be best if you either send me the data or a trace
>> log off-list to see if i can figure it out, but I may not get a chance
>> to figure it out until after the holidays.
>>
>> Michael
>>
>> On Sat, Dec 24, 2011 at 4:10 PM, Devra Golbe <dgolbe@gmail.com> wrote:
>>>
>>> I am using Michael Blasnik's reclink (from SSC) to match records. I get
>>> extremely high matching scores, and yet the records do not match. Can
>>> anyone help? My code and relevant output are pasted below.
>>>
>>> Thanks and happy holidays,
>>> Devra
>>>
>>> ******
>>> . sort lname fname
>>> . gen idmaster=_n
>>> . tempfile ps1a
>>> . save `ps1a', replace
>>> . clear
>>> . use roster100f11Sep7.dta
>>> . sort lname fname
>>> . save, replace
>>> . clear
>>> . use `ps1a'
>>>
>>> . reclink lname fname using roster100f11Sep7.dta, ///
>>>       idmaster(idmaster) idusing(student_name) gen(link)
>>>
>>> 0 perfect matches found
>>>
>>>
>>> Added: student_name= identifier from roster100f11Sep7.dta  link = matching score
>>> Observations: Master N = 26  roster100f11Sep7.dta N= 182
>>> Unique Master Cases: matched = 0 (exact = 0), unmatched = 26
>>>
>>> . list link _merge in 1/5, clean
>>>
>>>          link   _merge
>>>   1.   0.9933        1
>>>   2.   0.9933        1
>>>   3.        .        1
>>>   4.   0.6420        1
>>>   5.   0.9988        1
>>>
>>> _______
>>> Devra Golbe
>>> Professor of Economics
>>> Hunter College, CUNY
>>> NY, NY

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/