Re: st: matching observations for merging


From   Abhimanyu Arora <abhimanyu.arora1987@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: matching observations for merging
Date   Thu, 17 Jun 2010 18:03:46 +0200

That's a good idea, Maarten, thanks very much indeed.
Abhimanyu

On Thu, Jun 17, 2010 at 5:56 PM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Thu, 17/6/10, Abhimanyu Arora wrote:
>> I have two files to be merged. Is it possible to merge using
>> an approximation of the merging variable? In other words, if
>> my merging variable is say, country, there could be a slight change in
>> spelling of some countries (Afghanistan/ Afganistan) in the two
>> files...Is there a more efficient way than just going through all 200+
>> countries and checking spelling consistency?
>
> For countries the quickest way is to
> 1) keep in each dataset one observation per country
> 2) merge the 2 datasets
> 3) keep if _merge != 3
> 4) sort on country name
> 5) list
>
> This will display a list of troublesome country names, which is
> usually so short that it doesn't pay to do anything more fancy.
>
> With this list you can create a recode .do file which harmonizes
> country names before the final merge.
>
> Moreover, this harmonization do-file can be a good starting point
> for any subsequent project involving a merge on country names, as the
> kinds of inconsistencies in country names are pretty similar across
> files. So at the beginning of each project you start by running the
> harmonization do-file of the last project, then go through steps 1-5
> to find any mismatches that weren't handled in the last do-file, and
> add those to your new harmonization file. After 4 or 5 projects you
> will hardly find any mismatches anymore.
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
> http://www.maartenbuis.nl
> --------------------------
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
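
The five steps quoted above, plus one line of the harmonization do-file, might be sketched in Stata roughly as follows. This is only a sketch under assumptions that are not in the thread: the file names file1.dta and file2.dta are placeholders, the key variable is assumed to be a string variable called country in both files, and the merge 1:1 syntax assumes Stata 11 or later. Because it uses a tempfile, it should be run as one do-file rather than line by line.

```stata
* Placeholder names: file1.dta, file2.dta and the string variable country

* 1) keep one observation per country in each dataset
use file1, clear
keep country
duplicates drop
tempfile names1
save `names1'

use file2, clear
keep country
duplicates drop

* 2) merge the two lists of country names
merge 1:1 country using `names1'

* 3) keep only the names that did not find a match
keep if _merge != 3

* 4) and 5) sort on country name and list the troublesome names
sort country
list country _merge, noobs

* A line for the harmonization do-file, using the spelling
* example from the original question; run it on the original
* data before the final merge
replace country = "Afghanistan" if country == "Afganistan"
```

In the listing, _merge shows which file each unmatched spelling came from, so near-duplicates such as Afganistan and Afghanistan end up next to each other after the sort.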


