Statalist



Re: st: problems with merging datasets


From   "Murali Kuchibhotla" <muralik@iastate.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: problems with merging datasets
Date   Mon, 14 Jul 2008 23:17:04 -0500 (CDT)

Hi Joseph,
          You guessed correctly: the problem was caused by the existence of
space characters in one city dataset and not the other.
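
For anyone who hits the same thing, a check along the following lines would flag padded keys before merging. This is only a minimal sketch: the two-observation toy data are made up for illustration, and only the variable name city comes from this thread.

```stata
* Toy data (hypothetical): one city value padded with trailing spaces
clear
set obs 2
generate str10 city = "Ames"
replace city = "Boone" in 2
replace city = city + "   " in 1

* Count key values that differ from their trimmed form
count if city != trim(city)

* Show the offenders in brackets so the padding is visible
generate str12 shown = "[" + city + "]"
list shown if city != trim(city)
drop shown

* Strip the padding, confirm the key is clean, then sort for the merge
replace city = trim(city)
assert city == trim(city)
sort city
```

Running the same check on both datasets before the merge should show which one carries the stray spaces.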

Murali

> Murali Kuchibhotla wrote:
> 
>       I managed to fix the problem: it turns out that even though the city
> names are the same, the string dimensions of the two variables are different,
> which caused the merge to fail. Thanks.
> 
> --------------------------------------------------------------------------------
> 
> Could you clarify what you mean by different string dimensions?
> 
> A difference in dimension (e.g., type = str20 in one and str2 in the other)
> won't affect Stata's ability to merge on string variables--Stata will
> automatically adjust to the longer string length during the merge (see
> below).
> 
> Do you mean that there were unsuspected invisible characters (e.g., padding
> with space characters) in the city data in one dataset and not in the other?
> 
> Joseph Coveney
> 
> . clear *
> 
> . set more off
> 
> . tempfile tmpfil0
> 
> . quietly set obs 5
> 
> . generate str20 city = char(65 + _n) + char(66 + _n)
> 
> . sort city
> 
> . quietly save `tmpfil0'
> 
> . compress
> city was str20 now str2
> 
> . merge city using `tmpfil0'
> city was str2 now str20
> 
> . tabulate _merge
> 
>      _merge |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           3 |          5      100.00      100.00
> ------------+-----------------------------------
>       Total |          5      100.00
> 
> . drop _merge
> 
> . quietly replace city = city + "       "
> 
> . sort city
> 
> . merge city using `tmpfil0'
> 
> . tabulate _merge
> 
>      _merge |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           1 |          5       50.00       50.00
>           2 |          5       50.00      100.00
> ------------+-----------------------------------
>       Total |         10      100.00
> 
> . quietly keep if _merge == 1
> 
> . drop _merge
> 
> . quietly replace city = trim(city)
> 
> . sort city
> 
> . merge city using `tmpfil0'
> 
> . assert _merge == 3
> 
> . erase `tmpfil0'
> 
> . exit
> 
> end of do-file
> 
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


Murali Kuchibhotla
Department of Economics
Iowa State University
Office:75,Heady
Phone:515-294-5452




