Statalist



Re: st: problems with merging datasets


From   "Joseph Coveney" <jcoveney@bigplanet.com>
To   "Statalist" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: problems with merging datasets
Date   Sun, 13 Jul 2008 12:02:00 +0900

Murali Kuchibhotla wrote:

     I managed to fix the problem. It turns out that even though the city
names are the same, the string dimensions of the 2 variables are different,
which caused the merge to fail. Thanks

--------------------------------------------------------------------------------

Could you clarify what you mean by different string dimensions?

A difference in dimension (e.g., type = str20 in one and str2 in the other)
won't affect Stata's ability to merge on string variables--Stata will
automatically adjust to the longer string length during the merge (see
below).

Do you mean that there were unsuspected invisible characters (e.g., padding
with space characters) in the city data in one dataset and not in the other?
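
One way to check for such padding is sketched below. It is only an illustration: the three city values are made up, and the variable name citylen is hypothetical. The idea is that trim() removes leading and trailing spaces, so any value that differs from its trimmed version carries padding, and length() counts the padding that -list- does not make visible. The check covers space characters only.

```stata
clear
* Made-up example data: the second and third values carry stray spaces.
input str10 city
"Boston"
"Boston "
" Austin"
end

* Count the observations whose value changes when trimmed.
count if city != trim(city)

* Lengths include the padding, so they expose spaces that -list- hides.
generate byte citylen = length(city)
list city citylen if city != trim(city)
```

Running the same check on the city variable in each dataset before merging would show whether padding is present in one and not in the other.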

Joseph Coveney

. clear *

. set more off

. tempfile tmpfil0

. quietly set obs 5

. generate str20 city = char(65 + _n) + char(66 + _n)

. sort city

. quietly save `tmpfil0'

. compress
city was str20 now str2

. merge city using `tmpfil0'
city was str2 now str20

. tabulate _merge

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |          5      100.00      100.00
------------+-----------------------------------
      Total |          5      100.00

. drop _merge

. quietly replace city = city + "       "

. sort city

. merge city using `tmpfil0'

. tabulate _merge

     _merge |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          5       50.00       50.00
          2 |          5       50.00      100.00
------------+-----------------------------------
      Total |         10      100.00

. quietly keep if _merge == 1

. drop _merge

. quietly replace city = trim(city)

. sort city

. merge city using `tmpfil0'

. assert _merge == 3

. erase `tmpfil0'

. exit

end of do-file
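
The log above uses the -merge- syntax that was current when this was posted. For readers on Stata 11 or later, the same trim-then-merge step might look like the sketch below, which uses the newer merge 1:1 form. The data and the tempfile name using_clean are hypothetical, and the sketch assumes city uniquely identifies observations in both datasets.

```stata
clear
tempfile using_clean

* Hypothetical using dataset: keys without padding.
input str10 city
"Austin"
"Boston"
end
save `using_clean'

* Hypothetical master dataset: same cities with stray spaces.
clear
input str10 city
"Austin  "
" Boston"
end

* Strip leading and trailing spaces from the key before merging.
replace city = trim(city)

* The newer syntax does not require either dataset to be sorted first.
merge 1:1 city using `using_clean'
assert _merge == 3
```

If the assert fails, some keys still differ, and tabulating _merge as in the log above shows how many observations came from each side only.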





