Statalist The Stata Listserver



AW: st: "Wrong" result with encode / merge ?


From   "Thomas Erdmann" <tom.erdmann@web.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: "Wrong" result with encode / merge ?
Date   Thu, 23 Nov 2006 22:34:48 +0100

Thanks to you both for your feedback.
It seems my first answer did not reach the list.
Obviously I overestimated what -encode- does.

- Tom


 
 
-----Original Message-----
From: statalist-owner@hsphsun2.harvard.edu
[mailto:statalist-owner@hsphsun2.harvard.edu] On Behalf Of Austin Nichols
Sent: Thursday, 23 November 2006 14:04
To: statalist@hsphsun2.harvard.edu; tom.erdmann@stud.unibas.ch
Subject: Re: st: "Wrong" result with encode / merge ?

Thomas Erdmann--
You should merge on id7temp, not id7. When you -encode-, the distinct
values of the string id7temp are mapped to integer codes in alphabetical
order within each dataset, so the same string can receive a different
code in each of your two datasets (try -label list-, or -la li- for
short, to see the assignment in each dataset).  If you want to use a
numeric id, you can generate one using a one-to-one mapping, using -gen-
and a loop over all possible characters, but it is more straightforward
to merge on the string var.
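
To illustrate the point above, here is a minimal sketch with made-up ids (none of these values come from your data). It shows the same string receiving different codes in two datasets, and then merges on the string instead. The -merge 1:1- syntax assumes a newer Stata release than the one current in 2006; in older versions the equivalent is to -sort id7temp- in both files and use -merge id7temp using ...-.

```stata
* Hypothetical example data: the ids below are invented for illustration.
clear
input str7 id7temp
"AT18679"
"CH20001"
end
encode id7temp, gen(id7)
label list id7              // here CH20001 is coded 2
drop id7                    // discard the dataset-specific codes
tempfile a
save `a'

clear
input str7 id7temp
"AT18679"
"BE10005"
"CH20001"
end
encode id7temp, gen(id7)
label list id7              // here BE10005 is coded 2 and CH20001 is coded 3
drop id7

* Merge on the string itself, which means the same thing in both files.
merge 1:1 id7temp using `a'
list id7temp _merge
```

Merging these two files on id7 would have paired CH20001 with BE10005, since both carry code 2; merging on id7temp matches AT18679 and CH20001 and leaves BE10005 unmatched.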

On 11/23/06, Thomas Erdmann <tom.erdmann@stud.unibas.ch> wrote:
> Hi,
>
> I have a dataset with ids that look like: AT18679U (two letters followed
> by 5 digits, optionally followed by another letter)
>
> Between the two datasets I would like to merge, only the first 7
> characters are equal, therefore I generated
>
> generate id7temp=substr(id,1,7)
> encode id7temp, gen(id7)
> sort id7
>
> and merged the two datasets by id7. When I quality checked the results
> there were several mismatches, which don't seem to happen if I use the
> string id and not the encoded one. Why is that?
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




