Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: one-to-many mapping


From   Jason Hwang <jjhwang@fas.harvard.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: one-to-many mapping
Date   Thu, 4 Nov 2004 16:59:58 -0500 (EST)

Hello,

I can't find help on this problem on Statalist. I would greatly appreciate
your help.

I have two datasets which I'm trying to merge. Problem is that the mapping
is not one-to-one. Consider the following example:

code1	code2

1111 	2290
1111	2291
1111	2292
1111 	2292
1111	2290

5555	6700
5555	6701
5555	6700

As you can see sometime a single "code1" maps to many "code2"s. I need a
one-to-one mapping to merge the data and I'm trying to do this two ways.

1. Take a unweighted average of the "code2"s for each "code1." That is
let's say for 1111, I just want an average of a variable for 2290, 2291
and 2992 in the second dataset.

But how do I do this when I have hundreds of unique "code1"s, and each
corresponds to a different number of "code2"s? (I can imagine the code if
each code1 all mapped to the same number of code2s but this is not the
case.)

2. Use the code2 with maximum number of frequency. That is, 5555 will map
to 6700 since 6700 occurs twice while 6701 occurs once. For 1111, both
2290 and 2292 occur twice and in this case, I'll just take the one that
appears first, 2290.

I'm not sure how to write the code for this procedure. I would greatly
appreciate any help. Thank you.

Jason
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index