To elaborate: I think you need to use Nick Cox's approach (quoted below) to find the most common value of code2 for each code1 in dataset2, then -keep- only those lines (one per code1), save that as dataset3, and then -merge- dataset3 into dataset1.
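
Roughly, the steps would look like the sketch below. This is untested and rests on some assumptions: the file names dataset1.dta and dataset2.dta are made up, the -merge 1:1- syntax needs Stata 11 or later (older versions need both files sorted on code1 and -merge code1 using dataset3-), and where the lines for the modal code2 differ in pchange it simply keeps whichever comes first.

```stata
* Sketch only: the file names dataset1.dta and dataset2.dta are assumed
use dataset2, clear

* Nick Cox's two lines: count is minus the group size, so the most
* frequent code2 sorts first within code1; ties go to the lowest code2
bysort code1 code2 : gen count = -_N
bysort code1 (count code2) : gen mode = code2[1]

* keep a single line per code1, the one for the modal code2
keep if code2 == mode
bysort code1 : keep if _n == 1
drop count mode
save dataset3, replace

* dataset1 is the master; dataset3 now has one line per code1
use dataset1, clear
merge 1:1 code1 using dataset3
```

With the example data in the original question this should leave code2 3431 for code1 1111 and code2 9903 for code1 5555, plus the _merge variable that -merge- creates.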

--Nick Winter

At 06:42 PM 11/5/2004 +0000, you wrote:

--------------------------------------------------------

Your most common values can be obtained by

    bysort code1 code2 : gen count = - _N
    [!!! NB - ]
    bysort code1 (count code2) : gen mode = code2[1]

Nick
n.j.cox@durham.ac.uk

Jason Hwang

> I didn't describe very well last time what I wanted to do.  Let me try
> again.
>
> I have two datasets I'm trying to merge of the following form.
>
> dataset1:
>
> code1   output
> 1111    100
> 5555    340
>
> dataset2:
>
> code2   pchange   code1
> 3431     .5       1111
> 3431     .5       1111
> 3450    -.5       1111
> 3451     .7       1111
> 9903     .4       5555
> 9945     .1       5555
> 9903     .4       5555
> 9905    -.6       5555
> 9945     .1       5555
>
> I'm trying to use dataset1 as the original (master) and merge into it
> dataset2.  Problem: each code1 maps to many code2s.  So here's what I
> would like to do: for each code1, find a code2 which corresponds to it
> with the greatest frequency.  So for code1, 1111, I want 3431.  For
> 5555, both 9903 and 9945 occur twice.  In this case, I'll just take
> whichever shows up first in the sorted list; i.e. 9903.
>
> The final output I'm looking for would be:
>
> code1   code2   output   pchange
> 1111    3431    100       .5
> 5555    9903    340       .4
>
> Could someone show me how to write code for this procedure?  Thank you
> very much.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

