The Stata listserver

st: RE: Merge Question


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Merge Question
Date   Fri, 5 Nov 2004 18:42:30 -0000

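Your most common values can be obtained by 

bysort code1 code2 : gen count = -_N
bysort code1 (count code2) : gen mode = code2[1] 

Note the minus sign in the first command: because count is negative, sorting on it puts the most frequent code2 first within each code1, and ties are broken in favour of the lower code2.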
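
One way to carry those two lines through to the merge itself is sketched below. This is only a sketch under stated assumptions: the file names dataset1.dta and dataset2.dta are hypothetical, pchange is assumed to be constant within each code1-code2 pair (as it is in the example data), and the `merge 1:1` syntax assumes Stata 11 or later. Run it as a do-file in one go, since it uses a temporary file.

```stata
* Sketch only: dataset1.dta and dataset2.dta are assumed file names
use dataset2, clear

* negative group size, so that the most frequent code2 sorts first
bysort code1 code2 : gen count = -_N
bysort code1 (count code2) : gen mode = code2[1]

* keep a single observation per code1, for its most frequent code2
keep if code2 == mode
bysort code1 : keep if _n == 1
keep code1 code2 pchange

* save to a temporary file and merge it into the master dataset
tempfile modal
save `modal'
use dataset1, clear
merge 1:1 code1 using `modal'
```

With the example data this should leave one observation for each code1, pairing 1111 with 3431 and 5555 with 9903.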

Nick 
n.j.cox@durham.ac.uk 

Jason Hwang
 
> I didn't describe very well last time what I wanted to do. Let me try
> again.
> 
> I have two datasets I'm trying to merge of the following form.
> 
> dataset1:
> 
> code1	output
> 1111	100
> 5555	340
> 
> dataset2:
> 
> code2	pchange	code1
> 3431	.5	1111
> 3431	.5	1111
> 3450	-.5	1111
> 3451	.7	1111
> 9903	.4	5555
> 9945	.1	5555
> 9903	.4	5555
> 9905	-.6	5555
> 9945	.1	5555
> 
> I'm trying to use dataset1 as the original (master) and merge into it
> dataset2. Problem: each code1 maps to many code2s. So here's what I would
> like to do: for each code1, find a code2 which corresponds to it with the
> greatest frequency. So for code1, 1111, I want 3431. For 5555, both 9903
> and 9945 occur twice. In this case, I'll just take whichever shows up
> first in the sorted list; i.e. 9903.
> 
> The final output I'm looking for would be:
> 
> code1	code2	output	pchange
> 1111	3431	100	.5
> 5555	9903	340	.4
> 
> Could someone show me how to write code for this procedure? Thank you very
> much.
