
st: RE: Merge Question

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Merge Question
Date	Fri, 5 Nov 2004 18:42:30 -0000

Your most common values can be obtained by 

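bysort code1 code2 : gen count = -_N
bysort code1 (count code2) : gen mode = code2[1]

NB the minus sign in the first line: negating _N means that, within
each code1, the most frequent code2 sorts first, and ties go to the
lowest code2.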
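
To get from there to the merged table in the question, one possible
continuation is sketched below. It is an illustration, not part of the
reply as posted: the data are typed in from the question, the tempfile
name modes is made up for the example, and the merge 1:1 syntax assumes
Stata 11 or later (older versions use merge code1 using ...). It also
assumes pchange is constant within each code1-code2 pair, as it is in
the example data; otherwise which pchange survives is arbitrary.

```stata
* dataset2 from the question, entered by hand
clear
input code2 pchange code1
3431  .5 1111
3431  .5 1111
3450 -.5 1111
3451  .7 1111
9903  .4 5555
9945  .1 5555
9903  .4 5555
9905 -.6 5555
9945  .1 5555
end

* negated frequency of each (code1, code2) pair, so the largest sorts first
bysort code1 code2 : gen count = -_N

* within code1, order by frequency, then code2; row 1 holds the mode
bysort code1 (count code2) : gen mode = code2[1]

* keep that first row only, so code1 identifies observations uniquely
bysort code1 (count code2) : keep if _n == 1
drop count mode

* "modes" is a hypothetical name for a temporary file
tempfile modes
save `modes'

* dataset1 from the question, used as the master
clear
input code1 output
1111 100
5555 340
end

* one-to-one merge on code1 (Stata 11+ syntax)
merge 1:1 code1 using `modes'
list code1 code2 output pchange
```

With the example data, this should pick 3431 for code1 1111 and 9903
for code1 5555, as asked for in the question.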

Nick 
[email protected] 

Jason Hwang
 
> I didn't describe very well last time what I wanted to do. Let me try
> again.
> 
> I have two datasets I'm trying to merge of the following form.
> 
> dataset1:
> 
> code1	output
> 1111	100
> 5555	340
> 
> dataset2:
> 
> code2	pchange	code1
> 3431	.5	1111
> 3431	.5	1111
> 3450	-.5	1111
> 3451	.7	1111
> 9903	.4	5555
> 9945	.1	5555
> 9903	.4	5555
> 9905	-.6	5555
> 9945	.1	5555
> 
> I'm trying to use dataset1 as the original (master) and merge into it
> dataset2. Problem: each code1 maps to many code2s. So here's what I
> would like to do: for each code1, find a code2 which corresponds to it
> with the greatest frequency. So for code1, 1111, I want 3431. For 5555,
> both 9903 and 9945 occur twice. In this case, I'll just take whichever
> shows up first in the sorted list; i.e. 9903.
> 
> The final output I'm looking for would be:
> 
> code1	code2	output	pchange
> 1111	3431	100	.5
> 5555	9903	340	.4
> 
> Could someone show how to write code for this procedure? Thank you
> very much.

