Statalist The Stata Listserver


Re: st: Merging of two datasets


From   [email protected] (William Gould, Stata)
To   [email protected]
Subject   Re: st: Merging of two datasets
Date   Wed, 09 May 2007 08:24:43 -0500

After some more thought, I have a second answer to
Tobi Brütsch's <[email protected]> problem.  To remind you, Tobi wrote,

> I have to match two datasets from two different sources but about the same
> issue:
>
>       Dataset1 (+/- 5000 observations): 
>       date            Firm     recommendation  brokerID 
>       15.03.2002      ABB             2        LEHM
>       11.01.2005      ABB             5        HESLB 
>       01.07.2005      ABB             2        JBCOB 
>       06.08.2004      ABB             3        MORGAN 
>
>       Dataset2 (+/- 2500 observations): 
>       Date             firm                brokerID        recommendation 
>       04aug2003        ABB                 MORGAN               4 
>       13nov2002        ABB                 WAREURO              4 
>       22mar2005        ABB                 JYSKE                4 
>       25jan2005        ABB                 SOGENED              4 
>       27jan2004        ABB                 PARIBEU              3 
>
> [...]
> [...] i want to analyse if the recommendations of Dataset 2 are:
> 
>     1. the same like these in set 1 (should be for some but I think not for
>        all)
>
>     2. The recommendations in set 2 are published with a certain lag,
>        perhaps 5-10 days after the same recommendation in set1.

> Now i don't know to merge the sets. [...]

My previous answer had to do with using -append- and Tobi forming the fuzzy
merge himself.

My second approach would use -merge- repeatedly, once for each lag to be
checked.  The setup would be as follows.
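
One piece of setup that the -merge- commands in this post rely on: with this
-merge- syntax, both the master and the using data must be sorted by the match
variables. Those variables also need identical names in both files, and date
must be a numeric Stata date for date + 1 to work. A minimal sketch of
preparing the using file, assuming the variables are already named firm,
brokerID, date, and recommendation in Dataset2 and that date is already
numeric (both are assumptions; the listings above suggest the names and date
formats may need harmonizing first):

```stata
* Sketch only: sort the using dataset once on the match variables and
* save it, so every later -merge- finds it sorted.
* Assumption: variable names and types already agree with Dataset1.
use Dataset2, clear
sort firm brokerID date recommendation
save Dataset2, replace
```

This only has to be done once, because Dataset2 itself is never modified by
the steps that follow.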

First, 
	. use Dataset1, clear
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

Write that number down.  That's the number of recommendations that are 
in exact agreement between the two sources.

Next,
	. use Dataset1, clear 
	. replace date = date + 1
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

Write that number down.  That's the number of recommendations that are 
in agreement where source 2 made the same recommendation 1 day later.

Next, 
	. use Dataset1, clear 
	. replace date = date - 1
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

That's the number where source 2 made the same recommendation 1 day before.

Now keep doing that with date+2, date-2, and so on.
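
Rather than retyping the block for each shift, the repetition could be put in
a loop. What follows is a sketch, not part of the original recipe: it assumes
Dataset2 is saved sorted by firm brokerID date recommendation, that date is a
numeric Stata date in both files, and that a window of -10 to +10 days is wide
enough (the window and the local macro name k are arbitrary choices made here
for illustration).

```stata
* Sketch: count exact matches after shifting Dataset1's date by k days.
* k > 0 : source 2 made the same recommendation k days later
* k < 0 : source 2 made the same recommendation |k| days earlier
* k = 0 : exact agreement on the same day
forvalues k = -10/10 {
    quietly use Dataset1, clear
    quietly replace date = date + `k'
    sort firm brokerID date recommendation
    quietly merge firm brokerID date recommendation using Dataset2
    quietly count if _merge==3
    display "shift `k' days: " r(N) " matching recommendations"
}
```

Each pass reloads Dataset1 from disk, so the shifts do not accumulate and the
_merge variable from the previous pass is never in the way.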

-- Bill
[email protected]