Statalist The Stata Listserver



RE: st: Merging of two datasets


From   Tobias Brütsch <topsi@ihcn.ch>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Merging of two datasets
Date   Wed, 9 May 2007 15:41:12 +0200

Thank you for your great answers, Bill!

I was just analysing the results of your first answer. Appending the data
was successful, but calculating the lag gives me strange results. That is
probably a problem with the data, though.

I will now check it with the second answer. I think it might be better,
because I am interested only in a delay of about +/-20 or 30 days.

Thanks a lot for helping again!

Greetings

Tobi

-----Original Message-----
From: William Gould, Stata [mailto:wgould@stata.com] 
Sent: Wednesday, 9 May 2007 15:25
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Merging of two datasets 

After some more thought, I have a second answer to Tobi
Brütsch's <topsi@ihcn.ch> problem.  To remind you, 

> I have to match two datasets from two different sources but about the same
> issue:
>
>       Dataset1 (+/- 5000 observations): 
>       date            Firm     recommendation  brokerID 
>       15.03.2002      ABB             2        LEHM
>       11.01.2005      ABB             5        HESLB 
>       01.07.2005      ABB             2        JBCOB 
>       06.08.2004      ABB             3        MORGAN 
>
>       Dataset2 (+/- 2500 observations): 
>       Date             firm                brokerID        recommendation 
>       04aug2003        ABB                 MORGAN               4 
>       13nov2002        ABB                 WAREURO              4 
>       22mar2005        ABB                 JYSKE                4 
>       25jan2005        ABB                 SOGENED              4 
>       27jan2004        ABB                 PARIBEU              3 
>
> [...]
> [...] I want to analyse whether the recommendations of Dataset 2 are:
> 
>     1. the same as those in set 1 (should be for some, but I think not
>        for all)
>
>     2. The recommendations in set 2 are published with a certain lag,
>        perhaps 5-10 days after the same recommendation in set 1.

> Now I don't know how to merge the sets. [...]

My previous answer had to do with using -append- and Tobi forming the fuzzy
merge himself.

My second approach would use -merge- repeatedly.
The setup would be, 

First, 
	. use Dataset1
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

Write that number down.  That's the number of recommendations that are 
in exact agreement between the two sources.

Next,
	. use Dataset1, clear 
	. replace date = date + 1
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

Write that number down.  That's the number of recommendations that are 
in agreement where source 2 made the same recommendation 1 day later.

Next, 
	. use Dataset1, clear 
	. replace date = date - 1
	. sort firm brokerID date recommendation
	. merge firm brokerID date recommendation using Dataset2
	. count if _merge==3

That's the number where source 2 made the same recommendation 1 day before.

Now keep doing that with date+2, date-2, and so on.
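
The repetition could also be automated with a loop.  What follows is only a
minimal sketch, not a tested solution.  It assumes that both files use the
same variable names (firm, brokerID, date, recommendation) and that date is
a numeric Stata date in both of them, which the listings above suggest is
not yet the case.  The file name Dataset2_sorted and the range of -30 to 30
days are made up for illustration.

```stata
* Sketch only: variable names and the date type are assumed to be
* harmonized across the two files before this is run.

* Sort the using data once and save it under a hypothetical name.
use Dataset2, clear
sort firm brokerID date recommendation
save Dataset2_sorted, replace

* Shift the Dataset1 dates by k days and count the exact matches.
* k > 0 means source 2 published k days after source 1.
forvalues k = -30/30 {
	quietly use Dataset1, clear
	quietly replace date = date + `k'
	sort firm brokerID date recommendation
	quietly merge firm brokerID date recommendation using Dataset2_sorted
	quietly count if _merge==3
	display "offset `k': " r(N) " matches"
}
```

Each pass reloads Dataset1, so the shifts do not accumulate and no _merge
variable is left over from the previous pass.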

-- Bill
wgould@stata.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




