
From: Kit Baum <baum@bc.edu>
To: statalist@hsphsun2.harvard.edu
Subject: st: Re: better match-management algorithm
Date: Sat, 4 Aug 2007 09:28:47 -0400

As much as I enjoy Mata programming, I don't think this case obviously calls for Mata. I think it should be feasible to do this without explicit loops at all, using something like the third stanza of the enclosed (the first two just make up some transaction-time data for stock and options quotes):

* Stanza 1: made-up stock quotes
clear
set obs 100
egen transtime = fill(1 2 4)
g stockprice = 10*uniform()+50
sort transtime
save stox, replace

* Stanza 2: made-up options quotes
clear
set obs 75
egen transtime = fill(1.25 3.75 5.2)
g optprice = 10*uniform()+45
sort transtime
save opts, replace

* Stanza 3: interleave the two series by time, then for each options
* quote pick up the nearest nonmissing stock price in the three
* preceding observations
use stox
merge transtime using opts
sort transtime
g prevstock=cond(stockprice[_n-1]<.,stockprice[_n-1], ///
    cond(stockprice[_n-2]<.,stockprice[_n-2],stockprice[_n-3])) ///
    if optprice<.
l transtime optprice prevstock if optprice<.

I am not usually a fan of nested cond() calls, but in this case it seems to work well. Scaled up to 500,000 stock quotes and 375,000 options quotes, it does the job (without the list command) in 4.88 seconds on Stata 10/MP2, which makes heavy use of parallel computation here.
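The nested cond() above looks back at most three observations, so it returns a missing value whenever more than three options quotes arrive between two stock quotes. A minimal sketch of one way around that limit, assuming the same merged and time-sorted layout, is to carry the last nonmissing stock price forward. The variable laststock and the small data set below are hypothetical and made up for illustration; this is not the code that produced the timing quoted above.

```stata
* Sketch only: made-up data with four options quotes in a row
clear
input transtime stockprice optprice
1    50.2  .
1.25 .     46.1
2    51.0  .
2.5  .     47.3
2.6  .     47.0
2.7  .     46.8
2.8  .     46.5
4    52.4  .
end
sort transtime

* laststock (hypothetical helper): most recent nonmissing stock price so far.
* replace works through the observations in order, so the fill cascades.
g laststock = stockprice
replace laststock = laststock[_n-1] if missing(laststock)

* For each options quote, take the stock price from strictly earlier rows
g prevstock = laststock[_n-1] if optprice<.
l transtime optprice prevstock if optprice<.
```

Using laststock[_n-1] rather than laststock keeps the same rule as the nested cond(): only observations strictly before the options quote are considered.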

Whenever an explicit loop appears in the logic, one should think twice or thrice about whether there is a better way. In Stata there often is.
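As one illustration of that point, the matching criteria in the original question (same ISIN, same trading date, most recent underlying price) could be handled with by-groups instead of nested forvalues loops. This is a sketch under stated assumptions: the variable names isin, trading_date, and trading_time are hypothetical, the identifiers and prices are made up, and the two tables are assumed to have been stacked into one data set already (for example with append).

```stata
* Sketch only: hypothetical variable names and made-up values
clear
input str3 isin trading_date trading_time stockprice optprice
"AAA" 1 100 50.0 .
"AAA" 1 105 .    4.2
"AAA" 2 101 .    4.0
"AAA" 2 102 51.5 .
"BBB" 1 103 20.0 .
"BBB" 1 104 .    1.1
end

* Within each ISIN and trading date, order by time and carry the
* last nonmissing stock price forward. Under by:, [_n-1] refers to
* the previous observation in the same group, so prices never leak
* across ISINs or dates.
bysort isin trading_date (trading_time): g laststock = stockprice
by isin trading_date: replace laststock = laststock[_n-1] if missing(laststock)
by isin trading_date: g prevstock = laststock[_n-1] if optprice<.

l isin trading_date trading_time optprice prevstock if optprice<.
```

An options quote with no earlier stock quote on the same day (the second "AAA" date here) is left with a missing prevstock rather than borrowing a price from another day.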

Kit Baum, Boston College Economics and DIW Berlin

http://ideas.repec.org/e/pba1.html

An Introduction to Modern Econometrics Using Stata:

http://www.stata-press.com/books/imeus.html

On Aug 4, 2007, at 2:33 AM, Tobias wrote:

I encountered the following problem in a finance research project: two tables, one with option prices, the other with (underlying) stock prices. The task is to match the appropriate stock price to each option price observation. My current solution works, but seems to be inefficient due to tremendous processing time (> 4h).

My current solution is the following (I refer to the following numbers in the code below):

1) Fetch the number of observations from the underlying table.

2) Fetch the number of observations from the option table.

3) Merge the underlying prices to the option prices (one-to-one merge).

4) Using two nested forvalues loops, I iterate over the option observations to find an appropriate underlying price, again iterating over the underlying prices in the second forvalues loop. [The matching criteria are an identical ISIN number, an identical trading_date, and that the trading time of the subsequent underlying is later than the option trading time, i.e. looking for the most recent underlying price.]

Before writing down my code, I have the following questions:

A) IS THERE A MORE EFFICIENT WAY TO CARRY OUT THE CONDITIONAL MATCHING WITHOUT HAVING TO ITERATE OVER EACH AND EVERY OBSERVATION?

B) IF NOT, IS IT POSSIBLE TO 'OUTSOURCE' THE TASK TO A MATA PROGRAM, SUCH THAT THE COMPILATION OF THE LOOP-CODE IS DONE ONCE INSTEAD OF A MILLION TIMES?

I thought about the Mata possibility when I read in a presentation by Kit Baum:

"Your ado-files may perform some complicated tasks which involve many invocations of the same commands. Stata's ado-file language is easy to read and write, but it is interpreted: Stata must evaluate each statement and translate it into machine code. The new Mata programming language (help mata) creates compiled code which can run much faster than ado-file code."
