



Re: st: Re: insheet multi threading


From   Argyn Kuketayev <akuketayev@mail.primaticsfinancial.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: insheet multi threading
Date   Mon, 2 May 2011 14:18:58 -0400

Nick

On Mon, May 2, 2011 at 12:46 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>
> On the main issue, my short answer is that I don't know.

Like Mark said, I'm not aware of a Stata API which would allow me to
write concurrently executed ado's. I don't think it can be done.

>
> A longer answer is that -insheet- depends on parts of a datafile
> having the same structure as the whole. So, very likely -- if this is
> parallelisable -- much of the code would require lots of compatibility
> checks to ensure consistency of input.
>
> My guess is that -insheet- peeks at the top of the data file, makes a
> guess at its structure, and then keeps on going unless and until it
> finds a problem.
>

I've written parsing utilities a few times, and they can be
parallelized; that's why I'm so confident in my disappointment with
Stata. The standard way of handling this task is to write a sequential
reader, which simply reads from the disk and then dumps lines into a
queue. Then concurrent parsers pick up batches of lines, parse them,
and dump the parsed observations into another queue, where an
aggregator assembles the observations into a data set.
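
The reader/parser/aggregator pipeline described above can be sketched as
follows. This is only an illustration under stated assumptions, not how
-insheet- or any Stata command works: it is written in Python, and every
name in it (load, reader, parser, SENTINEL, the batch size and the
delimiter) is hypothetical. Parsing here is just splitting a line on a
delimiter.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker placed on the queues


def reader(lines, line_q, batch_size, n_parsers):
    """Sequential stage: group raw lines into batches and queue them."""
    batch = []
    for index, line in enumerate(lines):
        batch.append((index, line))  # keep the line number to restore order later
        if len(batch) == batch_size:
            line_q.put(batch)
            batch = []
    if batch:
        line_q.put(batch)
    for _ in range(n_parsers):  # one stop signal per parser
        line_q.put(SENTINEL)


def parser(line_q, obs_q, delimiter):
    """Concurrent stage: take a batch of lines, parse it, queue the result."""
    while True:
        batch = line_q.get()
        if batch is SENTINEL:
            obs_q.put(SENTINEL)  # tell the aggregator this parser is done
            return
        obs_q.put([(i, line.rstrip("\n").split(delimiter)) for i, line in batch])


def load(lines, n_parsers=4, batch_size=1000, delimiter=","):
    """Aggregating stage: collect parsed observations into one data set."""
    line_q = queue.Queue(maxsize=16)  # bounded, so the reader cannot run far ahead
    obs_q = queue.Queue()
    threads = [threading.Thread(target=reader,
                                args=(lines, line_q, batch_size, n_parsers))]
    threads += [threading.Thread(target=parser, args=(line_q, obs_q, delimiter))
                for _ in range(n_parsers)]
    for t in threads:
        t.start()

    rows, finished = [], 0
    while finished < n_parsers:
        item = obs_q.get()
        if item is SENTINEL:
            finished += 1
        else:
            rows.extend(item)
    for t in threads:
        t.join()

    rows.sort(key=lambda pair: pair[0])  # batches may arrive out of order
    return [fields for _, fields in rows]
```

Calling load(open("data.csv")) with a hypothetical file name would return
one list of fields per line, in file order. Note that in CPython threads
do not run CPU-bound code on several cores at once, so this sketch shows
only the structure of the design; to actually use all cores, the parser
stage would have to run in separate processes or in a language with truly
parallel threads.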

If the disk reading part were the bottleneck, then I wouldn't see 100%
CPU load on one core, and there would be other symptoms pointing to
that situation. At the moment it looks to me that reading and parsing
are sequential, and that parsing is the bottleneck, which is a waste
of CPUs. I have 8 cores, and want them all to be used. A reasonable
request, one would think.

cheers
-- 
Argyn Kuketayev

