Re: st: Re: insheet multi threading


From   Argyn Kuketayev <akuketayev@mail.primaticsfinancial.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Re: insheet multi threading
Date   Mon, 2 May 2011 09:30:48 -0400

I'm not talking about some obscure command either. It's a very basic
task, and I'm sure everyone does it daily: reading CSV files. It takes
over an hour to read a 13GB file on an 8-core machine, because CPU
load stays at 12% the whole time: only one core is working.

Parallelizing the parsing part is a junior-programmer-level
assignment, which is why I'm surprised Stata hasn't done it. It's
frustrating because I sometimes get CSVs during the day and have to
wait a long time before I can load them into Stata. Once the data are
in .dta format, everything is fast, both reading and writing, so it's
clearly the parsing part that is slow.
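
To make "parallelize the parsing part" concrete, here is a minimal
sketch in Python (not Stata code, and not a description of how insheet
works internally). It cuts the file into byte ranges that end on line
boundaries and parses each range in a separate worker. The function
names (chunk_offsets, parse_chunk, parse_csv_parallel) and the
executor_cls parameter are hypothetical, and the sketch assumes a
UTF-8 file with one header row and no quoted fields that contain
embedded newlines.

```python
import csv
import io
import os
from concurrent.futures import ProcessPoolExecutor


def chunk_offsets(path, n_chunks):
    """Return (start, end) byte ranges of the data rows, cut on line boundaries."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.readline()                      # skip the header row
        cuts = [f.tell()]                 # first byte of the first data row
        for i in range(1, n_chunks):
            # Jump near the i-th equal split, never behind the previous cut.
            f.seek(max(size * i // n_chunks, cuts[-1]))
            f.readline()                  # move forward to the next line start
            cuts.append(f.tell())
    cuts.append(size)
    # Drop empty ranges (they occur when the file has few lines).
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if a < b]


def parse_chunk(task):
    """Parse one byte range of the file into a list of rows."""
    path, start, end = task
    with open(path, "rb") as f:
        f.seek(start)
        text = f.read(end - start).decode("utf-8")
    # Assumption: no quoted field spans a line break, so every chunk
    # starts and ends on a complete record.
    return list(csv.reader(io.StringIO(text, newline="")))


def parse_csv_parallel(path, workers=4, executor_cls=ProcessPoolExecutor):
    """Parse a CSV file with several workers; returns (header, rows)."""
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    tasks = [(path, a, b) for a, b in chunk_offsets(path, workers)]
    with executor_cls(max_workers=workers) as pool:
        # map() yields results in task order, so row order is preserved.
        rows = [row for part in pool.map(parse_chunk, tasks) for row in part]
    return header, rows
```

Calling parse_csv_parallel("data.csv", workers=8) would spread the
parsing over eight processes ("data.csv" is a placeholder name). On
platforms that start worker processes by spawning, the call needs to
sit under an `if __name__ == "__main__":` guard. Whether this helps
in practice depends on how much of the hour is parsing rather than
disk I/O.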

On Mon, May 2, 2011 at 12:24 AM, Joseph Coveney <jcoveney@bigplanet.com> wrote:
> Are circumstances such that you can have Stata convert your CSV files to Stata
> format overnight?  I'm assuming that Stata won't spend much time parsing its own
> file format the next morning when you go to use the datasets.
>
> Joseph Coveney
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



-- 
Argyn Kuketayev


