



st: looping over files -- speed and Stata/MP


From   Dimitri Szerman <[email protected]>
To   statalist <[email protected]>
Subject   st: looping over files -- speed and Stata/MP
Date   Wed, 16 Mar 2011 14:48:11 +0000

Hello,

In constructing a data set, I have to loop over hundreds of thousands
of files. Simply put, this is what I do:

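! dir "mydir" /a-d /b > filelist.txt         // list of files to be imported
file open LIST using "filelist.txt", read
file read LIST line
while r(eof)==0 {

     * (a bunch of Stata commands)

     * quoted path handles file names with spaces; the forward slash
     * avoids the backslash-before-macro pitfall in Windows paths
     save "mydir2/`line'", replace
     file read LIST line
}
file close LIST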


In fact, I run a loop like this twice: first to import the csv files into
dta, then to clean the dta files. As it stands now, my code
takes around 12 hours to run. My question is: will Stata/MP make it
run faster? (For those familiar with Matlab, I guess this boils down
to: does Stata/MP have something along the lines of "parfor", i.e., a
"parallel-for" command?) More broadly, can anyone think of a way of
speeding this up?

Many thanks,
Dimitri