st: looping over files -- speed and Stata/MP
From: Dimitri Szerman <[email protected]>
To: statalist <[email protected]>
Subject: st: looping over files -- speed and Stata/MP
Date: Wed, 16 Mar 2011 14:48:11 +0000
Hello,
In constructing a data set, I have to loop over hundreds of thousands
of files. Simply put, this is what I do:
* list the files (not subdirectories) to be imported
! dir "mydir" /a-d /b > filelist.txt
file open LIST using "filelist.txt", read
file read LIST line
while r(eof)==0 {
    * (a bunch of Stata commands)
    * forward slash works on Windows too and cannot interfere with macro expansion
    save "mydir2/`line'", replace
    file read LIST line
}
file close LIST
In fact, I run a loop like this twice: first to import the csv files
into dta, then to clean the dta files. As it stands now, my code
takes around 12 hours to run. My question is: will Stata/MP make it
run faster? (For those familiar with Matlab, I guess this boils down
to: does Stata/MP have something along the lines of "parfor", i.e., a
"parallel-for" command?) More broadly, can anyone think of a way of
speeding this up?
Many thanks,
Dimitri