
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Problem with Disk Wait While Loading Subset of Observations


From   David Phillips <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: Problem with Disk Wait While Loading Subset of Observations
Date   Tue, 2 Jul 2013 05:50:27 +0000

Dear Statalist,

I'm using a Unix-based cluster running Oracle Grid Engine with Stata-MP 12 to repeatedly (and in parallel) load separate subsets of a single 8 GB file, using Stata syntax such as the following:

use in 1/10000 using `file'

I have run into an unexpected phenomenon, however. The jobs stall in a 'disk wait' status and take hours to load the data. Interestingly, if I remove the "in ... using" portion of the command (so that it is simply "use `file'"), the jobs take a perfectly reasonable 20 minutes or so to load the file. How could loading the full file be less taxing on these machines than loading a subset?

Thanks,
David
 
