st: shell without waiting in unix batch mode?


From   M Hollis <m73hollis_stata@yahoo.com>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: shell without waiting in unix batch mode?
Date   Thu, 21 Jun 2012 10:19:16 -0700 (PDT)

Hi,

I'm running some do-files on a Unix server in batch mode. The server has multiple processors, so there are instances where I'd like my do-file to complete some initial data preparation tasks and then submit multiple jobs simultaneously (in parallel). In this particular case I'm submitting a job to R, using Amelia for MI, and the imputation takes quite a while to run. I'd like to run the imputation separately for men and women, and it would be ideal to run the two analyses in parallel by calling R twice.

However, I can't figure out how to use the shell command without having Stata wait for the process to complete before continuing. According to the help file, winexec doesn't wait for the shell command to complete, but it's not available for Unix in console mode (I tried using it and it doesn't cause an error, but it also doesn't seem to actually work). I also tried using an &, the traditional way to tell Unix not to wait, but it doesn't seem to work.

So, for instance, my do-file has the commands:

shell (R CMD BATCH "AmeliaMen.R" "AmeliaMen.txt") &

shell (R CMD BATCH "AmeliaWomen.R" "AmeliaWomen.txt") &


But Stata waits for the men's R script to complete before starting the women's. 
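
For comparison, this is the behavior that & gives at an ordinary sh prompt, outside Stata. It is only a minimal sketch: the two subshells with sleep are stand-ins for the R jobs, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Stand-ins for the two R jobs; "sleep" simulates a long-running imputation.
start=$(date +%s)

( sleep 2; echo "men done" ) &      # first job goes to the background
pid_men=$!
( sleep 2; echo "women done" ) &    # second job starts without waiting for the first
pid_women=$!

# Block until both background jobs have exited.
wait "$pid_men" "$pid_women"

elapsed=$(( $(date +%s) - start ))
# Roughly 2s if the jobs overlapped, roughly 4s if they ran one after the other.
echo "elapsed: ${elapsed}s"
```

If the two jobs overlap, the elapsed time is about that of the longer job rather than the sum of both, which is what I would like to see from within the do-file.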

I thought I had tried something similar in the past, calling Stata in batch mode rather than R, and that it had worked, but I just tested it and it doesn't. Here's a Stata-only example. I created three do-files:

------------------------
ExampleMaster.do
------------------------
set rmsg on
shell stata -b do "ExampleA" &
shell stata -b do "ExampleB" &

------------------------
ExampleA.do
------------------------
forvalues i=1/100 {
    display "A`i'"
    sleep 2000
}

------------------------
ExampleB.do
------------------------
forvalues i=1/100 {
    display "B`i'"
    sleep 4000
}

------------------------
I then ran the ExampleMaster.do file by typing the following at the Unix command prompt:

stata -b do ExampleMaster &

Unfortunately, the do-file executed the two sub-routines, ExampleA and ExampleB, serially rather than in parallel. Here's the log of ExampleMaster.do.

. do ExampleMaster 
 

. set rmsg on
r; t=0.00 12:43:38

. shell stata -b do "ExampleA" &

[1] 379
r; t=200.34 12:46:59

. shell stata -b do "ExampleB" &

[1] 908
r; t=400.29 12:53:39

. 
end of do-file
r; t=0.00 12:53:39


It would be great if I could figure out how to do this. There are many cases where I loop my analysis over different sub-populations, so submitting those sub-analyses in parallel rather than looping over them serially would save a lot of time. I could then use a system where Stata checks for the presence of all of the completed results files before proceeding, as suggested here: http://www.stata.com/statalist/archive/2002-06/msg00295.html.
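
The file-checking idea can be sketched at the shell level. This is only an illustration under assumptions that are not part of my setup above: each sub-job is assumed to create a marker file as its very last step (the names men.done and women.done and the helper wait_for_files are hypothetical), since an output file such as the one written by R CMD BATCH exists from the moment the job starts and so cannot signal completion.

```shell
#!/bin/sh
# Poll until every file named on the command line exists.
# Hypothetical helper; the one-second interval is arbitrary.
wait_for_files() {
    for f in "$@"; do
        while [ ! -e "$f" ]; do
            sleep 1
        done
    done
}

workdir=$(mktemp -d)

# Stand-ins for the two sub-analyses: each writes its marker file when finished.
( sleep 1; touch "$workdir/men.done" ) &
( sleep 2; touch "$workdir/women.done" ) &

# Continue only once both marker files are present.
wait_for_files "$workdir/men.done" "$workdir/women.done"
echo "all results present"
```

A real version would also want a timeout, so that a sub-job that crashes before writing its marker file does not leave the master job waiting forever.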

Thanks for your help,

Matissa Hollister

