Re: st: Slowing process when running a program with multiple nested loops


From   David Kantor <[email protected]>
To   [email protected]
Subject   Re: st: Slowing process when running a program with multiple nested loops
Date   Mon, 14 Jan 2013 15:52:11 -0500

Hi,

Your code looks fairly straightforward.
At first it looked cryptic, but once I understood it, I realized I would have coded it very similarly.
There are a few unused macros, but that's irrelevant.

We don't see the code for sma, dma, trb, or cbo. Do these get progressively more complicated? Is it possible that there is a sudden jump in slowness when you switch from sma to dma, or to trb or cbo?
Or is it gradual through all the iterations?

(But you did say that they do about the same amount of calculation.)
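
One way to tell a gradual slowdown from a sudden one is to time the load and the rule call separately on every iteration. The following is only a sketch: rule_demo is a hypothetical stand-in for sma/dma/trb/cbo, and -sysuse auto- stands in for -use data-, so that the example runs on its own.

```stata
* Hypothetical stand-in for one of the rule programs (sma, dma, trb, cbo).
capture program drop rule_demo
program define rule_demo
    syntax , m(real)
    quietly generate double res_x = price * `m'
    quietly drop res_x
end

local i = 0
foreach m in 2 5 10 {
    local ++i
    timer clear 1
    timer clear 2

    timer on 1
    sysuse auto, clear          // stand-in for -use data-
    timer off 1

    timer on 2
    rule_demo, m(`m')
    timer off 2

    * -timer list- leaves the elapsed seconds in r(t1) and r(t2)
    quietly timer list
    display "iteration `i' (m=`m'): load " r(t1) " s, rule " r(t2) " s"
}
```

If the load time climbs while the rule time stays flat, the reload is the likely culprit; if the rule time climbs, look inside the ado-files.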

More importantly, do they alter the data? Do they alter (-save-) the data file?
These latter points may be most relevant.
The important question is: after one iteration, can the next one run without reloading (-use-ing) the data? If not, can you rework your code (in sma, dma, trb, and cbo) to make it so? That is, have them not drop or add records. If they generate or replace values of variables, have those be in designated variables that can be reset easily. The idea is that if the dataset changes in a significant way, you want to be able to bring it back to its pre-iteration state easily -- using -drop- or -replace ... = .-. The last thing you should have to do is reload the data for each iteration. Reloading the data may be 1000 times slower than continuing with the same data. (I don't have any real statistics on that factor, but 1000 is not unreasonable.)
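
As a rough illustration of the "designated variables" idea, here is a minimal sketch. The program myrule and the res_ prefix are made up for the example (they are not the poster's actual ado-files), and -sysuse auto- stands in for -use data-.

```stata
sysuse auto, clear            // stand-in for -use data-; loaded once

* Hypothetical stand-in for sma/dma/trb/cbo: it writes only to designated
* result variables (prefix res_) and never drops or adds observations.
capture program drop myrule
program define myrule
    syntax , m(real)
    generate double res_signal = price > `m' * mpg
    generate double res_profit = res_signal * price
end

foreach m in 100 200 300 {
    myrule, m(`m')
    quietly summarize res_profit
    display "m=`m': mean of res_profit = " r(mean)
    drop res_*                // back to the pre-iteration state, no reload
}
```

Whether this is possible depends on what the real programs do; if they sort, collapse, or drop observations, they would need more rework than a simple -drop res_*-.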

If you can arrange it so that you don't need to reload on each iteration (or if it is already coded that way), then can you move the -use- command to the top -- before the first foreach?
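
A sketch of what the loop could look like with the load moved out, shortened to two methods and two parameters. The programs sma_demo and dma_demo and their parameter lists are hypothetical stand-ins; the macro-naming trick (`med'm, `med'b) is the one from the posted code.

```stata
* Hypothetical stand-ins for two of the rule programs. Each writes only
* to res_profit and leaves the observations alone.
capture program drop sma_demo
program define sma_demo
    syntax , m(real) b(real)
    generate double res_profit = (price - `m') * (1 - `b')
end
capture program drop dma_demo
program define dma_demo
    syntax , m(real) b(real)
    generate double res_profit = (price - 2 * `m') * (1 - `b')
end

local sma_demom "2 5 10"
local sma_demob "0 0.001"
local dma_demom "5 10"
local dma_demob "0 0.005"

sysuse auto, clear                      // loaded once, before the first foreach
local calls = 0
foreach med in sma_demo dma_demo {
    foreach m of local `med'm {
        foreach b of local `med'b {
            `med', m(`m') b(`b')
            drop res_profit             // undo the only change the call made
            local ++calls
        }
    }
}
display "ran `calls' calls with a single load of the data"
```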

Note that the repeated reloading will cause slowness, but may not exactly explain why it gets progressively slower. But that may be an operating-system issue. (It may be that after the first -use-, the file is in cache, enabling some fast loads; later it is knocked out of cache.)

One other point is that it is not always good to -set mem- to a high value. It should be high enough to get the job done, plus maybe a little margin of safety. Otherwise, you are grabbing space that might better be left for the operating system to make good use of (such as for caching files) and to run everything (including your task) more smoothly and quickly.
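
To judge how much memory the job actually needs, one rough approach is to look at the size of the data area after loading. This sketch uses -sysuse auto- as a stand-in for -use data-, and it counts only the data itself, not whatever extra variables the rule programs create. (As far as I know, Stata 12 and later manage memory automatically and ignore -set memory-, so the sizing question matters mainly for Stata 11 and earlier.)

```stata
sysuse auto, clear              // stand-in for -use data-
quietly describe                // leaves r(N) and r(width) behind
local bytes = r(N) * r(width)   // observations times bytes per observation
display "data area: about " %12.0fc `bytes' " bytes (" %6.2f `bytes'/1024^2 " MB)"
```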

HTH
--David


At 02:52 PM 1/14/2013, you wrote:
Thank you for your response.
I did mean to say that there are 7 nested loops, because there are 7
parameters that can change values, and I do not know of another way to
have this done.

So the code is as follows:
** Initialization
clear all
set mem 100m
set more off, perm
set autotabgraphs on, perm
graph drop _all
cd "C:\Users\Trades\Stata"
sysdir set PERSONAL "C:\Users\Trades\Stata\Ado"

** Setting parameters
global freq "1m"
global fcrc "EUR USD GBP AUD NZD NZD EUR EUR AUD"  // foreign currencies
global bcrc "USD JPY USD USD USD JPY JPY GBP JPY"  // base currencies
global startdate = mdy(1,1,1994)
global enddate = mdy(12,31,2010)
global subperiod "2002jan01 2008sep01"   // specify subperiods

local smam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for sma method
local smab "0 0.0005 0.001 0.005 0.01 0.05"     // parameter b for sma method
local smad "2 3 4 5"                            // parameter d for sma method
local smac "5 10 25"                            // parameter c for sma method
local sman "0"
local smak "0"

local dmam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for dma method
local dman "2 5 10 15 20 25 50 100 150 200"     // parameter n for dma method
local dmab "0 0.0005 0.001 0.005 0.01 0.05"     // parameter b for dma method
local dmad "2 3 4 5"                            // parameter d for dma method
local dmac "5 10 25"                            // parameter c for dma method
local dmak "1000"

local trbn "5 10 15 20 25 50 100"               // parameter n for trb method
local trbb "0.0005 0.001 0.005 0.01 0.025 0.05" // parameter b for trb method
local trbd "2 3 4 5"                            // parameter d for trb method
local trbc "1 5 10 25"                          // parameter c for trb method
local trbm "1000"
local trbk "0"

local cbon "5 10 15 20 25 50 100 200"           // parameter n for cbo method
local cbok "0.001 0.005 0.01 0.05 0.1"          // parameter k for cbo method
local cbob "0.0005 0.001 0.005 0.01 0.05"       // parameter b for cbo method
local cbod "0 1 2"                              // parameter d for cbo method
local cboc "1 5 10 25"                          // parameter c for cbo method
local cbom "1000"


** Loops to go through all methods and all parameters
foreach med in sma dma trb cbo {            // loop through all the rules
  foreach m of local `med'm {               // all the m values
    foreach n of local `med'n {             // all the n values
      foreach k of local `med'k {           // all the k values
        foreach b of local `med'b {         // all the b values
          foreach d of local `med'd {       // all the d values
            foreach c of local `med'c {     // all the c values
              clear
              use data
              `med', datevar(date) m(`m') n(`n') k(`k') b(`b') d(`d') c(`c')
            }
          }
        }
      }
    }
  }
}

`med' is calling one of the four ado files that I wrote: sma, dma,
trb, and cbo.  It basically calculates profits based on the rule and
the parameters fed to the program, so I think each iteration does just
about the same amount of calculation.

The next part (which I haven't written) keeps track of results from
some methods and parameters that satisfy certain conditions. In my
opinion, this would be a minor thing if I can get the current code to
run in a reasonable amount of time.
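
For the results-tracking part, one common Stata approach is -postfile-, which writes results to a separate file without touching the data in memory, so it does not force a reload. This is only a sketch: the profit calculation and the keep-condition below are placeholders, and -sysuse auto- stands in for -use data-.

```stata
sysuse auto, clear                          // stand-in for -use data-
tempname results
tempfile resfile
postfile `results' str3 method double(m b profit) using "`resfile'"

foreach m in 2 5 10 {
    foreach b in 0 0.001 {
        quietly summarize price
        local profit = r(mean) * `m' * (1 - `b')   // placeholder for the rule's profit
        if `profit' > 20000 {                       // hypothetical keep-condition
            post `results' ("sma") (`m') (`b') (`profit')
        }
    }
}
postclose `results'

* Only after all the loops are done: load and inspect the collected results.
use "`resfile'", clear
list, clean noobs
```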

Any help explaining why the program slows down so significantly after
a couple of hundred iterations will be much appreciated.

Thank you.









I think that most of us would agree that we would need to see your code
to be able to say what the problem is. Meanwhile, did you mean that
the loops are nested to a depth of 7? That's unusually deep.

Just generally speaking, with loops, there are often actions that are
placed inside that don't need to be there; they can be moved "up" or
"out" (sometimes requiring a bit of modification) so as to not be done
multiple times unnecessarily. From what you describe, it seems that
the work done in each iteration is accumulating; each iteration does a
bit more work than the previous. There may be some unnecessary
repetition as described above. But it also seems that there is
something that grows and gets dragged along with each iteration --
again possibly unnecessarily. This is analogous to a cumulative song,
such as "The Twelve Days of Christmas"; the 12th verse is much longer
than the first.
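
A simple check for something that "grows and gets dragged along" is to compare the size of the dataset before and after each iteration. The sketch below uses -sysuse auto- and a deliberately leaky loop body as stand-ins; it only detects growth in the dataset itself, not in macros, matrices, or stored graphs.

```stata
sysuse auto, clear
local k0 = c(k)                 // number of variables before the loop
local N0 = c(N)                 // number of observations before the loop

forvalues i = 1/5 {
    * Stand-in for a call that leaves a new variable behind on every pass.
    generate double work`i' = price * `i'

    if c(k) != `k0' | c(N) != `N0' {
        display "iteration `i': now " c(k) " variables and " c(N) " observations"
    }
}
```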

On the other hand, does the true complexity of the task grow with each
iteration? Do you expect the 300th iteration to naturally be more
complex to perform than the first?

Show us your code if you want more help.

HTH
--David

--
Ly Tran
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


