Re: st: Slowing process when running a program with multiple nested loops

From	David Kantor <[email protected]>
To	[email protected]
Subject	Re: st: Slowing process when running a program with multiple nested loops
Date	Mon, 14 Jan 2013 15:52:11 -0500

Hi,

Your code looks fairly straightforward, after looking at it for a minute.

At first it looked cryptic, but once I understood it, I realized I would code it very similarly.

There are a few unused macros, but that's irrelevant.

We don't see the code for sma, dma, trb, or cbo. Do these get progressively complicated? Is it possible that there is a sudden jump in slowness when you switch from sma to dma, or to trb or cbo?

Or is it gradual through all the iterations?

(But you did say that they do about the same amount of calculation.)
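
One way to tell a sudden jump from a gradual slowdown is to time each call. Below is a minimal sketch using Stata's -timer- commands. The program fakerule and the auto dataset are hypothetical stand-ins for sma/dma/trb/cbo and for your data; they are not taken from your code.

```stata
clear all
set more off

* Hypothetical stand-in for sma/dma/trb/cbo: any work you want to time
program define fakerule
    syntax , m(integer)
    quietly summarize price if mpg > `m'
end

sysuse auto, clear              // stands in for -use data-
local iter = 0
foreach m of numlist 10 15 20 25 30 {
    local ++iter
    timer clear 1
    timer on 1
    fakerule, m(`m')
    timer off 1
    quietly timer list 1        // leaves the elapsed seconds in r(t1)
    display "iteration `iter' (m=`m'): " %9.4f r(t1) " seconds"
}
```

If the reported times climb steadily, something is accumulating from one iteration to the next; if they jump when the method changes, look at that method's code.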

More importantly, do they alter the data? Do they alter (-save-) the data file?
These latter points may be most relevant.

The important question is: after one iteration, can the next one run without reloading (-use-ing) the data? If not, can you rework your code (in sma, dma, trb, and cbo) to make it so? That is, have them not drop or add records. If they generate or replace values of variables, have those be in designated variables that can be reset easily. The idea is that if the dataset changes in a significant way, you want to be able to bring it back to its pre-iteration state easily, using -drop- or -replace ... = .-. The last thing you should have to do is reload the data for each iteration. Reloading the data may be 1000 times slower than continuing with the same data. (I don't have any real statistics on that factor, but 1000 is not unreasonable.)

If you can arrange it so that you don't need to reload on each iteration (or if it is already coded that way), then can you move the -use- command to the top, before the first foreach?
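
As a rough illustration of that arrangement, here is a minimal sketch. It assumes the rule program writes only to a designated scratch variable and never adds or drops observations. The names fakerule, sig_tmp, and r(profit) are hypothetical, and the auto dataset stands in for your data.

```stata
clear all
set more off

* Hypothetical stand-in for one of sma/dma/trb/cbo: it writes only to the
* scratch variable sig_tmp and leaves the observations untouched
program define fakerule, rclass
    syntax , m(integer)
    capture drop sig_tmp
    generate byte sig_tmp = mpg > `m'
    quietly summarize price if sig_tmp
    return scalar profit = r(mean)
end

sysuse auto, clear              // loaded once, before the loops
local n0 = _N
foreach m of numlist 15 20 25 {
    fakerule, m(`m')
    display "m=`m'  mean price = " %9.2f r(profit)
    capture drop sig_tmp        // back to the pre-iteration state
    assert _N == `n0'           // no observations were added or dropped
}
```

The -assert- inside the loop is a cheap guard: it stops the run if a rule program ever changes the number of observations, which would mean the data really do need restoring.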

Note that the repeated reloading will cause slowness, but may not exactly explain why it gets progressively slower. That may be an operating-system issue. (It may be that after the first -use-, the file is in cache, enabling some fast loads; later it is knocked out of cache.)

One other point is that it is not always good to -set mem- to a high value. It should be high enough to get the job done, plus maybe a little margin of safety. Otherwise, you are grabbing space that might better be left for the operating system to make good use of (such as for caching files) and to run everything (including your task) more smoothly and faster.


HTH
--David


At 02:52 PM 1/14/2013, you wrote:

Thank you for your response.
I did mean to say that there are 7 nested loops, because there are 7
parameters that can change values, and I do not know of another way to
have this done.

So the code is as follows:
** Initialization
clear all
set mem 100m
set more off, perm
set autotabgraphs on, perm
graph drop _all
cd "C:\Users\Trades\Stata"
sysdir set PERSONAL "C:\Users\Trades\Stata\Ado"

** Setting parameters
 global freq "1m"
 global fcrc "EUR USD GBP AUD NZD NZD EUR EUR AUD"  // foreign currencies
 global bcrc "USD JPY USD USD USD JPY JPY GBP JPY"  // base currencies
 global startdate = mdy(1,1,1994)
 global enddate = mdy(12,31,2010)
 global subperiod "2002jan01 2008sep01"   // specify subperiods

 local smam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for sma method
 local smab "0 0.0005 0.001 0.005 0.01 0.05"   // parameter b for sma method
 local smad "2 3 4 5"         // parameter d for sma method
 local smac "5 10 25"          // parameter c for sma method
 local sman "0"
 local smak "0"

 local dmam "2 5 10 15 20 25 50 100 150 200 250" // parameter m for dma method
 local dman "2 5 10 15 20 25 50 100 150 200"   // parameter n for dma method
 local dmab "0 0.0005 0.001 0.005 0.01 0.05"   // parameter b for dma method
 local dmad "2 3 4 5"         // parameter d for dma method
 local dmac "5 10 25"          // parameter c for dma method
 local dmak "1000"
 local trbn "5 10 15 20 25 50 100"      // parameter n for trb method

 local trbb "0.0005 0.001 0.005 0.01 0.025 0.05" // parameter b for trb method
 local trbd "2 3 4 5"         // parameter d for trb method
 local trbc "1 5 10 25"         // parameter c for trb method
 local trbm "1000"
 local trbk "0"
 local cbon "5 10 15 20 25 50 100 200"     // parameter n for cbo method
 local cbok "0.001 0.005 0.01 0.05 0.1"     // parameter k for cbo method
 local cbob "0.0005 0.001 0.005 0.01 0.05"    // parameter b for cbo method
 local cbod "0 1 2"          // parameter d for cbo method
 local cboc "1 5 10 25"         // parameter c for cbo method
 local cbom "1000"


** Loops to go through all methods and all parameters
 foreach med in sma dma trb cbo {      // loop through all the rules
  foreach m of local `med'm {      // all the m values
   foreach n of local `med'n {     // all the n values
     foreach k of local `med'k {    // all the k values
      foreach b of local `med'b {   // all the b values
       foreach d of local `med'd {  // all the d values
        foreach c of local `med'c { // all the c values
         clear
         use data
         `med', datevar(date) m(`m') n(`n') k(`k') b(`b') d(`d') c(`c')

        }
       }
      }
     }
    }
  }
 }

`med' is calling one of the four ado files that I wrote: sma, dma,
trb, and cbo.  It basically calculates profits based on the rule and
the parameters fed to the program, so I think each iteration does just
about the same amount of calculation.

The next part (which I haven't written) keeps track of results from
some methods and parameters that satisfy certain conditions. In my
opinion, this would be a minor thing if I can get the current code to
run in a reasonable amount of time.

Any help explaining why the program slows down so significantly after
a couple of hundred iterations would be much appreciated.

Thank you.

I think that most of us would agree that we would need to see your code
to be able to say what the problem is. Meanwhile, did you mean that
the loops are nested to a depth of 7? That's unusually deep.

Just generally speaking, with loops, there are often actions that are
placed inside that don't need to be there; they can be moved "up" or
"out" (sometimes requiring a bit of modification) so as to not be done
multiple times unnecessarily. From what you describe, it seems that
the work done in each iteration is accumulating; each iteration does a
bit more work than the previous. There may be some unnecessary
repetition as described above. But it also seems that there is
something that grows and gets dragged along with each iteration --
again possibly unnecessarily. This is analogous to a cumulative song,
such as "The Twelve Days of Christmas"; the 12th verse is much longer
than the first.

On the other hand, does the true complexity of the task grow with each
iteration? Do you expect the 300th iteration to naturally be more
complex to perform than the first?

Show us your code if you want more help.

HTH
--David

--
Ly Tran
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

