Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Slowing process when running a program with multiple nested loops


From   Ly Tran <lyhuyen@brandeis.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Slowing process when running a program with multiple nested loops
Date   Mon, 14 Jan 2013 22:23:46 -0500

Hi David,

Thank you for your quick and very thorough response.
Thanks to your suggestions, I went back and found a way to speed up
the program by rewriting the 4 ado files (which in turn allowed me to
cut back on the nested loops).

Thanks so so much.

--
Ly Tran



Hi,

Your code looks fairly straightforward, after looking at it for a minute.

At first, it looked cryptic, but once I understood it, I realized that I
would code it very similarly.

There are a few unused macros, but that's irrelevant.

We don't see the code for sma, dma, trb, or cbo. Do these get
progressively complicated? Is it possible that there is a sudden jump in
slowness when you switch from sma to dma, or to trb or cbo?

Or is it gradual through all the iterations?

(But you did say that they do about the same amount of calculation.)
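
One way to tell a sudden jump from a gradual slowdown is to time each pass with Stata's -timer- command. This is only a rough sketch: -slowstep- is a hypothetical stand-in for sma, dma, trb, or cbo (whose code we have not seen), and the auto dataset stands in for your data.

```stata
* Sketch: time each pass of a loop separately.
* -slowstep- is a hypothetical placeholder, not one of the real ado files.
capture program drop slowstep
program define slowstep
    quietly summarize price
end

sysuse auto, clear
forvalues i = 1/5 {
    * Reset timer 1 so that each pass is measured on its own.
    timer clear 1
    timer on 1
    slowstep
    timer off 1
    * -timer list- leaves the elapsed seconds in r(t1).
    quietly timer list 1
    display "iteration `i': " %9.4f r(t1) " seconds"
}
```

If the reported times climb steadily, the problem is probably cumulative; if they jump at one particular step, look at that step first.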

More importantly, do they alter the data? Do they alter (-save-) the data file?
These latter points may be most relevant.

The important question is, after one iteration, can the next one run
without reloading (-use-ing) the data? If not, can you rework your
code (in sma, dma, trb, and cbo) to make it so? That is, have them
not drop or add records. If they generate or replace values of
variables, have those be in designated variables that can be reset
easily. The idea is that if the dataset changes in a significant
way, then you want to be able to bring it back to its pre-iteration
state easily -- using -drop- or -replace ... = .-. The last thing you
should have to do is to reload the data for each iteration. Reloading
the data may be 1000 times slower than continuing with the same data.
(I don't have any real statistics on that factor, but 1000 is not
unreasonable.)

If you can arrange it so that you don't need to reload on each
iteration (or if it is already coded that way), then can you move
the -use- command to the top -- before the first foreach?
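
To make the idea concrete, here is a minimal sketch of loading the data once and resetting a designated scratch variable after each pass. The program -onestep- and the variable name -work- are made up for illustration; your own ado files would take the place of -onestep-, and your own -use- would replace -sysuse auto-.

```stata
* Sketch: load once, then reset scratch variables instead of reloading.
* -onestep- and -work- are hypothetical names.
capture program drop onestep
program define onestep
    args k
    * Write results only into the designated scratch variable.
    quietly replace work = price * `k'
end

* The data are loaded once, before the loop.
sysuse auto, clear
generate double work = .

foreach k of numlist 1 2 3 {
    onestep `k'
    quietly summarize work
    display "k = `k', mean of work = " r(mean)
    * Bring the dataset back to its pre-iteration state.
    quietly replace work = .
}
```

Because no observations are added or dropped and only -work- is changed, each pass starts from the same data without another -use-.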

Note that the repeated reloading will cause slowness, but may not
exactly explain why it gets progressively slower. But that may be an
operating-system issue. (It may be that after the first -use-, the
file is in cache, enabling some fast loads; later it is knocked out of
cache.)

One other point is that it is not always good to -set mem- to a high
value. It should be high enough to get the job done, plus maybe a
little margin of safety. Otherwise, you are grabbing space that might
be better left for the operating system to make good use of (such as for
caching files) and to run everything (including your task) more smoothly
and faster.

HTH
--David
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/