Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Improving code speed

 From Luis Aguiar To statalist@hsphsun2.harvard.edu Subject Re: st: Improving code speed Date Wed, 5 Jun 2013 12:39:30 +0200

```Dear Georges and Nick,

Thanks again for the advice! I was able to translate my code into mata
and it worked perfectly. The gains in speed are indeed amazing!

Best regards,
Luis

2013/5/23 George Vega Yon <g.vegayon@gmail.com>:
> Dear Luis,
>
> You can do both, translating part of your program into mata and
> running it in parallel fashion. I recommend you to start by rewriting
> the subloop, it could be something like this (is just an example),
> following the previous example that I sent
>
> ______________ MATA ______________
> cap mata: mata drop myfun()
> mata:
> real colvector myfun(real scalar reps, real colvector y, real colvector z) {
> real scalar i, tsum
> real colvector x
> for(i=1;i<=reps;i++) {
> x = z :+ rnormal(rows(z),1,0,1)
> y = y :+ ((x):^2):/ sum(x)
> }
> return(y)
> }
> end
> ______________ MATA ______________
>
>
> ______________ STATA ______________
> forvalues k=`=id[1]'(1)`=id[_N]'{
>
> gen subs=(id<=`k')   // Define the subsample to be used
> gen Y`k'=0           // gen the intermediate Y`k'
>
> mata: st_store(1::`k', "Y`k'", myfun(\$reps, st_data(1::`k', "Y`k'"),
> st_data(1::`k',"z")))
>
> replace Y`k'=Y`k'/\$reps      // average Y from the 100 simulations
> replace Y= Y + Y`k'
> drop Y`k' subs
> }
> ______________ STATA ______________
>
> Mata is fast, take this for example, two algorithms that generate the
> same thing take quite different times to finish
>
> ____________________________
> timer clear
> // Stata loop + Substata loop
> timer on 1
> forval j = 1/1000 {
> forval i = 1/1000 {
> scalar x = sqrt(exp(1/c(pi))) + sqrt(exp(1/c(pi)))
> }
> }
> timer off 1
>
> // equals 2.3450392
> di x
>
> // Stata loop + mata loop
> timer on 2
> forval j = 1/1000 {
> mata: for(i=1;i<=1000;i++) x = sqrt(exp(1/c("pi"))) + sqrt(exp(1/c("pi")))
> }
> timer off 2
>
> // equals 2.3450392
> mata:x
>
> // In this example, the second version took ~ 1/12 of the time the first did
> . timer list
>    1:      8.32 /        1 =       8.3240
>    2:      0.72 /        1 =       0.7220
> ____________________________
>
> You will save a lot of time!
>
> Best regards,
>
> George Vega Yon
> 7 647 2552
>
>
> 2013/5/23 Nick Cox <njcoxstata@gmail.com>:
>> Thanks for confirmation of mistakes.
>>
>> So, here is an extra comment as you are concerned about cleaning up
>> your code and speeding it up. I can see no reason at all for copying
>> variable -epsilon- to matrix -epsilon- using -mkmat-. You have a
>> variable -epsilon- that should be fine for your purpose. In fact that
>> was I suggested yesterday.
>>
>> If you post revised, complete and correct code you may get further
>> comments, but no definite promises from me.
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 23 May 2013 09:01, Luis Aguiar <stataluis@gmail.com> wrote:
>>> Hi George and Nick,
>>>
>>> Thanks a lot for your responses.
>>>
>>> Nick: Yes, the code works but there are indeed a couple of mistakes in
>>> the one I copied. I actually use the command " mkmat epsilon if
>>> id<=`reps' " after generating the variable epsilon (id is indeed _n in
>>> a variable). The second line in your comment (b) should read " replace
>>> x`i'=z + epsilon[`i',1] if id==`k'  " . Sorry about all that. Thanks
>>> my code along with George's comments in order to speed things up.
>>>
>>> George: Thanks for your suggestions as well. I wasn't sure if it would
>>> be worth going into mata, but I will try it now. Your parallel code
>>> seems very interesting too. Do you think it would go faster than using
>>> mata?
>>>
>>> Again, thanks a lot to both of you!
>>>
>>> Cheers,
>>> Luis
>>>
>>> 2013/5/22 Nick Cox <njcoxstata@gmail.com>:
>>>> Please use your full real name. See Statalist FAQ for that request and why.
>>>>
>>>> Some speed-ups are likely to be possible here, but I first I note
>>>> several puzzles with this code.
>>>>
>>>> You don't say so, but presumably -id- is _n in a variable.
>>>>
>>>> -z- is unexplained.
>>>>
>>>> More problematically,
>>>>
>>>> (a) You -generate- a variable -epsilon- but you refer to a matrix -epsilon-.
>>>>
>>>> (b) The two lines below won't work as once -x`i'- exists the second
>>>> command will fail.
>>>>
>>>>                 gen x`i'=z \\ generate simulated variable
>>>>                 gen x`i'=z + epsilon[`i',1] if id==`k' \\ Add the random part
>>>>
>>>> Why do you say that it works? Did you copy some buggy version by accident?
>>>>
>>>> Better to post self-contained code that works.
>>>>
>>>> Speed-ups, apart from using Mata. (See George Vega Yon's post.)
>>>>
>>>> 1. Too much copying from one variable to another. I could be wrong,
>>>> but some variables appear to be mostly zero, and you are just copying
>>>> constants. Think in terms of scalars instead.
>>>>
>>>> 2. Use -summarize, meanonly- to get sums. -egen- is very slow at this.
>>>>
>>>> 3. Use -in 1/`k'- or -in `k'- wherever possible. Whenever there is a
>>>> choice between -if- and -in- for the same problem, -in- is faster.
>>>>
>>>> Some example code:
>>>>
>>>>   gen x`i'=z
>>>>   replace x`i'= x`i' + epsilon[`i'] in `k'
>>>>
>>>>   su x`i' in 1/`k', meanonly
>>>>   replace Y`k' = (x`i')^2/r(sum) in `k'
>>>>
>>>>
>>>> Nick
>>>> njcoxstata@gmail.com
>>>>
>>>>
>>>> On 22 May 2013 18:25, Luis <stataluis@gmail.com> wrote:
>>>>> Dear statalist  users,
>>>>>
>>>>> I am running into a "loop efficiency problem" in that I have to
>>>>> construct a variable using many iterations and I am not sure whether I
>>>>> am being as efficient as possible. Given the number of observations
>>>>> that I have and with my current code, I have to wait days for my code
>>>>> to finish running! Here's my problem:
>>>>>
>>>>> I have a total of 50000 observations and need to construct a variable
>>>>> Y that will be computed using different subsamples of these
>>>>> observations. In particular,
>>>>> Y=Y1 when the subsample contains only the first observation,
>>>>> Y=Y2 when the subsample contains observations 1 and 2,
>>>>> Y=Y3 when the subsample contains observations 1, 2 and 3 etc until
>>>>> Y=Y50000.
>>>>>
>>>>> The idea is therefore to loop over the sample and define the subsample
>>>>> which contains observations 1 until k and construct the variable
>>>>> Y`k'=Yk if id==k and Y`k'=0 if id!=k. Then sum the variables Y`k'
>>>>> after each loop to end up with the final variable Y.
>>>>>
>>>>> To further complicate things, the variable Y needs to be the average
>>>>> of 100 simulations that depend on draws taken from a normal
>>>>> distribution. Hence I need to do a loop within the initial loop in
>>>>> order to do the 100 simulations.
>>>>>
>>>>> My code therefore looks like this:
>>>>>
>>>>> _____________________________________________________________________________________
>>>>>
>>>>> gen Y=0
>>>>>
>>>>> local reps=100 \\ define the number of simulations
>>>>>
>>>>> gen epsilon=rnormal() \\ generate the random var for the simulations
>>>>>
>>>>> forvalues k=1(1)50000{
>>>>>
>>>>> gen subs=(id<=`k')   \\ Define the subsample to be used
>>>>> gen Y`k'=0      \\ gen the intermediate Y`k'
>>>>>
>>>>>         forvalues i=1(1)`reps'{
>>>>>
>>>>>                 gen x`i'=z \\ generate simulated variable
>>>>>                 gen x`i'=z + epsilon[`i',1] if id==`k' \\ Add the random part
>>>>>
>>>>>         gen t=(x`i')^2
>>>>>         bysort subs: egen tsum=sum(x`i')
>>>>>
>>>>>         gen Y_`i'=t/tsum if id ==`k' \\ Construct Y for simulation i
>>>>>         replace Y_`i'=0 if id!=`k'
>>>>>
>>>>>         replace Y`k'=Y`k' + Y_`i'
>>>>>                 replace Y`k'=0 if id!=`k'
>>>>>
>>>>>         drop Y_`i' t tsum x`i'
>>>>>         }
>>>>>
>>>>> replace Y`k'=Y`k'/`reps'      // average Y from the 100 simulations
>>>>> replace Y= Y + Y`k'
>>>>> drop Y`k' subs
>>>>>         }
>>>>>
>>>>> ____________________________________________________________________________________
>>>>>
>>>>>
>>>>> The code runs fine, but I takes a lot of time since it has to
>>>>> construct 100 variables for each of the 50000 iterations. I have tried
>>>>> many different possibilities and I can't think of another way of
>>>>> constructing Y.
>>>>>
>>>>> Any tip or suggestion that would help improve the efficiency of my
>>>>> code would be greatly appreciated!!!
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```