Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: mata: why this loop is so slow


From   "U.A.QU.AP" <uaquap@gnet.tn>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: mata: why this loop is so slow
Date   Tue, 30 May 2006 18:52:38 +0100 (CET)

Thanks to william Gould,
1) Obviously the F matrix, nt, T, h and beta are defined
before the loop that I've posted.
nt=35900, T=21, beta is a matrix of dim (46*1) and h is
another matrix. 
The IML loop is obviously closed but my previous copy/paste
causes the ommission of the word end.
2) secondly the two code IML and Mata give the same result.
3) Anyway, william Gould  writes
Putting all that aside, the one inefficiency I spotted is

jacobi[(chw - T + t[i]), (cq*t[i] -cq+1)..cq*t[i]] =
                                                 
q[i,.]:*dmills[i,.]

which would be better coded

i1 = (chw - T + t[i])
jacobi[| i1, cq(t[i]-cq+1 \ i1, cq*t[i] |] =                 
                                q[i,.]:*dmills[i,.]

When I take the Bill's suggestion after correcting his little
typo ( your ( sould read * in the jacobi) execution time
decrease by 30 seconds
4) Finally, Bill writes
Then I look at the line to update -jacobi[]-.  In the IML
code, it is just
a matrix row:  -dmills[i,]-.  In the Mata code, it is a
calculation
-q[i,.]:*dmills[i,.]-.
My experimentation shows that this difference between the two
code is irrelevant has no efect on execution time.
AbdelRahmen 


- Original message ---- 
>Date: Tue, 30 May 2006 09:58:19 -0500
>From: wgould@stata.com (William Gould, Stata)  
>Subject: Re: st: mata:  why this loop is so slow  
>To: statalist@hsphsun2.harvard.edu
>
>AbdelRahmen <uaquap@gnet.tn> writes, 
>
>> I've written this portion of code (see below) and it works
>> perfectly the problem is the execution time which is about 15
>> mn. Under SAS (IML) my co-author code the same thing and the
>> execution time is 1 (one) minute on 35900 observations. is
>> there a problem with my programmation Style?
>
>He then attaches the Mata and IML code.
>
>I do spot on inefficiency in his Mata code, but there are
enough other 
>questions I have to make me uncertain that the inefficiency I
spotted 
>is the cause.
>
>The Mata code reads, 
>
>        w = st_data(., tokens(st_local("instrume")))    // matrix
>        q = st_data(., tokens(st_local("vprobit")))     // matrix
>        dmills=st_data(., "dmills")                     //
colvector
>        s=st_data(., "selection")                       //
colvector
>        t=st_data(., "t")                               //
colvector
>
>        chw=cols(w)                                     // scalar
>        cq=cols(q)                                      // scalar
>
>        for (i=1; i<=nt; i++) {
>                jacobi=J(chw, (cq*T) ,0)
>                jacobi[(chw - T + t[i]), (cq*t[i]
-cq+1)..cq*t[i]] = 
>                                                  
q[i,.]:*dmills[i,.]
>                F = F + (s[i,.]:*h[i,.]')*beta'*jacobi
>        }
>
>        F=(1/n)*F
>
>The above does not work because:
>
>    1.  -F- is unitialized.  I assume AbdelRahmen defined -F
= 0- before the 
>        start of the loop.
>
>    2.  -nt- is not defined.  I assume the defintion was -nt
= rows(w)-.
>
>    3.  -T- is not defined; I have no idea what it should be.
>
>    4.  -h[]- is not defined; I have no idea what it should be.
>
>    5.  -beta- is not defined.
>
>Putting all that aside, the one inefficiency I spotted is
>
>                jacobi[(chw - T + t[i]), (cq*t[i]
-cq+1)..cq*t[i]] = 
>                                                  
q[i,.]:*dmills[i,.]
>
>which would be better coded
>
>	        i1 = (chw - T + t[i])
>		jacobi[| i1, cq(t[i]-cq+1 \ i1, cq*t[i] |] = 
>                                                  
q[i,.]:*dmills[i,.]
>
>This is more similar to how the IML code read, too.
>
>However, when I compare the IML code to the Mata code, I see
lots of 
>other differences, making me wonder whether they are coding
the same 
>solution.
>
>The IML code reads, 
>
>        dmills=-mill1#(qpi+mill1)#q;
>        do i=1 to N;   
>                jacobi=j(chw,cq#T,0);
>                period=t[i];
>               
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];  
>                F=F+h[i,]`*(beta`*jacobi);
>                F=(F/N)
>
>I have performed the indentation, what AbdelRahmen supplied
looked like this,
>        dmills=-mill1#(qpi+mill1)#q;
>        do i=1 to N;   
>        jacobi=j(chw,cq#T,0);
>        period=t[i];
>       
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];  
>        F=F+h[i,]`*(beta`*jacobi);
>        F=(F/N)
>
>In any case, the -do- is not closed!  I assume the actual IML
code reads, 
>
>        dmills=-mill1#(qpi+mill1)#q;
>        do i=1 to N;   
>                jacobi=j(chw,cq#T,0);
>                period=t[i];
>               
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];  
>                F=F+h[i,]`*(beta`*jacobi);
>        end;
>        F=(F/N)
>
>but I am not certain of that.  If the end were omitted, might
IML have just
>run the loop on the first statement?  That would certainly
explain a difference
>in run times.
>
>Then I look at the line to update -jacobi[]-.  In the IML
code, it is just 
>a matrix row:  -dmills[i,]-.  In the Mata code, it is a
calculation
>-q[i,.]:*dmills[i,.]-.  
>
>Anyway, there are enough differences that I cannot be sure
what is going 
>on.
>
>-- Bill
>wgould@stata.com
>*
>*   For searches and help try:
>*   http://www.stata.com/support/faqs/res/findit.html
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index