# Re: st: mata: why this loop is so slow

 From wgould@stata.com (William Gould, Stata) To statalist@hsphsun2.harvard.edu Subject Re: st: mata: why this loop is so slow Date Tue, 30 May 2006 09:58:19 -0500

```AbdelRahmen <uaquap@gnet.tn> writes,

> I've written this portion of code (see below) and it works
> perfectly the problem is the execution time which is about 15
> mn. Under SAS (IML) my co-author code the same thing and the
> execution time is 1 (one) minute on 35900 observations. is
> there a problem with my programmation Style?

He then attaches the Mata and IML code.

I do spot on inefficiency in his Mata code, but there are enough other
questions I have to make me uncertain that the inefficiency I spotted
is the cause.

The Mata code reads,

w = st_data(., tokens(st_local("instrume")))    // matrix
q = st_data(., tokens(st_local("vprobit")))     // matrix
dmills=st_data(., "dmills")                     // colvector
s=st_data(., "selection")                       // colvector
t=st_data(., "t")                               // colvector

chw=cols(w)                                     // scalar
cq=cols(q)                                      // scalar

for (i=1; i<=nt; i++) {
jacobi=J(chw, (cq*T) ,0)
jacobi[(chw - T + t[i]), (cq*t[i] -cq+1)..cq*t[i]] =
q[i,.]:*dmills[i,.]
F = F + (s[i,.]:*h[i,.]')*beta'*jacobi
}

F=(1/n)*F

The above does not work because:

1.  -F- is unitialized.  I assume AbdelRahmen defined -F = 0- before the
start of the loop.

2.  -nt- is not defined.  I assume the defintion was -nt = rows(w)-.

3.  -T- is not defined; I have no idea what it should be.

4.  -h[]- is not defined; I have no idea what it should be.

5.  -beta- is not defined.

Putting all that aside, the one inefficiency I spotted is

jacobi[(chw - T + t[i]), (cq*t[i] -cq+1)..cq*t[i]] =
q[i,.]:*dmills[i,.]

which would be better coded

i1 = (chw - T + t[i])
jacobi[| i1, cq(t[i]-cq+1 \ i1, cq*t[i] |] =
q[i,.]:*dmills[i,.]

This is more similar to how the IML code read, too.

However, when I compare the IML code to the Mata code, I see lots of
other differences, making me wonder whether they are coding the same
solution.

The IML code reads,

dmills=-mill1#(qpi+mill1)#q;
do i=1 to N;
jacobi=j(chw,cq#T,0);
period=t[i];
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];
F=F+h[i,]`*(beta`*jacobi);
F=(F/N)

I have performed the indentation, what AbdelRahmen supplied looked like this,
dmills=-mill1#(qpi+mill1)#q;
do i=1 to N;
jacobi=j(chw,cq#T,0);
period=t[i];
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];
F=F+h[i,]`*(beta`*jacobi);
F=(F/N)

In any case, the -do- is not closed!  I assume the actual IML code reads,

dmills=-mill1#(qpi+mill1)#q;
do i=1 to N;
jacobi=j(chw,cq#T,0);
period=t[i];
jacobi[chw-T+period,cq#(period-1)+1:cq#period]=dmills[i,];
F=F+h[i,]`*(beta`*jacobi);
end;
F=(F/N)

but I am not certain of that.  If the end were omitted, might IML have just
run the loop on the first statement?  That would certainly explain a difference
in run times.

Then I look at the line to update -jacobi[]-.  In the IML code, it is just
a matrix row:  -dmills[i,]-.  In the Mata code, it is a calculation
-q[i,.]:*dmills[i,.]-.

Anyway, there are enough differences that I cannot be sure what is going
on.

-- Bill
wgould@stata.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```