Matt Weinber <mweinber@gmail.com> has written Mata function -qnewsolve()- to
find roots of nonlinear equations using a quasi-newton's method. Among other
things, the function receives function f() -- the function for which the roots
are to be found -- and now he would like to pass those arguments to f(). That
is, -qnewsolve()- would also receive arguments a1, a2, ..., an, and then,
inside, it would call f(x, a1, a2, ..., an).
So what's the difficulty? The difficulty is that the number of
arguments will vary along with f().
Here is Matt's function as it currently stands, which doesn't exactly
work, but illustrates what he wants to do:
numeric matrix qnewsolve(pointer matrix f, numeric matrix x,
real scalar crit, real scalar itmax,
real matrix a1, real matrix a2, ..., real matrix an)
{
delta=1e-6
alpha=1e-3
nv=max((rows(x),cols(x)))
tvec=delta*I(nv)
done=0
f0=(*f)(x) /* HERE I WANT TO ALSO GIVE f a1,a2,...,an*/
af0=colsum(abs(f0))
af00=af0
itct=0
while(!done){
grad = ((*f)(x*J(1,nv,1)+tvec)-(f0)*(J(1,nv,1)))/delta
/*HERE TOO*/
/*a bunch of operations...*/
if (itct>=itmax) {
done=1
rc=4
}
else if (af0<crit) {
done=1
rc=0
}
}
return(x\rc)
}
I have a solution to offer. It will require adding a couple of subroutines,
but it will keep the main code looking nice.
Here is Matt's original code, and I'll Ive done is delete parts in order to
highlight the relevant parts, and I have made a couple of minor improvements
which I will explain. I have not made the substantive changes yet:
numeric matrix qnewsolve(pointer(function) scalar f, numeric matrix x,
real scalar crit, real scalar itmax,
a1, a2, a3, a4, a5)
{
...
f0=(*f)(x) /* HERE I WANT TO ALSO GIVE f a1,a2,...,an*/
...
grad = ((*f)(x*J(1,nv,1)+tvec)-(f0)*(J(1,nv,1)))/delta
...
}
All I have done so far is
1. Change a1, a2, ..., an to a1, a2, a3, a4, a5. Let's agree
on a maximum of five arguments for f().
2. Change the declaration of a1, a2, ..., a5 each from being a
-real matrix- to being nothing, equivalent to -transmorphic-.
That's better because there is no reason to require that the
f()'s extra arguments be real or be matrices. What they are
is f()'s business, not ours.
3. Change the declaration of f from -pointer matrix f- to
pointer(function) scalar f-. This is not important, but
I want (for documentation and understanding purposes) to emphasize
that f is a pointer to a function and that f itself is a scalar.
That above is still Matt's code.
Now here's my substantive change. I have marked changed or new lines
with an X on the left:
numeric matrix qnewsolve(pointer(function) scalar f, numeric matrix x,
real scalar crit, real scalar itmax,
X1 |a1, a2, a3, a4, a5)
{
X2 transmorphic p
X3 p = callf_setup(f, args()-4, a1, a2, a3, a4, a5)
...
X4 f0 = callf(p, x)
...
X5 grad = (callf(x*J(1,nv,1)+tvec)-f0*J(1,nv,1))/delta
...
}
Here's what I did:
X1. Note the | in front of a1. I made arguments a1, a2, a3, a4, a5
optional. The user can specify none, or a1, or a1 and a2, or ...
X2. I need an extra variable; I'm calling it p.
X3. I need to do a setup. -callf_setup()- (supplied below) is going
to return p. It's a secret to -qnewsolve()- what -callf_setup()-
puts in p, but I'll tell you that p is going to contain the address
of user's function &f(), the number of arguments, and pointers to the
arguments.
X4. Where Matt coded -(*f)(x)-, I now code -callf(p, x)-.
-callf()- is the second new subroutine I will need to write.
X5. Same commend as X4.
Finally, here are my two new subroutines:
pointer vector callf_setup(pointer(function scalar) f, real scalar n,
a1, a2, a3, a4, a5)
{
pointer vector p
real scalar mycopy
p = J(1, 5+2, .)
p[1] = f
mycopy = n ; p[2] = &mycopy
p[3] = &a1 ; p[4] = &a2 ; p[5] = &a3 ; p[6] = &a4 ; p[7] = &a5
return(p)
}
transmorphic callf(pointer vector p, real matrix x)
{
real scalar n
n = *(p[2])
if (n==0) return( (*p[1])(x) )
if (n==1) return( (*p[1])(x, *p[3]) )
if (n==2) return( (*p[1])(x, *p[3], *p[4]) )
if (n==3) return( (*p[1])(x, *p[3], *p[4], *p[5]) )
if (n==4) return( (*p[1])(x, *p[3], *p[4], *p[5], *p[6])
return( (*p[1])(x, *p[3], *p[4], *p[5], *p[6], *p[7]) )
}
-callf()- itself may not seem particularly elegant to you, but it works, and
I suppose there is a certain elegance in that. That the code inside is not
elegant is irrelevant. -callf()- itself, viewed from the outside, is elegant.
Here's how it all works. -callf_setup()- returns p, a 7-element
pointer vector containing,
*p[1] the user function f()
*p[2] the number of extra arguments
*p[3] the 1st extra argument
*p[4] the 2nd extra argument
..
*p[7] the 5th extra argument.
-callf()- calls f() (i.e., *p[1]) and obtains from the rest of p whatever else
it needs to pass along to f().
There is only one part that is tricky in all of this. In -callf_setup()-,
the part of the code that fills in p[2] reads
mycopy = n ; p[2] = &mycopy
Why not just -p[2] = &n-? Because n was passed to -callf_setup()-, and the
caller might later change, n unbeknowst to -callf()-. Then *p[2] would
change, too. -callf_setup()- wants to be sure that *p[2] continues to record
the number of arguments. -callf_setup()-'s solution is to make its own copy
of n, and record the address of that.
That complication was caused because -callf_setup()- made p a pointer vector,
so everytrhing in p[] had to be a pointer, so it could not directly record
n, it had to record an address. A more elegant solution would have made p a
structure. Why would it be more elegant? Only becuse it would be easier to
read and understand the code. In my opinion, the previous solution does not
exceed the threshhold where structures are actually necessary, but I never
object to added clarity.
Here's how the two routines (plus structure definition) could look:
struct callf_forp {
pointer(function) scalar f
real scalar n
pointer scalar p1
pointer scalar p2
pointer scalar p3
pointer scalar p4
pointer scalar p5
}
struct callf_forp scalar callf_setup(
pointer(function scalar) f,
real scalar n, a1, a2, a3, a4, a5)
{
struct callf_forp scalar p
p.f = f
p.n = n
p.p1 = &a1 ; p.p2 = &a2 ; p.p3 = &a3 ; p.p4 = &a4 ; p.p5 = &a5
}
transmorphic callf(struct callf_forp p, real matrix x)
{
if (p.n==0) return( (*p.f)(x) )
if (p.n==1) return( (*p.f)(x, *p.p1) )
if (p.n==2) return( (*p.f)(x, *p.p1, *p.p2) )
if (p.n==3) return( (*p.f)(x, *p.p1, *p,p3, *p.p4) )
if (p.n==4) return( (*p.f)(x, *p.p1, *p.p2, *p.p3, *p.p4)
return( (*p.f)(x, *p.p1, *p.p2, *p.p3, *p.p4, *p.p5) )
}
Either version of -callf_setup()- and -callf()- will work equally well.
I want to direct your attention all the way back to the improved (but hardly
changed) -qnewsolve()- and, in particular, the lines marked X2 and X3:
transmorphic p
p = callf_setup(f, args()-4, a1, a2, a3, a4, a5)
What I want you to notice is that, in -qnewsolve()-, I merely declared p to be
-transmorphic-.
In my first version of -callf_setup()-, p is in fact a pointer vector. In my
second version, p is a struct callf_forp scalar.
By declaring p to be transmorphic, it does not matter what p in fact is. Nor
is it any business of the caller (-qnewsolve()-) what p is. From the caller's
point of view, it is to obtain p from -callf_setup()- and pass p to -callf()-.
That is very, very good style because it completes the separation of
-qnewsolve()- from the solution. The solution is provided by the -callf()-
system and we, the programmers of that system, are free to change the internal
workings of the system without having to tell, or recompile, any routines that
use it.
-- Bill
wgould@stata.com
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/