Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: mata: passing additional arguments to functions passed to functions


From   wgould@stata.com (William Gould, Stata)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: mata: passing additional arguments to functions passed to functions
Date   Thu, 08 Jun 2006 09:37:03 -0500

Matt Weinber <mweinber@gmail.com> has written Mata function -qnewsolve()- to
find roots of nonlinear equations using a quasi-newton's method.  Among other
things, the function receives function f() -- the function for which the roots
are to be found -- and now he would like to pass those arguments to f().  That
is, -qnewsolve()- would also receive arguments a1, a2, ..., an, and then,
inside, it would call f(x, a1, a2, ..., an).

So what's the difficulty?  The difficulty is that the number of 
arguments will vary along with f().

Here is Matt's function as it currently stands, which doesn't exactly
work, but illustrates what he wants to do:

        numeric matrix qnewsolve(pointer matrix f, numeric matrix x, 
                        real scalar crit, real scalar itmax,
                        real matrix a1, real matrix a2, ..., real matrix an)
        {
                delta=1e-6
                alpha=1e-3
                nv=max((rows(x),cols(x)))
                tvec=delta*I(nv)
                done=0
                f0=(*f)(x) /* HERE I WANT TO ALSO GIVE f a1,a2,...,an*/
                af0=colsum(abs(f0))
                af00=af0
                itct=0
                while(!done){
                        grad = ((*f)(x*J(1,nv,1)+tvec)-(f0)*(J(1,nv,1)))/delta
                        /*HERE TOO*/
                        /*a bunch of operations...*/
                         if (itct>=itmax) {
                                    done=1
                                    rc=4
                        }
                        else if (af0<crit) {
                                 done=1
                                 rc=0
                         }
                }
                return(x\rc)
        }


I have a solution to offer.  It will require adding a couple of subroutines,
but it will keep the main code looking nice.

Here is Matt's original code, and I'll Ive done is delete parts in order to
highlight the relevant parts, and I have made a couple of minor improvements
which I will explain.  I have not made the substantive changes yet:

        numeric matrix qnewsolve(pointer(function) scalar f, numeric matrix x, 
                        real scalar crit, real scalar itmax,
                        a1, a2, a3, a4, a5)
        {
                ...
                f0=(*f)(x) /* HERE I WANT TO ALSO GIVE f a1,a2,...,an*/
                ...
                        grad = ((*f)(x*J(1,nv,1)+tvec)-(f0)*(J(1,nv,1)))/delta
                ...
        }

All I have done so far is

    1.  Change a1, a2, ..., an to a1, a2, a3, a4, a5.  Let's agree 
        on a maximum of five arguments for f().  

    2.  Change the declaration of a1, a2, ..., a5 each from being a 
        -real matrix- to being nothing, equivalent to -transmorphic-.
        That's better because there is no reason to require that the 
        f()'s extra arguments be real or be matrices.  What they are 
        is f()'s business, not ours.

    3.  Change the declaration of f from -pointer matrix f- to 
        pointer(function) scalar f-.  This is not important, but 
        I want (for documentation and understanding purposes) to emphasize
        that f is a pointer to a function and that f itself is a scalar.

That above is still Matt's code.

Now here's my substantive change.  I have marked changed or new lines 
with an X on the left:

        numeric matrix qnewsolve(pointer(function) scalar f, numeric matrix x, 
                        real scalar crit, real scalar itmax,
X1                      |a1, a2, a3, a4, a5)
        {
X2              transmorphic   p

X3              p = callf_setup(f, args()-4, a1, a2, a3, a4, a5)


                ...
X4              f0 = callf(p, x)
                ...
X5                      grad = (callf(x*J(1,nv,1)+tvec)-f0*J(1,nv,1))/delta
                ...
        }

Here's what I did:

    X1.  Note the | in front of a1.  I made arguments a1, a2, a3, a4, a5
         optional.  The user can specify none, or a1, or a1 and a2, or ...

    X2.  I need an extra variable; I'm calling it p.

    X3.  I need to do a setup.  -callf_setup()- (supplied below) is going 
         to return p.  It's a secret to -qnewsolve()- what -callf_setup()-
         puts in p, but I'll tell you that p is going to contain the address
         of user's function &f(), the number of arguments, and pointers to the
         arguments.

    X4.  Where Matt coded -(*f)(x)-, I now code -callf(p, x)-.
         -callf()- is the second new subroutine I will need to write.

    X5.  Same commend as X4.

Finally, here are my two new subroutines:


        pointer vector callf_setup(pointer(function scalar) f, real scalar n, 
                                   a1, a2, a3, a4, a5)
        {
                pointer vector     p
                real scalar        mycopy

                p = J(1, 5+2, .)

                p[1] = f
                mycopy = n ; p[2] = &mycopy
                p[3] = &a1 ; p[4] = &a2 ; p[5] = &a3 ; p[6] = &a4 ; p[7] = &a5

                return(p)
        }

        transmorphic callf(pointer vector p, real matrix x)
        {
                real scalar        n

                n = *(p[2])
                if (n==0) return( (*p[1])(x) )

                if (n==1) return( (*p[1])(x, *p[3]) )

                if (n==2) return( (*p[1])(x, *p[3], *p[4]) )

                if (n==3) return( (*p[1])(x, *p[3], *p[4], *p[5]) )

                if (n==4) return( (*p[1])(x, *p[3], *p[4], *p[5], *p[6]) 

                return( (*p[1])(x, *p[3], *p[4], *p[5], *p[6], *p[7]) )
        }

-callf()- itself may not seem particularly elegant to you, but it works, and 
I suppose there is a certain elegance in that.  That the code inside is not
elegant is irrelevant.  -callf()- itself, viewed from the outside, is elegant.

Here's how it all works.  -callf_setup()- returns p, a 7-element 
pointer vector containing, 

              *p[1]         the user function f()
              *p[2]         the number of extra arguments
              *p[3]         the 1st extra argument 
              *p[4]         the 2nd extra argument
              ..
              *p[7]         the 5th extra argument.

-callf()- calls f() (i.e., *p[1]) and obtains from the rest of p whatever else
it needs to pass along to f().

There is only one part that is tricky in all of this.  In -callf_setup()-, 
the part of the code that fills in p[2] reads 

                mycopy = n ; p[2] = &mycopy

Why not just -p[2] = &n-?  Because n was passed to -callf_setup()-, and the
caller might later change, n unbeknowst to -callf()-.  Then *p[2] would
change, too.  -callf_setup()- wants to be sure that *p[2] continues to record
the number of arguments.  -callf_setup()-'s solution is to make its own copy
of n, and record the address of that.

That complication was caused because -callf_setup()- made p a pointer vector, 
so everytrhing in p[] had to be a pointer, so it could not directly record 
n, it had to record an address.  A more elegant solution would have made p a
structure.  Why would it be more elegant?  Only becuse it would be easier to
read and understand the code.  In my opinion, the previous solution does not
exceed the threshhold where structures are actually necessary, but I never
object to added clarity.

Here's how the two routines (plus structure definition) could look:

       struct callf_forp {
               pointer(function) scalar    f
               real scalar                 n
               pointer scalar              p1
               pointer scalar              p2
               pointer scalar              p3
               pointer scalar              p4
               pointer scalar              p5
        }

        struct callf_forp scalar callf_setup(
                          pointer(function scalar) f, 
                          real scalar n, a1, a2, a3, a4, a5)
        {
                struct callf_forp scalar    p

                p.f  = f
                p.n  = n
                p.p1 = &a1 ; p.p2 = &a2 ; p.p3 = &a3 ; p.p4 = &a4 ; p.p5 = &a5
        }
 
        transmorphic callf(struct callf_forp p, real matrix x)
        {
                if (p.n==0) return( (*p.f)(x) )

                if (p.n==1) return( (*p.f)(x, *p.p1) )

                if (p.n==2) return( (*p.f)(x, *p.p1, *p.p2) )

                if (p.n==3) return( (*p.f)(x, *p.p1, *p,p3, *p.p4) )

                if (p.n==4) return( (*p.f)(x, *p.p1, *p.p2, *p.p3, *p.p4) 

                return( (*p.f)(x, *p.p1, *p.p2, *p.p3, *p.p4, *p.p5) )
        }

Either version of -callf_setup()- and -callf()- will work equally well.

I want to direct your attention all the way back to the improved (but hardly
changed) -qnewsolve()- and, in particular, the lines marked X2 and X3:

                transmorphic   p

                p = callf_setup(f, args()-4, a1, a2, a3, a4, a5)

What I want you to notice is that, in -qnewsolve()-, I merely declared p to be
-transmorphic-.  

In my first version of -callf_setup()-, p is in fact a pointer vector.  In my
second version, p is a struct callf_forp scalar.

By declaring p to be transmorphic, it does not matter what p in fact is.  Nor
is it any business of the caller (-qnewsolve()-) what p is.  From the caller's
point of view, it is to obtain p from -callf_setup()- and pass p to -callf()-.

That is very, very good style because it completes the separation of
-qnewsolve()- from the solution.  The solution is provided by the -callf()-
system and we, the programmers of that system, are free to change the internal
workings of the system without having to tell, or recompile, any routines that
use it.

-- Bill
wgould@stata.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index