Statalist



Re: st: using the same macro in program & mata


From   "[email protected]" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: using the same macro in program & mata
Date   Tue, 8 Jul 2008 08:39:05 -0700

Maria <[email protected]> writes,

> I am writing a program that calls Mata and then, within Mata, calls a
> Stata command.  Currently, what I have written to do this looks
> something like:
>
>         program name
>           global depvar `1'
>           mata : temp_("$depvar")
>           .....
>
>         mata:
>         void temp_(string scalar dep)
>         {
>                 ....
>                 stata("reg $depvar xvar")
>                 ....
>         }

Maria goes on to say, "The program does what I would like, however, I read in
the Stata Manual ([U]18.3.10) that one should never use global macros [...]"

I agree with that statement.

Solution 1A
-----------

One solution would be,

        program name
          local depvar `1'
          mata: temp_("`depvar'")

        mata:
        void temp_(string scalar dep)
        {
                ...
                stata("reg " + dep + " xvar")
                ...
        }

The important thing to notice in this solution is that the string
scalar dep contains the name of the dependent variable, not the
name of a macro containing the name of the dependent variable.

The line
                stata("reg " + dep + " xvar")

could alternatively be coded

                stata(sprintf("reg %s xvar", dep))

It does not matter which coding you use.
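
For concreteness, here is a minimal sketch of Solution 1A as a complete
do-file.  The names regdep and run_reg, the auto dataset, and the
regressor weight are assumptions made for illustration; they stand in
for the elided parts of the program above.

        capture program drop regdep
        program define regdep
                local depvar `1'
                mata: run_reg("`depvar'")    // Stata expands `depvar' first
        end

        capture mata: mata drop run_reg()
        mata:
        void run_reg(string scalar dep)
        {
                // dep holds the variable name itself, e.g. "mpg"
                stata(sprintf("regress %s weight", dep))
        }
        end

        sysuse auto, clear
        regdep mpg

The two capture lines are there only so the do-file can be rerun in the
same session without "already defined" errors.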



Solution 1B
-----------

In fact, we don't need any macros at all.  We could simplify the above to
read

        program name
          mata: temp_("`1'")

        mata:
        void temp_(string scalar dep)
        {
                ...
                stata("reg " + dep + " xvar")
                ...
        }


Solution 2
----------

Alternatively, we could write our code to pass and receive a macro name,
where that macro contains the name of the dependent variable:

        program name
          local depvar `1'
          mata: temp_("depvar")

        mata:
        void temp_(string scalar dep)
        {
                ...
                stata("reg " + st_local(dep) + " xvar")
                ...
        }

In this solution string scalar dep does not contain, say, "mpg", the name
of the dependent variable.  It contains "depvar", and the local macro
depvar contains "mpg".
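
A minimal sketch of Solution 2 as a complete do-file follows.  As
before, the names regdep2 and run_reg2, the auto dataset, and the
regressor weight are illustrative assumptions only.

        capture program drop regdep2
        program define regdep2
                local depvar `1'
                mata: run_reg2("depvar")     // macro name, not its contents
        end

        capture mata: mata drop run_reg2()
        mata:
        void run_reg2(string scalar macname)
        {
                string scalar dep

                // look up the caller's local macro; dep becomes e.g. "mpg"
                dep = st_local(macname)
                stata("regress " + dep + " weight")
        }
        end

        sysuse auto, clear
        regdep2 mpg

Note that st_local() reads the local macro of the Stata program that
called Mata, so the macro must still exist when run_reg2() executes.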


Discussion
----------

Solution 1, A or B, is better than Solution 2 because it is simpler.

There is a case, however, where Solution 2 might be preferred:
when execution speed is a serious concern.

Let's pretend we do not want to pass a single variable name to our
Mata program, but we want to pass a string containing multiple
variable names, e.g., the string might be "weight foreign".  Except,
let's imagine this is a case where there might be 5,000 variable names
and each might be 32 characters long.  Counting the 4,999 blanks that
separate the names, the total length of our string could be
32*5,000 + 4,999 = 33*5,000-1 = 164,999 characters.  Let's assume local
macro vars contains these 5,000 variable names.

We code,

            mata: mysub("`vars'")

to send the 5,000 names, and we code

        void mysub(string scalar variables)

to receive them.  Now think of what Stata has to do to execute what
we have coded.

     1.  Execute -mata: mysub("`vars'")-

         a.  Expand -mata: mysub("`vars'")-, meaning substitute
             a 164,999-character string for `vars' in the above.

         b.  Interpret the 164,999+a_little_bit long string, understand
             that it means to call Mata.

         c.  Call Mata.

     2.  Mata now sees a 164,999+a_little_bit long string, and Mata
         compiles the string.

         a.  Mata finds a 164,999+a_little_bit long string as
             a string literal, and compiles (copies) the string
             into the program.

         b.  Mata executes the compiled code.  Along the way, there is
             a step that copies the 164,999+a_little_bit long string
             from the program into a string scalar called variables.

We copied the string a lot; three times by my count.  Usually, that
will not matter in terms of execution time, but this string is long.
Well, 164,999+a_little_bit is not that long by the standards of modern
computers.  So let's pretend the string is 1,000,000 characters long.
And let's pretend we are doing this in a loop.  All this copying is a
waste of time.

We could instead code,

            mata: mysub("vars")           // don't expand vars

to send just the name of the macro that holds the 5,000 names, and we
could code

        void mysub(string scalar vars_mac_name)

to receive it and then, where we needed the names, we could code
st_local(vars_mac_name) to actually obtain them.

That would save computer time.

I don't want to exaggerate the savings because, executed once, they
will not be much.  They are so small that you would have a difficult
time measuring them.  But in some instances, such as a simulation
running thousands of times, it would be worth saving the time.
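
If you want to check whether the difference matters in your own case,
a rough timing comparison can be sketched as follows.  The function
names mysub_value and mysub_name, the use of the auto dataset, and the
10,000 repetitions are assumptions made for illustration.  With so
short a varlist the two timings will probably be hard to tell apart;
the point is only to show the mechanics.

        sysuse auto, clear
        unab vars : _all                 // every variable name in the data

        capture mata: mata drop mysub_value()
        capture mata: mata drop mysub_name()
        mata:
        // receives the expanded list itself
        void mysub_value(string scalar variables)
        {
                (void) tokens(variables)
        }

        // receives only the name of the local macro holding the list
        void mysub_name(string scalar vars_mac_name)
        {
                (void) tokens(st_local(vars_mac_name))
        }
        end

        timer clear
        timer on 1
        forvalues i = 1/10000 {
                mata: mysub_value("`vars'")      // list expanded each time
        }
        timer off 1
        timer on 2
        forvalues i = 1/10000 {
                mata: mysub_name("vars")         // only the name is passed
        }
        timer off 2
        timer list

Timer 1 reports the pass-the-contents approach and timer 2 the
pass-the-name approach.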

Our rule at StataCorp for writing code is that, in general, we pass the
thing itself -- let Stata expand it and Mata receive it the simple way.
In the case, however, where you are passing a varlist that might be long
and execution time might be an issue, pass instead the name of the macro
containing the long string and then expand the string yourself using
st_local() where you need it.

My rule, a variation on the StataCorp rule, is to pass macro names in
cases where the varlist might be every variable in the dataset.  In
all other cases, I let Stata do the expansion and keep my Mata code
simple.

I have a second rule that goes with all my rules:  I write simple,
easy-to-read code and later, if I discover I have a performance problem,
I go back and modify.  This rule is based on the experience that I
usually expect more performance problems than I actually turn out to have.

-- Bill
[email protected]
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
