Re: st: using the same macro in program & mata

From: mfitzpat@stanford.edu
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: using the same macro in program & mata
Date: Wed, 09 Jul 2008 17:14:02 -0700

Bill,

Thank you very much for your detailed response. I used 1A and it worked perfectly (though I also tried 1B and it worked, too). Solution 2 was more complicated than I needed just now, but I will keep it in my back pocket for the future.

Maria

Quoting "William Gould, StataCorp LP" <wgould@stata.com>:

```Maria <mfitzpat@stanford.edu> writes,

```
```I am writing a program that calls Mata and then, within Mata, calls a
Stata command.  Currently, what I have written to do this looks
something like:

program name
global depvar `1'
mata : temp_("\$depvar")
.....

mata:
void temp_(string scalar dep)
{
....
stata("reg \$depvar xvar")
....
}
```
```Maria goes on to say, "The program does what I would like, however, I read in
the Stata Manual ([U]18.3.10) that one should never use global macros [...]"

I agree with that statement.

Solution 1A
-----------

One solution would be,

program name
local depvar `1'
mata: temp_("`depvar'")

mata:
void temp_(string scalar dep)
{
...
stata("reg " + dep + " xvar")
...
}

The important thing to notice in this solution is that the string
scalar dep contains the name of the dependent variable, not the
name of a macro containing the name of the dependent variable.

The line
stata("reg " + dep + " xvar")

could alternatively be coded

stata(sprintf("reg %s xvar", dep))

It does not matter which coding you use.

Solution 1B
-----------

In fact, we don't need any macros at all.  We could simplify the above to

program name
mata: temp_("`1'")

mata:
void temp_(string scalar dep)
{
...
stata("reg " + dep + " xvar")
...
}

Solution 2
----------

Alternatively, we could write our code to pass and receive the name of a
macro, where that macro contains the name of the dependent variable:

program name
local depvar `1'
mata: temp_("depvar")

mata:
void temp_(string scalar dep)
{
...
stata("reg " + st_local(dep) + " xvar")
...
}

In this solution string scalar dep does not contain, say, "mpg", the name
of the dependent variable.  It contains "depvar", and the local macro
depvar contains "mpg".

Discussion
----------

Solution 1, A or B, is better than Solution 2 because it is
simpler.

There is a case, however, where Solution 2 might be preferred:
when execution speed is of great concern.

Let's pretend we do not want to pass a single variable name to our
Mata program, but we want to pass a string containing multiple
variable names, e.g., the string might be "weight foreign".  Except,
let's imagine this is a case where there might be 5,000 variable names
and each might be 32 characters long, so the total length of our string
could be 33*5,000-1 = 164,999 characters.  Let's assume local macro vars
contains these 5,000 variable names.

We code,

mata: mysub("`vars'")

to send the 5,000 names, and we code

void mysub(string scalar variables)

to receive them.  Now think of what Stata has to do to execute what
we have coded.

1.  Execute -mata: mysub("`vars'")-

a.  Expand -mata: mysub("`vars'")-, meaning substitute
a 164,999 character string for `vars' in the above.

b.  Interpret the 164,999+a_little_bit long string, understand
that it means to call Mata.

c.  Call Mata.

2.  Mata now sees a 164,999+a_little_bit long string, and Mata
compiles the string.

a.  Mata finds a 164,999+a_little_bit long string as
a string literal, and compiles (copies) the string
into the program.

b.  Mata executes the compiled code.  Along the way, there is
a step that copies the 164,999+a_little_bit long string
from the program into a string scalar called variables.

We copied the string a lot; three times by my count.  Usually, that
will not matter in terms of execution time, but this string is long.
Well, 164,999+a_little_bit is not that long by the standards of modern
computers.  So let's pretend the string is 1,000,000 characters long.
And let's pretend we are doing this in a loop.  All this copying is a
waste of time.  Instead, we could code

mata: mysub("vars")           // don't expand vars

to send the 5,000 names, and we could code

void mysub(string scalar vars_mac_name)

to receive them and then, where we needed the names, we could code
st_local(vars_mac_name) to actually obtain them.

That would save computer time.

I don't want to exaggerate the savings because, executed once, it
will not be much.  It is so little that you would have a difficult
time measuring it.  But in some instances such as a simulation running
thousands of times, it would be worth saving the time.

Our rule at StataCorp for writing code is that, in general, we pass the
thing itself -- let Stata expand it and Mata receive it the simple way.
In the case, however, where you are passing a varlist and execution time
might be an issue and the varlist might be long, pass instead the name
of the macro containing the long string and then expand the string yourself
using st_local() where you need it.

My rule, a variation on the StataCorp rule, is to pass macro names in
cases where the varlist might be every variable in the dataset.  In
all other cases, I let Stata do the expansion and keep my Mata code
simple.

I have a second rule that goes with all my rules:  I write simple,
easy-to-read code and later, if I discover I have a performance problem,
I go back and modify.  This rule is based on the experience that I
usually expect more performance problems than I actually turn out to have.

-- Bill
wgould@stata.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
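For anyone who wants to try the two approaches side by side, here is a minimal do-file sketch. It is not from Bill's message: the names byvalue, byname, fit_value_() and fit_name_() are made up for illustration, and it assumes the auto dataset that ships with Stata, with weight standing in for xvar.

```stata
* Sketch only: all program and function names below are hypothetical.
clear all
sysuse auto, clear

mata:
// Solution 1 style: dep holds the variable name itself, e.g. "mpg"
void fit_value_(string scalar dep)
{
    stata("regress " + dep + " weight")
}

// Solution 2 style: macname holds the NAME of a local macro, e.g. "depvar";
// st_local() looks up that macro's contents in the calling program
void fit_name_(string scalar macname)
{
    stata("regress " + st_local(macname) + " weight")
}
end

program define byvalue
    args depvar
    // Stata expands `depvar' before Mata sees the line
    mata: fit_value_("`depvar'")
end

program define byname
    args depvar
    // no expansion here: only the macro's name crosses into Mata
    mata: fit_name_("depvar")
end

byvalue mpg
byname mpg
```

Both calls should run the same regression. The only difference is what crosses into Mata: the contents of the macro in the first case, and its name in the second.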