Title | Stata 7: Problems with for and local or global macros |
Author | Nicholas J. Cox, Durham University, UK |
I am trying to use the for command. My command line includes local or global macros, and I am not getting the results I expect. What is the explanation?
Many users find the for command useful when their Stata problem is highly repetitive. For example, for may be used to cycle through a variable list, each time substituting a different variable within a command. See [R] for for more information.
However, for sometimes gives puzzling results if the command line contains one or more local or global macros. In fact, its behavior is completely consistent with Stata's general rules. The puzzlement usually arises because users expect the behavior of some other programming or statistical language, which is typically much closer to that of Stata's while.
Consider this concocted example:
    tokenize a b c d e f g h i j
    for num 1/10: generate str1 SX = "" \ replace SX = string(`X')
which is equivalent to
    tokenize a b c d e f g h i j
    for X in num 1/10: generate str1 SX = "" \ replace SX = string(`X')
After tokenize, the local macros 1 through 10 contain "a" through "j", which we will assume are names of numeric variables.
The main idea of for is to cycle through the arguments supplied in one or more lists, substituting them in turn in the one or more commands that follow. Thus, for the numlist 1/10, which expands to 1 2 3 4 5 6 7 8 9 10, the intention is first to generate a string variable in each case, S1 through S10, and then to replace it with the string equivalent of the variable whose name is in the corresponding local macro, which is what `X' is meant to stand for.
It is this point that can cause confusion. Many users assume that during the execution of for, as X varies from 1 to 10, `X' will become `1' to `10'. However, this is not the way Stata works.
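To make that expectation concrete, the result such users are hoping for is equivalent to typing the following by hand. This is only a sketch of the intended effect, shown for the first two and the last of the ten arguments. It assumes, as above, that a through j are numeric variables.

```stata
* What the user hopes the for line will do
generate str1 S1 = ""
replace S1 = string(a)
generate str1 S2 = ""
replace S2 = string(b)
* and so on, through
generate str1 S10 = ""
replace S10 = string(j)
```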
When Stata sees any name of a local macro in a command line, it first tries to substitute the contents of the macro wherever the name occurs. It does this before it attempts to interpret the command, and, if that is successful and the command is legal syntax, before it attempts to execute the command. In this example, Stata will look for the local macro `X' and attempt to substitute it in the command line, all before it tries to make sense of the for command line as a whole.
At this point, two things could happen. First, it is possible you do in fact have a local macro defined as `X', in which case its contents will be substituted now. Let us suppose you have such a local, currently containing the text "42". If so, the line now becomes
for num 1/10: generate str1 SX = "" \ replace SX = string(42)
which happens to make sense but is unlikely to be what you wanted. In other cases, substituting the contents of `X' for its name may make the for command illegal, particularly if the placeholder for the argument (here, X) no longer appears in the command to be repeated (here generate and what follows).
Second, it is possible you do not have a local macro `X' defined, in which case `X' will be replaced by nothing. This is likely to make the command line illegal, as with
for num 1/10: generate str1 SX = "" \ replace SX = string()
Even if the result is legal, it is unlikely to be what you want.
In short, the names of local macros are substituted by their contents once and once only, before the command line in question is checked. This applies to for just as it applies to any other command, and it rules out most attempts to use local macros as placeholders that will take different values in command lines using for. This is also true of global macros.
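The once-only substitution can be seen directly with display, which shows a line of text after macro names have been replaced by their contents. This is a small sketch for illustration, not part of the original example. The value 42 is the same arbitrary content assumed earlier.

```stata
* Case 1: a local macro named X exists, so its contents are substituted at once
local X 42
display "replace SX = string(`X')"

* Case 2: the local macro X is cleared, so its name is replaced by nothing
local X
display "replace SX = string(`X')"
```

The first display shows replace SX = string(42) and the second shows replace SX = string(). In both cases the substitution has already happened by the time the rest of the line is interpreted, which is exactly what happens to a macro name on a for command line.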
There are various possible alternatives when macros are to be included repetitively in Stata code. Which one is best depends on the problem being tackled and on what version of Stata you are using.
In Stata 6.0, the most general alternative is to use while. The concocted example here could be tackled using while, which may appear in programs, in do-files, and interactively.
    tokenize a b c d e f g h i j
    local i = 1
    while `i' <= 10 {
        generate str1 S`i' = ""
        replace S`i' = string(``i'')
        local i = `i' + 1
    }
In this example, the substitutions of macro contents for macro names take place as each line is interpreted. This is the same behavior as before, but with a while construct it is closer to the behavior many users expect from their experience with other software. The reason is that there are several command lines here, which Stata executes in turn, and the local macro names on each line are replaced by their current contents just before that line is executed. Thus those contents may vary from one pass through the loop to the next. The code is admittedly less concise, and perhaps not so easy to correct interactively, as the original code using for, but it is usually much easier to extend to more complicated sequences of operations.
In Stata 7.0, foreach and forvalues are alternatives to while. For example,
    tokenize a b c d e f g h i j
    forvalues i = 1/10 {
        generate str1 S`i' = ""
        replace S`i' = string(``i'')
    }
is more concise than the previous example, as the local macro i is automatically initialized to 1 and incremented each time around the loop, given the numlist 1/10 supplied as the range. Otherwise, the principle is the same as with while: the local macro names on each line are replaced by their current contents just before that line is executed. Stata 7.0 users may see [P] foreach or [P] forvalues for more information.
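The same task can also be sketched with foreach, which loops over the variable names directly, so tokenize is not needed. The following is one possible version, not the only one. The macro names v and n are chosen here purely for illustration, and a through j are again assumed to be existing numeric variables.

```stata
* Counter used to build the new variable names S1 through S10
local n = 0
foreach v of varlist a b c d e f g h i j {
    local n = `n' + 1
    generate str1 S`n' = ""
    replace S`n' = string(`v')
}
```

Here the local macro v holds each variable name in turn, while n must be incremented explicitly because foreach does not supply a counter. As with while and forvalues, both macro names are replaced by their current contents just before each line is executed.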