Re: st: (Il)Legal variable/macro names?

From   n j cox
Subject   Re: st: (Il)Legal variable/macro names?
Date   Thu, 25 Oct 2007 18:11:14 +0100

This is mostly for StataCorp, but I'll comment.

I think Mark has been bitten by a bug; the question
is where the bug is.

1. Is it that -tempvar- allows a name that is really
illegal as a macro name?

2. Is it whatever caused the failing statement
not to recognise a legal macro name? (Apparently,
a parser limitation.)

StataCorp will decide which it is. A wild guess is
that it will be much easier to fix -tempvar- and
-tempname- to disallow names like Mark's than
to ensure that names like his work everywhere
they might be used -- on all versions of Stata
on all platforms in all circumstances.

Either way, there is now a small mystery about
exactly which characters are really allowed within names.

I make a distinction:

1. As a Stata user, I want StataCorp to do the maximum
possible to let me use whatever characters I need for
_labelling output_. Typically, I try hard to use correct
spelling, including accents, wherever appropriate in variable
labels, value labels and graph annotation (not to mention the
old question of mathematical symbols and Greek characters).
I trust that is not controversial or disagreeable. I am much
less fussed about characters in (permanent) variable names.
That may well, naturally, be much more important to people
using languages more accented than English.

2. As a Stata programmer, I am happy to accept a very limited
character set A-Z a-z 0-9 _ for macro names. It would be
interesting to hear arguments to the opposite effect in
addition to Mark's want.
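
A minimal sketch of how a programmer might enforce that limited
character set before using a string as a macro name. The program
name -okmacname- is hypothetical (it is not an official or
user-written command I know of), and the 31-character limit is my
understanding of the maximum length of a local macro name. The check
is deliberately stricter than whatever Stata itself may accept.

```stata
* Hypothetical helper, not part of official Stata.
* Sets r(ok) to 1 if the argument uses only A-Z a-z 0-9 _
* and is at most 31 characters long, and to 0 otherwise.
capture program drop okmacname
program define okmacname, rclass
    args name
    * The name being tested is only ever macro *contents* here,
    * never a macro name, so odd characters should do no harm.
    return scalar ok = (regexm(`"`name'"', "^[A-Za-z0-9_]+$")) & (length(`"`name'"') <= 31)
end

* Example use: test two candidate names.
okmacname price2
display r(ok)
okmacname my-name
display r(ok)
```

A calling program could test r(ok) and fall back to a generic name
whenever the check fails, rather than relying on -tempvar- or
-tempname- to complain.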


Schaffer, Mark E wrote:

I've just been bitten by an odd inconsistency between what constitutes a legal name for a variable and a legal name for a macro. 8-bit ASCII characters are apparently legal in variable names, but when used in a macro name, no macro is created.

Here's an example using the auto dataset. The first part shows that the variable name uŻ is legal. The second part shows that when I try to use -tempvar- to create a macro called "uŻ", nothing is created - when Stata gets to the next line, macro substitution means `uŻ' becomes ... nothing.

. do "C:\DOCUME~1\MARKSC~1\LOCALS~1\Temp\STD0l000000.tmp"

. sysuse auto, clear
(1978 Automobile Data)

. capture program drop legalnames

. program define legalnames
1. gen uŻ = mpg
2. sum uŻ
3. tempvar uŻ
4. gen `uŻ' = mpg
5. sum `uŻ'
6. end

. set trace on

. legalnames
----------------------------------------------- begin legalnames ---
- gen uŻ = mpg
- sum uŻ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          uŻ |        74     21.2973    5.785503         12         41
- tempvar uŻ
- gen `uŻ' = mpg
= gen = mpg
too few variables specified
------------------------------------------------- end legalnames ---

end of do-file

I can't find anything about this in the manuals, but the behavior of -tempvar- does look bug-like - if an illegal macro name is used, shouldn't -tempvar- complain?

In programs I sometimes generate macro names based on variable names, so if the naming rules are actually different for variable names and macro names, this is not a good strategy.
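
One possible way around this, sketched here under assumptions rather
than offered as a tested fix: number the macros with a counter instead
of deriving their names from the variable names. The variable names
then appear only as macro contents, never as macro names. The names
t1, t2, ... and the choice of mpg and weight are purely illustrative.

```stata
* Sketch only: macro names are built from a counter (t1, t2, ...),
* so they always stay within A-Z a-z 0-9 _ whatever the variables
* happen to be called.
sysuse auto, clear

local i = 0
foreach v of varlist mpg weight {
    local ++i
    * -tempvar- is given the plain name t1, t2, ...
    tempvar t`i'
    * `t`i'' expands `i' first, then the macro t1, t2, ...
    generate `t`i'' = `v'
    summarize `t`i''
}
```

The cost is that the code has to track which counter value belongs to
which variable, for example by looping over the same varlist in the
same order each time.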

