The Stata listserver

st: RE: RE: How are values substituted in a -forval- statement?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: How are values substituted in a -forval- statement?
Date   Fri, 29 Aug 2003 11:28:40 +0100

Deborah Garvey
 
>> I'm running Intercooled Stata 7.0 for Windows 98 (born 11 June 2002)
>> and in the process of making a mistake, discovered that I don't understand
>> something fundamental about how Stata evaluates the value list in a
>> -forval- statement.
>> 
>> I wanted to associate previous year's personal income with current year
>> spending in an 11-year panel dataset (shaped wide), so I wrote the
>> following -forval- statement:
>>   
>>  forval i=75/85 {
>>   gen double peryt`i+1' = pery`i';
>>    };
>> 
>> The code assigned peryt* = pery* and did not increment peryt by one year
>> (see example below).
>> 
>> Question:  what did peryt`i+1' mean to Stata?
>> 
>>        state      Arizona      pery75     11908000
>>       pery76     13370000      pery77     14871000      pery78     17586000
>>       pery79     20973000      pery80     24057000      pery81     27559000
>>       pery82     29068000      pery83     31916000      pery84     36900000
>>       pery85     40900000      pery86     44900000     peryt75     11908000
>>      peryt76     13370000     peryt77     14871000     peryt78     17586000
>>      peryt79     20973000     peryt80     24057000     peryt81     27559000
>>      peryt82     29068000     peryt83     31916000     peryt84     36900000
>>      peryt85     40900000
>> 

Steven Stillman
 
> I'll let someone else directly answer your question on your current code.
> My suggestion is that this is much easier to do if you reshape your data
> into long form so it looks like:
> 
> state 	year  	pery
> AZ	75	#
> AZ	76	#
> etc.
> CO	75	#
> CO	76 	#
> etc.
> 
> You can then easily do what you want to do:  (1) tsset state year,
> (2) gen peryt = l.pery
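
A minimal sketch of the long-form route suggested above, using only the
first three Arizona values from the listing in the question. The numeric
identifier stateid is a hypothetical name introduced here, on the
assumption that -tsset- wants a numeric panel variable rather than a
string one.

```stata
* Toy wide dataset: one state, three years, values copied from the question
clear
input str7 state pery75 pery76 pery77
"Arizona" 11908000 13370000 14871000
end

* Wide to long: one observation per state-year, year taken from the suffix
reshape long pery, i(state) j(year)

* stateid is a hypothetical numeric panel identifier built from state
encode state, generate(stateid)
tsset stateid year

* Previous year's income through the lag operator
generate double peryt = L.pery
list state year pery peryt
```

The first year in each panel has no predecessor, so peryt should be
missing there.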

I agree with Steven. Nevertheless Deborah's question on 
-forvalues- remains. 

It is really a question on local macros, I think, 
and quite independent of -forvalues-. 

This is my take, subject as usual to correction
and clarification from Stata Corp technical staff. 

Short answer
============

`i+1' is not legal syntax, either in Stata 7 or in 
Stata 8. 

In fact, and perhaps surprisingly, it gets treated exactly as 
if it were `i'. 

What you want could have been obtained by `=`i' + 1'. 
That is legal but undocumented in Stata 7 and legal 
and documented in Stata 8.

Long answer
===========

When you invoke -forvalues- the local macro you name -- 
here i -- springs into existence and is initialised 
at 75. As the -forvalues- loop proceeds, it automatically 
is varied in the way you specify: in your case the range 
75/85 causes a loop over the integers 75 to 85, as you 
know. 
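
A small sketch of that behaviour, with nothing assumed beyond the range
in the question: the loop below only displays the value held by the
local macro i on each pass.

```stata
* i is created by -forvalues- and stepped through 75, 76, ..., 85
forvalues i = 75/85 {
    display "i is now `i'"
}
```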

Now the crunch: when `i' is 75, you want to refer to 76, 
when `i' is 76 you want to refer to 77, and so on, and 
you have guessed that the syntax for that is `i+1'. 

It isn't, and in one sense this _should_ be illegal 
syntax; illegal, because a local macro name, like any name 
([U] 14.3) cannot contain anything other than a letter, digit 
or underscore, and the plus sign disqualifies the name i+1. 

If you try to define 

local i+1 3 

you will find that you can't define a macro 
called i+1. That's because of the same name rule. 
What happens in this case is a consequence of the fact 
that Stata is working at a lower level, reading your code 
character by character and trying to make sense of it. 

The whole issue needs, in fact, to be understood 
at a microscale, thinking character by character, 
and not at a larger, hmm, macroscale, thinking of 
your code in terms of what it means to you. 

Let's take this last example first, and try to 
think of it closer to the way Stata sees it: 

Stata starts out expecting a command name, which 
must be followed by at least one space, and 
the string "local " certainly qualifies. 

i 

is then perfectly acceptable as the start of 
a local macro name 

+ 

is not acceptable, so Stata reasons: the local 
macro name must have ended; it is just "i"; 
what follows can only be the contents of the macro. 

Now, in that context, Stata works with various rules, 
including evaluation of expressions like +1 and ignoring 
spaces which are not explicitly given within 
" ". The consequence, which may surprise you, is that 

local i+1 3

is equivalent to 

local i +13

or 

local i 13 

i.e. we end up with a local macro containing "13". 
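
One way to check this for yourself is the sketch below. It assumes
nothing beyond the commands already shown; the result described above
(a macro i holding 13) is what I would expect in Stata 7 and 8, but it
is worth confirming in whatever version you are running.

```stata
* Try to define a macro "called" i+1, then inspect what i actually holds
local i+1 3
display "`i'"
```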

Two principles to carry forward are these: 

1. Stata looks at your code character by character. 

2. At any point it has (a range of) expectations: 
depending on those expectations certain characters may 
even get ignored. 

In fact, you're very used to part of 2. already. 
You know it makes no difference whether you say 

di 2 + 2 
di 2+2 
di 2+ 2 

etc. -- and most of the time that is a real feature --
although you might be surprised by the results of 

di 2 + 2 3 

If you are surprised, it is because of your assigning meanings 
to that expression, which Stata knows nothing about, 
either in that context or indeed in any other. 
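
As a sketch of what is going on: -display- accepts a list of items, so
I would expect the command below to be read as two items, the expression
2 + 2 followed by the number 3, shown one after the other with nothing
in between.

```stata
* Two items for -display-, not one piece of arithmetic
display 2 + 2 3

* Compare with an explicit comma-separated list
display 2 + 2, 3
```

The comma in the second command asks -display- to put a space between
the two items, which makes the division into items easier to see.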

Stata knows rather little about what its own expressions
mean to itself, and nothing whatsoever about what they 
mean to you. It would get an E on semantics, but an A on 
its own syntax. 

Back now to `i+1'. In the line

gen double peryt`i+1' = pery`i' 

Stata sees, grouping characters, and skipping 
over a lot of detailed reasoning, 

"gen "        OK -- command 

"double "     OK -- variable type makes sense given -gen- 

"peryt"       OK -- start of new variable name 

`             OK -- start of local macro name 

i             OK -- legal character within macro name 

+             not OK -- illegal character within macro name, 
              but I'm going to ignore it; 
              the legal part of name has ended, so I am just
              going to wait until I see the matching '

'             OK -- end of local macro name

What has Stata got? `i' after ignoring "+1", and that
Stata knows about. 
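
A quick way to see this outside the loop is sketched below. Setting i by
hand to 75 is an assumption made purely for illustration. If the account
above is right, the first two lines should show the same name, and only
the third should move on a year. Behaviour in later versions of Stata may
differ.

```stata
* Set i by hand, as if we were in the first pass of the loop
local i 75

display "peryt`i+1'"        // the form used in the question
display "peryt`i'"          // plain macro reference
display "peryt`=`i' + 1'"   // expression evaluated on the fly
```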

At this point a very fair question would be _why_ does 
Stata ignore such characters within macro names? 
Why doesn't it squawk? 

I don't know, and there are at least two plausible explanations:

1. This is really a bug, or at least an unintended side-effect
of the way parsing is implemented. 

2. This is really a feature. In other examples, it can be 
seen that given 

local lue 42 

di `lue' 
di `lue '
di `lue  '

all produce the same result, so trailing spaces are
ignored, whereas the same isn't true of leading spaces. 
Perhaps somebody from Stata Corp can say more.

For what you want, there is a way to do it on the fly: 

gen double peryt`=`i' + 1' = pery`i' 

The `= together signal that an expression follows, 
to be evaluated. This syntax was in fact introduced in Stata 7, 
but not, I believe, documented at the time. 

Another way to do it would be 

local j = `i' + 1
gen double peryt`j' = pery`i' 
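
Put back into a loop, a minimal sketch looks like this. It uses a toy
dataset with only the first three Arizona values from the question, and
it assumes the default end-of-line delimiter, so there are no
semicolons.

```stata
* Toy wide dataset, values copied from the listing in the question
clear
input str7 state pery75 pery76 pery77
"Arizona" 11908000 13370000 14871000
end

* peryt76 = pery75, peryt77 = pery76, peryt78 = pery77
forvalues i = 75/77 {
    generate double peryt`=`i' + 1' = pery`i'
}
list
```

The two-line form with a separate local macro j can replace the body of
the loop and should give the same variables.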

There is more on local macros and on stuff like `=`i' + 1' 
in Stata Journal 2(2), 202-222 (2002) and 3(2), 185-202 (2003). 

Nick 
n.j.cox@durham.ac.uk 