Statalist



st: Cutting out the middle macro [was: RE: Using a scalar/macro for loop limit ...]


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: Cutting out the middle macro [was: RE: Using a scalar/macro for loop limit ...]
Date   Tue, 4 Nov 2008 19:14:28 -0000

Thomas Jacobs reported that he was unable to figure out how to use
either a scalar or a local macro for the maximum index of a loop. In
particular, he had tried this for a loop over companies identified by
-Companies-:

summ Companies
scalar Count = r(max)
forvalues i = 1/Count {
	<stuff> 
} 

But that triggered an "invalid syntax" error. 

Maarten Buis and Martin Weiss between them pointed out that the
immediate fix here -- in code that uses a scalar -- is 

forvalues i = 1/`=Count' { 

while if you use a local the code could be 

summ Companies 
local Count = r(max) 
forvalues i = 1/`Count' { 
	<stuff> 
} 

There was then some discussion about their relative merits. In essence,
if the number is really, really big, a scalar is preferable to a local,
but that difference will almost never bite. It seems safe to say that
it won't bite with numbers of companies. 

But for the problem specified my answer to the question "Scalar or
local?" is "Neither". 

Take a step back and consider what you are asking Stata to do. 

1. Run -summarize- and as a wanted side-effect put the maximum in
r(max). 

2. Put the value of r(max) in a local or scalar. 

3. Loop using -forvalues-, picking up the maximum for the loop index
from where it is stored. 

But 2. is unnecessary. You can go direct from 1. to 3. Examples first,
explanation later. You can do this 

summ Companies
forvalues i = 1/`r(max)' {
	<stuff> 
}

or this 

summ Companies
forvalues i = 1/`=r(max)' {
	<stuff> 
}

It is largely a matter of style which you choose, except that StataCorp
could advise on which is a smidgen of a smidgen faster. 

In essence r(max) can be thought of as having a local macro persona
`r(max)'. Alternatively you can invoke the usual way of evaluating an
expression on the fly, i.e. r(max) is an expression (which happens to be
a single term) and `=r(max)' evaluates it on the fly so that -forvalues-
never sees r(max), just its value. 
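
As a minimal sketch of the two forms side by side, here is a do-file
fragment on toy data. The five companies numbered 1 to 5 are an
assumption made purely for illustration; only the variable name
-Companies- comes from the original question.

```stata
* Toy data (assumed for illustration): 5 companies numbered 1 to 5
clear
set obs 5
generate Companies = _n

* Form 1: refer to r(max) through its local macro persona
summarize Companies
forvalues i = 1/`r(max)' {
	display "company `i'"
}

* Form 2: evaluate the expression r(max) on the fly
summarize Companies
forvalues i = 1/`=r(max)' {
	display "company `i'"
}
```

Either way the substitution happens before -forvalues- runs, so the
loop limit is fixed at that point even if something inside the loop
later overwrites r(max).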

This technique has been called "cutting out the middle macro". 

I did say pedantically "for the problem specified". If you need the
value of r(max) for something later in the code, then you need to store
it somehow and a local will be useful. But that's not explicit in this
problem and in my experience it is more common than not that you don't
need the intermediate storage in something else. 
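
For the case where the value is needed again later, a sketch along
these lines may help. The toy data and the macro name ncomp are
assumptions for illustration, and -count- stands in for whatever
r-class command <stuff> might contain.

```stata
* Toy data (assumed for illustration): 5 companies numbered 1 to 5
clear
set obs 5
generate Companies = _n

summarize Companies
* Keep a copy, because the next r-class command replaces the r() results
local ncomp = r(max)

forvalues i = 1/`ncomp' {
	* -count- is r-class, so r(max) no longer exists after this line
	quietly count if Companies == `i'
}

* The stored copy is still available after the loop
display "looped over `ncomp' companies"
```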

A separate twist is that whenever you want only the maximum, it is more
efficient to use -summarize, meanonly-. Despite its not very well chosen
name, the -meanonly- option will calculate the maximum. StataCorp
developers themselves often miss this nuance. 
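
Combined with the direct reference to r(max), that might look like the
following sketch. Again the toy data are an assumption for
illustration.

```stata
* Toy data (assumed for illustration): 5 companies numbered 1 to 5
clear
set obs 5
generate Companies = _n

* -meanonly- shows no output and skips the variance calculation,
* but still leaves r(min) and r(max) behind
summarize Companies, meanonly
forvalues i = 1/`r(max)' {
	display "company `i'"
}
```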

See also 

FAQ     . . . . Making foreach go through all values of a variable
        8/05    Is there a way to tell Stata to try all values of a
                particular variable in a foreach statement without
                specifying them?
                http://www.stata.com/support/faqs/data/foreach.html

where this technique is used in exactly the same context. 

Nick 
[email protected] 



