*Note: This FAQ is for users of Stata 7. It is not relevant for more
recent versions. The following material is based on an exchange that started
on Statalist.*

Title | Stata 7: Making for go through all values of a variable

Author | Nicholas J. Cox, Durham University, UK

Date | August 2001

**for** offers one way of repeating one or more Stata commands; see [R]
**for**. One common pattern is to cycle through all values of a
classifying variable. Thus, with the auto data, we could cycle through all
the values of **foreign** or **rep78**:

. for num 0 1 : whatever if foreign == X

. for num 1/5 : whatever if rep78 == X

The question asks for a way to go through all values without specifying them. In practice, this could be useful if you do not know all of the values at the time you type the command; you are writing code to be used with different sets of values; or you know the values but wish to avoid typing a long list.

This FAQ covers three ways of doing it. The first always works, the second usually works, and the third is brute force but is occasionally defensible.

Users of Stata 7 will find that the ideas here are also pertinent to
applications of **foreach** and **forvalues**. See [P] **foreach**
and [P] **forvalues**.

What always works is this three-step process:

. egen group = group(varname)

This maps the distinct values of *varname* to 1, 2, 3, up to the number
of distinct values.

. su group, meanonly

Among other things, this second step leaves behind in **r(max)** the
number of distinct values; see [R] **summarize** for details on saved
results.

Then we use

. for num 1 / `r(max)' : whatever if group == X

Note that the r-class result **r(max)** is being used immediately after
it was produced by **summarize**. This is recommended because r-class
results are ephemeral: they are overwritten by the next r-class command.

That is three lines, but it has two advantages.

- It works for all kinds of variables (integer, other numeric and string).
- It extends easily to the apparently much more difficult problem of
going through all the distinct combinations of two or more variables,
which is, in fact, only a little more difficult.

. egen group = group(varlist)

. su group, meanonly

. for num 1 / `r(max)' : whatever if group == X

Instead of a single *varname*, you just need to spell out a *varlist* in the argument to **egen**.
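As a minimal sketch of the same three steps with **forvalues** in place of **for**, the following assumes that the auto data are already in memory and uses **foreign** and **rep78** purely for illustration. The local macro name `ngroups` is an arbitrary choice, not part of any command.

```stata
* Assumes the auto data are already in memory.
* Step 1: map each distinct combination of foreign and rep78 to 1, 2, 3, ...
egen group = group(foreign rep78)

* Step 2: leave the number of groups behind in r(max)
summarize group, meanonly

* Copy r(max) into a local macro at once, before another
* r-class command overwrites it.
local ngroups = r(max)

* Step 3: loop over the groups
forvalues i = 1/`ngroups' {
    display "group `i'"
    summarize mpg if group == `i'
}
```

By default, **egen**'s **group()** function assigns missing to observations with missing values on any of the variables, so those observations are skipped by the loop.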

In STB-60 there is a command called **vallist**, which has just one aim
in life--to produce a list of the distinct values in a variable. See the
on-line help on **stb** if you wish to know more about the STB (Stata
Technical Bulletin) or about installing programs from the STB. If, with the
**auto** data, you type

. vallist rep78

Stata displays

1 2 3 4 5

and leaves behind those values in **r(list)**. So, you can type

. vallist varname

. for num `r(list)' : whatever if varname == X

**for** cycles through all of the values fed to it within **r(list)**.
Note again that the r-class result **r(list)** is being used immediately
after it was produced by **vallist**.
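The same idea can be sketched with **foreach**. This assumes that **vallist** from STB-60 is installed, that it leaves its list in **r(list)** as described above, and that the auto data are in memory. The macro names `values` and `v` are arbitrary.

```stata
* Assumes vallist (STB-60) is installed and the auto data are in memory.
vallist rep78

* Copy the list at once, before another r-class command overwrites it.
local values `r(list)'

* Loop over each distinct value in turn
foreach v of local values {
    display "rep78 == `v'"
    summarize mpg if rep78 == `v'
}
```

Copying **r(list)** into a local macro first means that r-class commands inside the loop cannot disturb the list.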

Note also that there are some limitations on this method whenever the values
of *varname* have fractional parts. Suppose it took on values such as
1.1 or 2.1. Then there would be problems with conditions such as **if**
*varname* **== 1.1**, as described in the
FAQ "Why can't I compare two values that I know are equal?" or in [U] **16.10
Precision and problems therein**. Fortunately, precision is rarely an
issue in these circumstances, because whenever people want to do this, it is
usually to cycle through categories defined by integer codes.
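To illustrate the precision problem, here is a small sketch. The variable `x` is hypothetical: it is assumed to be stored as **float** and to contain the value 1.1 in some observations.

```stata
* x is a hypothetical variable, assumed to be stored as float
* and to contain the value 1.1 in some observations.

* The constant 1.1 is evaluated in double precision,
* so this typically counts 0 observations:
count if x == 1.1

* Rounding the constant to float precision makes the comparison match:
count if x == float(1.1)
```

If *varname* is stored as **float**, the condition inside the loop could be written along the same lines as **if** *varname* **== float(X)**.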

Suppose you know that some variable takes on most of the integers between 1 and 20, but not necessarily all of them. (It is, say, the number of children in families.) You can make Stata try all of those values and trap cases in which there is no output.

. for num 1/20 : capture noisily whatever if nchildren == X

Here the **capture** captures any instances in which the command would
fail and crash the **for**. The **noisily** ensures that we still see
output.
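A sketch of the same brute-force idea with **forvalues** follows. The variables `y` and `x` are hypothetical stand-ins for whatever the command needs. **regress** is used only because it exits with an error when no observations satisfy the condition.

```stata
* nchildren is the variable from the text; y and x are hypothetical.
forvalues i = 1/20 {
    * capture stops an error from ending the loop;
    * noisily keeps the output (and any error message) visible.
    capture noisily regress y x if nchildren == `i'

    * _rc holds the return code of the captured command (0 means success).
    if _rc != 0 {
        display "nchildren == `i': skipped, return code " _rc
    }
}
```

Checking **_rc** is optional here. It just makes the skipped values explicit in the log.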

This method is crude, but it is a good practical solution whenever it is quicker for Stata to find out for itself which cases will not work than for you to puzzle out more careful code. It could be a very good method if you knew that there were only a few gaps in a sequence.