**[D] by** -- Repeat Stata command on subsets of the data

__Syntax__

**by** *varlist***:** *stata_cmd*

__bys__**ort** *varlist***:** *stata_cmd*

The above diagrams show **by** and **bysort** as they are typically used. The
full syntax of the commands is

**by** *varlist1* [**(***varlist2***)**] [**,** __s__**ort** **rc0**]**:** *stata_cmd*

__bys__**ort** *varlist1* [**(***varlist2***)**] [**,** **rc0**]**:** *stata_cmd*

__Description__

Most Stata commands allow the **by** prefix, which repeats the command for
each group of observations for which the values of the variables in
*varlist* are the same. **by** without the **sort** option requires that the data
be sorted by *varlist*; see **[D] sort**.

Stata commands that work with the **by** prefix indicate this immediately
following their syntax diagram by reporting, for example, "**by** is allowed;
see **[D] by**" or "**bootstrap**, **by**, etc., are allowed; see prefix".

**by** and **bysort** are really the same command; **bysort** is just **by** with the
**sort** option.

The *varlist1* **(***varlist2***)** syntax is of special use to programmers. It
verifies that the data are sorted by *varlist1 varlist2* and then performs
a **by** as if only *varlist1* were specified. For instance,

**by pid (time): generate growth = (bp - bp[_n-1])/bp**

performs the **generate** by values of **pid** but first verifies that the data
are sorted by **pid** and **time** within **pid**.
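
The same pattern can be tried on a dataset that ships with Stata. The following is a minimal sketch, assuming the **auto** data; the variable names **rank** and **diff** are hypothetical and chosen only for illustration. It relies on the fact that, under **by**, **_n** and explicit subscripts are evaluated within each by-group.

```stata
* Sketch using the auto dataset shipped with Stata; the variable names
* rank and diff are hypothetical, chosen for illustration.
sysuse auto, clear
sort foreign price

* _n counts observations within each by-group, so rank restarts at 1
* for every value of foreign
by foreign (price): generate rank = _n

* price[_n-1] refers to the previous observation in the same group;
* for the first observation of each group it is missing, so diff is too
by foreign (price): generate diff = price - price[_n-1]

* Show the first two observations of each group
list foreign price rank diff if rank <= 2, sepby(foreign)
```

For the same reason, **growth** in the example above is missing for the first observation of each **pid**.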

__Options__

**sort** specifies that if the data are not already sorted by *varlist*, **by**
should sort them.

**rc0** specifies that **by** should continue to run the *stata_cmd* on the
remaining by-groups even if *stata_cmd* produces an error in one of the
by-groups. The default action is to stop when an error occurs. **rc0**
is especially useful when *stata_cmd* is an estimation command and some
by-groups have insufficient observations.
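
A contrived sketch of **rc0**, assuming the **auto** data: **price** is set to missing in one category of **rep78** so that **regress** has no usable observations in that by-group. The modification of the data is an assumption made purely to provoke an error and is not part of the original examples.

```stata
* Contrived sketch: blank out price in one group so that regress
* has no observations to work with in that by-group
sysuse auto, clear
replace price = . if rep78 == 1

* Without rc0, by would stop at the failing group (rep78 == 1, the
* first group after sorting); with rc0, by goes on to run regress
* on the remaining by-groups
bysort rep78, rc0: regress price weight
```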

__Examples__

---------------------------------------------------------------------------
Setup
**. sysuse auto**

For each category of **foreign**, display summary statistics for **rep78**
**. by foreign: summarize rep78**

Same as above command, but check that the data are sorted by **foreign** and
**make** within **foreign**
**. by foreign (make): summarize rep78**
not sorted
r(5);
**. sort foreign make**
**. by foreign (make): summarize rep78**

For each category of **rep78**, display frequency counts of **foreign**
**. by rep78: tabulate foreign**
not sorted
r(5);
**. sort rep78**
**. by rep78: tabulate foreign**

Equivalent to above two commands
**. by rep78, sort: tabulate foreign**

Equivalent to above command
**. bysort rep78: tabulate foreign**

For each category of **rep78** within categories of **foreign**, display summary
statistics for **price**
**. by foreign rep78, sort: summarize price**

---------------------------------------------------------------------------
Setup
**. sysuse autornd**
**. keep in 1/20**

Store in new variable **mean_w** the mean value of **weight** for each category
of **mpg**
**. by mpg, sort: egen mean_w = mean(weight)**
---------------------------------------------------------------------------

__Technical note__

**by** repeats the *stata_cmd* for each group defined by *varlist*. If *stata_cmd*
stores results, only the results from the last group on which *stata_cmd*
executes will be stored.
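
A minimal sketch of this behavior, assuming the **auto** data, which is already sorted by **foreign**. The loop over **levelsof** is only one possible way to reach each group's results and is shown as an illustration, not as the recommended method; the local macro names **groups** and **g** are hypothetical.

```stata
sysuse auto, clear

* summarize is run once per category of foreign
by foreign: summarize price

* r() now holds the results for the last group only (foreign == 1)
display r(N)

* One way to work with each group's results: run the command group by
* group and use r() before the next run overwrites it
levelsof foreign, local(groups)
foreach g of local groups {
    quietly summarize price if foreign == `g'
    display "foreign = `g': mean price = " r(mean)
}
```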