help by, help bysort
-------------------------------------------------------------------------------
Title
[D] by -- Repeat Stata command on subsets of the data
[U] 11 Language syntax
Syntax
by varlist: stata_cmd
bysort varlist: stata_cmd
The above diagrams show by and bysort as they are typically used. The
full syntax of the commands is
by varlist1 [(varlist2)] [, sort rc0]: stata_cmd
bysort varlist1 [(varlist2)] [, rc0]: stata_cmd
Description
Most Stata commands allow the by prefix, which repeats the command for
each group of observations for which the values of the variables in
varlist are the same. by without the sort option requires that the data
be sorted by varlist; see [D] sort.
Stata commands that work with the by prefix indicate this immediately
following their syntax diagram by reporting, for example, "by is allowed;
see [D] by" or "bootstrap, by, etc., are allowed; see prefix".
by and bysort are really the same command; bysort is just by with the
sort option.
The varlist1 (varlist2) syntax is of special use to programmers. It
verifies that the data are sorted by varlist1 varlist2 and then performs
a by as if only varlist1 were specified. For instance,
by pid (time): gen growth = (bp - bp[_n-1])/bp
performs the generate by values of pid but first verifies that the data
are sorted by pid and time within pid.
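As a minimal sketch of this syntax, the do-file fragment below builds a
small hypothetical panel (the values of pid, time, and bp are invented for
illustration and are not from any shipped dataset). It uses bysort so that
the data are sorted by pid and time before generate runs within each pid.

```stata
* Hypothetical data: two patients, three visits each, entered out of order
clear
input pid time bp
1 2 120
1 1 110
2 3 135
1 3 125
2 1 140
2 2 130
end

* Sort by pid and time, then run generate separately within each pid
bysort pid (time): generate growth = (bp - bp[_n-1])/bp

* Display the result with a separator line between by-groups
list, sepby(pid)
```

Within each by-group, _n restarts at 1, so bp[_n-1] is missing for the
first observation of every pid and growth is missing there as well.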
Options
sort specifies that if the data are not already sorted by varlist, by
should sort them.
rc0 specifies that by is to continue running stata_cmd on the remaining
by-groups even if stata_cmd produces an error in one of the by-groups.
The default action is to stop when an error occurs. rc0 is especially
useful when stata_cmd is an estimation command and some by-groups have
insufficient observations.
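The effect of rc0 can be sketched as follows. The replace step is an
assumption made purely for illustration: it blanks out price for domestic
cars so that regress has no usable observations in that by-group and
should fail there.

```stata
* Illustration only: remove all usable observations from one by-group
sysuse auto, clear
replace price = . if foreign == 0

* Without rc0, by would stop at the first by-group that produces an error.
* With rc0, by reports the error and continues with the remaining groups.
by foreign, sort rc0: regress price mpg
```

Running the last line without rc0 is a simple way to compare the two
behaviors: by should then stop at the failing group instead of going on
to the next one.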
Examples
---------------------------------------------------------------------------
Setup
. sysuse auto
For each category of foreign, display summary statistics for rep78
. by foreign: summarize rep78
Same as above command, but check that the data are sorted by foreign and
make within foreign
. by foreign (make): summarize rep78
not sorted
r(5);
. sort foreign make
. by foreign (make): summarize rep78
For each category of rep78, display frequency counts of foreign
. by rep78: tabulate foreign
not sorted
r(5);
. sort rep78
. by rep78: tabulate foreign
Equivalent to above two commands
. by rep78, sort: tabulate foreign
Equivalent to above command
. bysort rep78: tabulate foreign
For each category of rep78 within categories of foreign, display summary
statistics for price
. by foreign rep78, sort: summarize price
---------------------------------------------------------------------------
Setup
. sysuse autornd
. keep in 1/20
Store in new variable mean_w the mean value of weight for each category
of mpg
. by mpg, sort: egen mean_w = mean(weight)
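One way to check the result, shown here as a sketch rather than as part of
the original example, is to confirm that mean_w is constant within each
category of mpg and then list the data by group. It repeats the setup so
that it can be run on its own.

```stata
* Same setup as the example above
sysuse autornd, clear
keep in 1/20
by mpg, sort: egen mean_w = mean(weight)

* Within each mpg group, every observation should carry the same mean
by mpg, sort: assert mean_w == mean_w[1]

* Show the result with a separator line between by-groups
list mpg weight mean_w, sepby(mpg)
```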
---------------------------------------------------------------------------
Also see
Manual: [D] by
Help: [P] byable, [P] foreach, [P] forvalues, [D] sort, [D] statsby,
[P] while