Stata 15 help for bysort

[D] by -- Repeat Stata command on subsets of the data


by varlist: stata_cmd

bysort varlist: stata_cmd

The above diagrams show by and bysort as they are typically used. The full syntax of the commands is

by varlist1 [(varlist2)] [, sort rc0]: stata_cmd

bysort varlist1 [(varlist2)] [, rc0]: stata_cmd


Most Stata commands allow the by prefix, which repeats the command for each group of observations for which the values of the variables in varlist are the same. by without the sort option requires that the data be sorted by varlist; see [D] sort.

Stata commands that work with the by prefix indicate this immediately following their syntax diagram by reporting, for example, "by is allowed; see [D] by" or "bootstrap, by, etc., are allowed; see prefix".

by and bysort are really the same command; bysort is just by with the sort option.

The varlist1 (varlist2) syntax is of special use to programmers. It verifies that the data are sorted by varlist1 varlist2 and then performs a by as if only varlist1 were specified. For instance,

by pid (time): generate growth = (bp - bp[_n-1])/bp

performs the generate by values of pid but first verifies that the data are sorted by pid and time within pid.


sort specifies that if the data are not already sorted by varlist, by should sort them.

rc0 specifies that even if the stata_cmd produces an error in one of the by-groups, then by is still to run the stata_cmd on the remaining by-groups. The default action is to stop when an error occurs. rc0 is especially useful when stata_cmd is an estimation command and some by-groups have insufficient observations.


--------------------------------------------------------------------------- Setup . sysuse auto

For each category of foreign, display summary statistics for rep78 . by foreign: summarize rep78

Same as above command, but check that the data are sorted by foreign and make within foreign . by foreign (make): summarize rep78 not sorted r(5); . sort foreign make . by foreign (make): summarize rep78

For each category of rep78, display frequency counts of foreign . by rep78: tabulate foreign not sorted r(5); . sort rep78 . by rep78: tabulate foreign

Equivalent to above two commands . by rep78, sort: tabulate foreign

Equivalent to above command . bysort rep78: tabulate foreign

For each category of rep78 within categories of foreign, display summary statistics for price . by foreign rep78, sort: summarize price

--------------------------------------------------------------------------- Setup . sysuse autornd . keep in 1/20

Store in new variable mean_w the mean value of weight for each category of mpg . by mpg, sort: egen mean_w = mean(weight) ---------------------------------------------------------------------------

Technical note

by repeats the stata_cmd for each group defined by varlist. If stata_cmd stores results, only the results from the last group on which stata_cmd executes will be stored.

© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index