Stata 15 help for statsby

[D] statsby -- Collect statistics for a command across a by list


statsby [exp_list] [, options ]: command

options                       Description
-------------------------------------------------------------------------
Main
* by(varlist [, missing])     equivalent to interactive use of by varlist:

Options
  clear                       replace data in memory with results
  saving(filename, ...)       save results to filename; save statistics in
                                double precision; save results to filename
                                every # replications
  total                       include results for the entire dataset
  subsets                     include all combinations of subsets of groups

Reporting
  nodots                      suppress replication dots
  dots(#)                     display dots every # replications
  noisily                     display any output from command
  trace                       trace command
  nolegend                    suppress table legend
  verbose                     display the full table legend

Advanced
  basepop(exp)                restrict initializing sample to exp; seldom
                                used
  force                       do not check for svy commands; seldom used
  forcedrop                   retain only observations in by-groups when
                                calling command; seldom used
-------------------------------------------------------------------------
* by() is required on the dialog box because statsby is useful to the
  interactive user only when using by().
All weight types supported by command are allowed except pweights; see
  weight.


Menu: Statistics > Other > Collect statistics for a command across a by list


statsby collects statistics from command across a by list. Typing

. statsby exp_list, by(varname): command

executes command for each group identified by varname, building a dataset of the associated values from the expressions in exp_list. The resulting dataset replaces the current dataset, unless the saving() option is supplied. varname can refer to a numeric or a string variable.

command defines the statistical command to be executed. Most Stata commands and user-written programs can be used with statsby, as long as they follow standard Stata syntax and allow the if qualifier. The by prefix cannot be part of command.

exp_list specifies the statistics to be collected from the execution of command. The expressions in exp_list follow the grammar given in help exp_list. If no expressions are given, exp_list assumes a default depending upon whether command changes results in e() and r(). If command changes results in e(), the default is _b. If command changes results in r() (but not e()), the default is all the scalars posted to r(). It is an error not to specify an expression in exp_list otherwise.
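As an illustration of the defaults described above, the following sketch runs the same regression twice on auto.dta: once with no exp_list (so _b is collected) and once with explicitly named expressions. The names r2, n, and b_weight are arbitrary choices for this example.

```stata
* Default exp_list: regress posts results to e(), so statsby collects _b.
sysuse auto, clear
statsby, by(foreign) clear: regress mpg weight
* The result should hold foreign, _b_weight, and _b_cons.
describe

* Explicit exp_list: name each statistic to be collected.
sysuse auto, clear
statsby r2=e(r2) n=e(N) b_weight=_b[weight], by(foreign) clear: regress mpg weight
list
```

Naming the expressions keeps the resulting variable names short and predictable, which helps when the collected dataset is merged or graphed later.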


    +------+
----+ Main +-------------------------------------------------------------

by(varlist [, missing]) specifies a list of existing variables that would normally appear in the by varlist: section of the command if you were to issue the command interactively. By default, statsby ignores groups in which one or more of the by() variables is missing. Alternatively, missing causes missing values to be treated like any other values in the by-groups, and results from the entire dataset are included with use of the subsets option. If by() is not specified, command will be run on the entire dataset. varlist can contain both numeric and string variables.
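A short sketch of the missing suboption, using rep78 from auto.dta, which is missing for 5 cars. The variable names mean and n are chosen for this example only.

```stata
* Without missing: cars with missing rep78 are ignored,
* so only the nonmissing levels of rep78 form groups.
sysuse auto, clear
statsby mean=r(mean) n=r(N), by(rep78) clear: summarize mpg
list

* With missing: missing rep78 is treated as a group of its own.
sysuse auto, clear
statsby mean=r(mean) n=r(N), by(rep78, missing) clear: summarize mpg
list
```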

    +---------+
----+ Options +----------------------------------------------------------

clear specifies that it is okay to replace the data in memory, even though the current data have not been saved to disk.

saving(filename [, suboptions]) creates a Stata data file (.dta file) consisting of (for each statistic in exp_list) a variable containing the replicates.

See help prefix_saving_option for details about suboptions.
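The following sketch shows saving() writing the collected statistics to a temporary file while leaving the data in memory untouched. The local macro name results is an arbitrary choice, and the double suboption is used to store the statistics in double precision.

```stata
sysuse auto, clear
tempfile results

* Collect coefficients and standard errors into a file.
statsby _b _se, by(foreign) saving(`results', double): regress mpg weight

* auto.dta is still in memory because saving() was specified.
assert _N == 74

* Load the collected statistics for inspection.
use `results', clear
describe
```

Using saving() rather than clear is convenient when the original data are still needed after the statistics have been collected.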

total specifies that command be run on the entire dataset, in addition to the groups specified in the by() option.

subsets specifies that command be run for each group defined by any combination of the variables in the by() option.
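A sketch contrasting subsets and total on auto.dta. The names mean and n are chosen for this example. In the total run, the extra row for the entire dataset is expected to carry a missing value in foreign.

```stata
* subsets: one row for each combination of foreign and rep78,
* for each variable on its own, and for the entire dataset.
sysuse auto, clear
statsby mean=r(mean) n=r(N), by(foreign rep78) subsets clear: summarize mpg
list

* total: one row per level of foreign, plus one row for all cars.
sysuse auto, clear
statsby mean=r(mean) n=r(N), by(foreign) total clear: summarize mpg
list
```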

    +-----------+
----+ Reporting +--------------------------------------------------------

nodots suppresses display of the replication dots. By default, one dot character is printed for each by-group. A red `x' is printed if command returns with an error or one of the values in exp_list is missing.

dots(#) displays dots every # replications. dots(0) is a synonym for nodots.

noisily causes the output of command to be displayed for each by-group. This option implies the nodots option.

trace causes a trace of the execution of command to be displayed. This option implies the noisily option.

nolegend suppresses the display of the table legend, which identifies the rows of the table with the expressions they represent.

verbose requests that the full table legend be displayed. By default, coefficients and standard errors are not displayed.

    +----------+
----+ Advanced +---------------------------------------------------------

basepop(exp) specifies a base population that statsby uses to evaluate the command and to set up for collecting statistics. The default base population is the entire dataset, or the dataset specified by any if or in conditions specified on the command.

One situation where basepop() is useful is collecting statistics over the panels of a panel dataset by using an estimator that works for time series, but not panel data, for example,

. statsby, by(mypanels) basepop(mypanels==2): arima ...

force suppresses the restriction that command not be a svy command. statsby does not perform subpopulation estimation for survey data, so it should not be used with svy. statsby reports an error when it encounters svy in command if the force option is not specified. This option is seldom used, so use it only if you know what you are doing.

forcedrop forces statsby to drop all observations except those in each by-group before calling command for the group. This allows statsby to work with user-written programs that completely ignore if and in but do not return an error when either is specified. forcedrop is seldom used.
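To illustrate the situation forcedrop is meant for, the sketch below defines a hypothetical program, mymean, that accepts if and in but silently ignores them. Without forcedrop, such a program would be expected to return the same overall mean for every group. With forcedrop, only the observations of the current by-group are in memory when the program runs.

```stata
* Hypothetical user-written program: parses if/in but never applies them.
capture program drop mymean
program define mymean, rclass
    syntax varname [if] [in]
    summarize `varlist'
    return scalar mean = r(mean)
end

sysuse auto, clear
statsby mean=r(mean), by(foreign) forcedrop clear: mymean mpg
list
```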

Example: Collecting coefficients

. sysuse auto
. statsby, by(foreign): regress mpg gear turn

Example: Collecting both coefficients and standard errors using a time-series estimator with panel data

. webuse grunfeld, clear
. tsset company year
. statsby _b _se, basepop(company==1) by(company): arima invest mvalue kstock, ar(1)

Example: Collecting results stored in r()

. sysuse auto, clear
. statsby mean=r(mean) sd=r(sd) size=r(N), by(rep78): summarize mpg

© Copyright 1996–2018 StataCorp LLC