## Stata 15 help for sample

```
[D] sample -- Draw random sample

Syntax

sample # [if] [in] [, count by(groupvars)]

by is allowed; see [D] by.

Menu

Statistics > Resampling > Draw random sample

Description

sample draws random samples of the data in memory.  "Sampling" here is
defined as drawing observations without replacement; see [R] bsample for
sampling with replacement.

The size of the sample to be drawn can be specified as a percentage or as
a count:

sample without the count option draws a #% pseudorandom sample of the
data in memory, thus discarding (100 - #)% of the observations.

sample with the count option draws a #-observation pseudorandom
sample of the data in memory, thus discarding _N - # observations.  #
can be larger than _N, in which case all observations are kept.

In either case, observations not meeting the optional if and in criteria
are kept (sampled at 100%).

To reproduce results, set the random-number seed before running sample;
see [R] set seed.

Options

count specifies that # in sample # be interpreted as an observation count
rather than as a percentage.  Typing sample 5 without the count
option draws a 5% sample; typing sample 5, count, however, draws a
sample of 5 observations.

Specifying # as greater than the number of observations in the
dataset is not considered an error.

by(groupvars) specifies that a #% sample be drawn within each set of
values of groupvars, thus maintaining the proportion of each group.

count may be combined with by(). For example, typing
sample 50, count by(sex) would draw a sample of size 50 for men and
50 for women.

Specifying by varlist: sample # is equivalent to specifying
sample #, by(varlist); use whichever syntax you prefer.

Examples

---------------------------------------------------------------------------
Setup
. webuse nlswork

Describe the data
. describe, short

Draw a 10% sample
. sample 10

Describe the resulting data
. describe, short

---------------------------------------------------------------------------
Setup
. webuse nlswork, clear

Create a one-way table of frequency counts
. tab race

Keep 100% of women with race != 1, but only 10% of women with race == 1
. sample 10 if race == 1

---------------------------------------------------------------------------
Setup
. webuse nlswork, clear

Keep 10% of each of the three categories of race
. sample 10, by(race)

---------------------------------------------------------------------------
Setup
. webuse nlswork, clear

Draw a sample of 2,500
. sample 2500, count

Describe the resulting data
. describe, short
---------------------------------------------------------------------------

```
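
The help text states several behaviors that are easy to check directly: draws repeat when the seed is set, `count` combines with `by()`, observations excluded by `if` are all kept, and a count larger than `_N` is not an error. The do-file below is a minimal sketch of those checks, not part of the official help. It assumes the `auto` dataset that ships with Stata (74 cars, 22 of them foreign) so that it runs without network access, and the seed value 12345 is arbitrary.

```stata
* Sketch only: auto.dta and the seed value 12345 are illustrative assumptions.
sysuse auto, clear
assert _N == 74

* Same seed, same data, same command: the two draws should match.
set seed 12345
sample 20, count
sort make
tempfile first
save `first'

sysuse auto, clear
set seed 12345
sample 20, count
sort make
* cf exits with an error if any variable differs from the saved draw.
cf _all using `first'

* count with by(): 5 observations from each level of foreign.
sysuse auto, clear
set seed 12345
sample 5, count by(foreign)
assert _N == 10

* if restricts the sampling; the 22 foreign cars are all kept.
sysuse auto, clear
set seed 12345
sample 50 if foreign == 0
count if foreign == 1
assert r(N) == 22

* A count larger than _N keeps every observation and is not an error.
sysuse auto, clear
sample 1000, count
assert _N == 74
```

Each `assert` stops the do-file with an error if the stated behavior does not hold, so a clean run is the confirmation. Sorting on `make` before saving and comparing is a defensive choice so that `cf` compares the same cars row by row regardless of the order in which `sample` leaves the data.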