**[R] dstdize** -- Direct and indirect standardization

__Syntax__

Direct standardization

**dstdize** *charvar* *popvar* *stratavars* [*if*] [*in*]**,** **by(***groupvars***)**
[*dstdize_options*]

Indirect standardization

**istdize** *casevar_s* *popvar_s* *stratavars* [*if*] [*in*] **using** *filename***,**
{__pop__**vars(***casevar_p popvar_p***)** |
**rate(***ratevar_p *{*#*|*crudevar_p*}**)**} [*istdize_options*]

*charvar* is the characteristic to be standardized across different
subpopulations identified by *groupvars*.

*popvar* defines the weights used in standardization.

*stratavars* defines the strata across which the weights are to be averaged
in **dstdize**. For **istdize**, *stratavars* defines the strata for which
*casevar_s* is measured.

*casevar_s* is the variable name for the study population's number of
cases. If **by(***groupvars***)** is specified, *casevar_s* must be constant or
missing within each group defined by combinations of *groupvars*.

*popvar_s* identifies the number of subjects in each stratum of the study
population.

*filename* must be a Stata dataset and contain *popvar* and *stratavars*.

*dstdize_options*                Description
-------------------------------------------------------------------------
Main
* **by(***groupvars***)**               study populations
  __us__**ing(***filename***)**         use standard population from Stata dataset
  __ba__**se(***#*|*string***)**        use standard population from a value of grouping variable
  __l__**evel(***#***)**                set confidence level; default is **level(95)**

Options
  __sav__**ing(***filename***)**        save computed standard population distribution as a Stata dataset
  __f__**ormat(***%fmt***)**            final summary table display format; default is **%10.0g**
  __pr__**int**                     include table summary of standard population in output
  **nores**                         suppress storing results in **r()**
-------------------------------------------------------------------------
* **by(***groupvars***)** is required.

*istdize_options*                Description
-------------------------------------------------------------------------
Main
* __pop__**vars(***casevar_p popvar_p***)**    for standard population, *casevar_p* is number of cases and *popvar_p* is number of individuals
* **rate(***ratevar_p *{*#*|*crudevar_p*}**)**    *ratevar_p* is stratum-specific rates and *#* or *crudevar_p* is the crude case rate value or variable
  __l__**evel(***#***)**                set confidence level; default is **level(95)**

Options
  **by(***groupvars***)**               variables identifying study populations
  __f__**ormat(***%fmt***)**            final summary table display format; default is **%10.0g**
  __pr__**int**                     include table summary of standard population in output
-------------------------------------------------------------------------
* Either **popvars(***casevar_p popvar_p***)** or **rate(***ratevar_p *{*#*|*crudevar_p*}**)** must be specified.

__Menu__

__dstdize__

**Statistics > Epidemiology and related > Other > Direct**
**standardization**

__istdize__

**Statistics > Epidemiology and related > Other > Indirect**
**standardization**

__Description__

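**dstdize** produces directly standardized rates for *charvar*. Each
standardized rate is a weighted average of the stratum-specific rates, with
the weights taken from the standard population.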
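
As a sketch of the weighted average involved, the directly standardized rate for one study population can be written as below. The notation is an assumption introduced here for illustration and does not appear elsewhere in this help file: *g* indexes the populations identified by **by()**, *k* indexes the strata formed by *stratavars*, d_gk and n_gk are the *charvar* and *popvar* totals for population *g* in stratum *k*, and N_k is the standard population's size in stratum *k*.

```latex
\[
\hat{R}_g \;=\; \sum_{k=1}^{K} w_k \, p_{gk},
\qquad
w_k \;=\; \frac{N_k}{\sum_{j=1}^{K} N_j},
\qquad
p_{gk} \;=\; \frac{d_{gk}}{n_{gk}}
\]
\[
\text{crude rate}_g \;=\; \frac{\sum_{k=1}^{K} d_{gk}}{\sum_{k=1}^{K} n_{gk}}
\]
```

The crude rate weights each stratum by the study population's own distribution, whereas the standardized rate applies the same weights w_k to every population, which is what makes the standardized rates comparable across groups.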

**istdize** produces indirectly standardized rates that are appropriate when
the stratum-specific rates for the population being studied are either
unavailable or unreliable.

**istdize** also calculates point estimates and exact confidence intervals
for the study population's standardized mortality ratio (SMR) or
standardized incidence ratio (SIR).
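
The quantities involved in indirect standardization can be sketched with the usual textbook definitions. The notation is an assumption introduced here for illustration: D is the observed number of cases in the study population (*casevar_s*), n_k is the study population's size in stratum *k* (*popvar_s*), λ_k is the standard population's rate in stratum *k* (either *ratevar_p* or *casevar_p*/*popvar_p*), and c is the standard population's crude rate.

```latex
\[
E \;=\; \sum_{k=1}^{K} n_k \, \lambda_k ,
\qquad
\mathrm{SMR} \;=\; \frac{D}{E},
\qquad
\hat{R}_{\text{indirect}} \;=\; c \times \mathrm{SMR}
\]
```

Only the total D is needed from the study population, which is why the method remains usable when its stratum-specific rates are unavailable or unreliable.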

__Options for dstdize__

+------+
----+ Main +-------------------------------------------------------------

**by(***groupvars***)** is required for the **dstdize** command; it specifies the
variables identifying the study populations. If **base()** is also
specified, there must be only one variable in the **by()** group. If you
do not have a variable for this option, you can generate one by using
something like **generate newvar=1** and then use **newvar** as the argument
to this option.

**using(***filename***)** or **base(***#*|*string***)** may be used to specify the standard
population; you may not specify both. **using(***filename***)** supplies
the name of a **.dta** file containing the standard population, which
must contain the *popvar* and the *stratavars*. **base(***#*|*string***)** lets
you specify one of the values of the **by()** variable -- either a
numeric value or a string -- to be used as the standard population.
If neither **base()** nor **using()** is specified, the standard
population distribution is estimated from the entire dataset.
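
The three ways of choosing the standard population might be compared as in the following sketch. The dataset, the variable names (**group**, **age**, **cases**, **pop**), and the file name **stdpop_demo** are hypothetical and chosen only for illustration; the counts are made up.

```stata
* Hypothetical toy data: case and population counts by group and age stratum
clear
input byte group byte age int cases int pop
1 1 10 1000
1 2 30 1000
2 1  5 1500
2 2 20  500
end

* Default: the standard is the pooled age distribution of all groups
dstdize cases pop age, by(group)

* base(): use the age distribution of group 1 as the standard
* (allowed here because by() names a single variable)
dstdize cases pop age, by(group) base(1)

* using(): take the standard from a separate dataset that contains
* the stratum variable (age) and the population variable (pop)
preserve
clear
input byte age long pop
1 60000
2 40000
end
save stdpop_demo, replace
restore
dstdize cases pop age, by(group) using(stdpop_demo) print
```

With **base(1)**, the weights are group 1's own age distribution, so group 1's adjusted rate should coincide with its crude rate; that can serve as a quick sanity check on the setup.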

**level(***#***)** specifies the confidence level, as a percentage, for a
confidence interval of the adjusted rate. The default is **level(95)**
or as set by **set level**.

+---------+
----+ Options +----------------------------------------------------------

**saving(***filename***)** saves the computed standard population distribution as a
Stata dataset that can be used in further analyses.

**format(***%fmt***)** specifies the format in which to display the final summary
table. The default is **%10.0g**.

**print** includes a table summary of the standard population before
displaying the study population results.

**nores** suppresses storing results in **r()**. This option is seldom
specified. Some results are stored in matrices. If there are more
groups than **matsize**, **dstdize** will report "matsize too small". Then
you can either increase **matsize** or specify **nores**. The **nores** option
does not change how results are calculated but specifies that results
need not be left behind for use by other programs.

__Options for istdize__

+------+
----+ Main +-------------------------------------------------------------

**popvars(***casevar_p popvar_p***)** or **rate(***ratevar_p *{*#*|*crudevar_p*}**)** must be
specified with **istdize**. Only one of these two options is allowed.
These options are used to describe the standard population's data.

With **popvars(***casevar_p popvar_p***)**, *casevar_p* records the number of
cases (deaths) for each stratum in the standard population, and
*popvar_p* records the total number of individuals in each stratum
(individuals at risk).

With **rate(***ratevar_p *{*#*|*crudevar_p*}**)**, *ratevar_p* contains the
stratum-specific rates of the standard population. *#*|*crudevar_p*
specifies the standard population's crude case rate, either as a
number or as the name of a variable containing it. If a crude rate
variable is used, it must have the same value in all observations,
although it may be missing in some.
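
The two ways of describing the standard population might look as follows. This is a sketch under stated assumptions: the datasets, the variable names (**rate**, **stdcases**, **stdpop**), the file names **stdrates_demo** and **stdcounts_demo**, and the crude rate 0.02 are all hypothetical.

```stata
* Hypothetical standard population given as stratum-specific rates
clear
input byte age double rate
1 0.01
2 0.03
end
save stdrates_demo, replace

* Hypothetical standard population given as case and population counts
clear
input byte age long stdcases long stdpop
1  600 60000
2 1200 40000
end
save stdcounts_demo, replace

* Hypothetical study population: stratum sizes are known, but only the
* total number of cases (30) is available, repeated on every row
clear
input byte age int pop int cases
1 1000 30
2  500 30
end

* rate(): stratum rates come from variable rate; 0.02 is an assumed crude rate
istdize cases pop age using stdrates_demo, rate(rate 0.02)

* popvars(): stratum rates and the crude rate are derived from the counts
istdize cases pop age using stdcounts_demo, popvars(stdcases stdpop)

* Hand check of the expected number of cases: 1000*0.01 + 500*0.03
display "expected cases = " 1000*0.01 + 500*0.03
```

Both standard datasets imply the same stratum rates (0.01 and 0.03), so the expected number of cases should agree between the two calls; only the crude rate, and hence the indirectly adjusted rate, differs.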

**level(***#***)** specifies the confidence level, as a percentage, for a
confidence interval of the adjusted rate. The default is **level(95)**
or as set by **set level**.

+---------+
----+ Options +----------------------------------------------------------

**by(***groupvars***)** specifies variables identifying study populations when more
than one exists in the data. If this option is not specified, the
entire study population is treated as one group.

**format(***%fmt***)** specifies the format in which to display the final summary
table. The default is **%10.0g**.

**print** outputs a table summary of the standard population before
displaying the study population results.

__Examples__

---------------------------------------------------------------------------
Setup
**. webuse hbp**
**. generate pop = 1**

Obtain standardized rates of **hbp** by **city** and **year**, using the **age**, **race**,
and **sex** distribution of the cities and years combined as the standard
**. dstdize hbp pop age race sex, by(city year)**

---------------------------------------------------------------------------
Setup
**. webuse kahn, clear**

Obtain indirectly standardized mortality rates by **state** using the
standard population saved in **popkahn.dta**
**. istdize death pop age using http://www.stata-press.com/data/r15/popkahn,**
**by(state) pop(deaths pop) print**
---------------------------------------------------------------------------

__Stored results__

**dstdize** stores the following in **r()**:

Scalars
**r(k)** number of populations

Macros
**r(by)** variable names specified in **by()**
**r(c***#***)** values of **r(by)** for *#*th group

Matrices
**r(se)**        1 x k vector of standard errors of adjusted rates
**r(ub_adj)**    1 x k vector of upper bounds of confidence intervals for adjusted rates
**r(lb_adj)**    1 x k vector of lower bounds of confidence intervals for adjusted rates
**r(Nobs)**      1 x k vector of number of observations
**r(crude)**     1 x k vector of crude rates (*)
**r(adj)**       1 x k vector of adjusted rates (*)

(*) If, in a group, the number of observations is 0, then 9 is stored for
the corresponding crude and adjusted rates.
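
The stored results listed above might be retrieved as in the following sketch, which is meant to be run from a do-file (it uses `///` line continuation). The toy dataset and the local matrix names are hypothetical.

```stata
* Hypothetical toy data: case and population counts by group and age stratum
clear
input byte group byte age int cases int pop
1 1 10 1000
1 2 30 1000
2 1  5 1500
2 2 20  500
end

dstdize cases pop age, by(group)

* Copy results out of r() before a later r-class command replaces them
local k = r(k)
matrix crude = r(crude)
matrix adj   = r(adj)
matrix lb    = r(lb_adj)
matrix ub    = r(ub_adj)

* One line per study population: crude rate, adjusted rate, and CI
forvalues j = 1/`k' {
    display "population `j': crude = " %6.4f crude[1,`j'] ///
        "  adjusted = " %6.4f adj[1,`j']                  ///
        "  CI = [" %6.4f lb[1,`j'] ", " %6.4f ub[1,`j'] "]"
}
```

Copying the matrices first is a defensive choice: **r()** holds only the results of the most recent r-class command.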

**istdize** stores the following in **r()**:

Scalars
**r(k)** number of populations

Macros
**r(by)** variable names specified in **by()**
**r(c***#***)** values of **r(by)** for *#*th group

Matrices
**r(cases_obs)** 1 x k vector of number of observed cases
**r(cases_exp)** 1 x k vector of number of expected cases
**r(ub_adj)**    1 x k vector of upper bounds of confidence intervals for adjusted rates
**r(lb_adj)**    1 x k vector of lower bounds of confidence intervals for adjusted rates
**r(crude)**     1 x k vector of crude rates
**r(adj)**       1 x k vector of adjusted rates
**r(smr)**       1 x k vector of SMRs
**r(ub_smr)**    1 x k vector of upper bounds of confidence intervals for SMRs
**r(lb_smr)**    1 x k vector of lower bounds of confidence intervals for SMRs
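
A similar sketch for **istdize** with two study populations is given below, again intended for a do-file. The datasets, the file name **stdrates_by_demo**, the crude rate 0.02, and the local matrix names are hypothetical.

```stata
* Hypothetical standard population: stratum-specific rates by age
clear
input byte age double rate
1 0.01
2 0.03
end
save stdrates_by_demo, replace

* Hypothetical study data: two groups; cases holds each group's total,
* constant within the group as by() requires
clear
input byte group byte age int pop int cases
1 1 1000 30
1 2  500 30
2 1  800 12
2 2  200 12
end

istdize cases pop age using stdrates_by_demo, by(group) rate(rate 0.02)

* Copy results out of r() before a later r-class command replaces them
local k = r(k)
matrix obs  = r(cases_obs)
matrix expd = r(cases_exp)
matrix smr  = r(smr)
matrix lb   = r(lb_smr)
matrix ub   = r(ub_smr)

* One line per study population: observed, expected, SMR, and its CI
forvalues j = 1/`k' {
    display "population `j': observed = " obs[1,`j']  ///
        "  expected = " %6.2f expd[1,`j']             ///
        "  SMR = " %5.3f smr[1,`j']                   ///
        "  CI = [" %5.3f lb[1,`j'] ", " %5.3f ub[1,`j'] "]"
}
```

By hand, the expected counts for these made-up data are 1000*0.01 + 500*0.03 = 25 and 800*0.01 + 200*0.03 = 14, which gives a reference point for reading the displayed values.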