__Title__

**[SVY] svy: tabulate twoway** -- Two-way tables for survey data

__Syntax__

Basic syntax

**svy:** __tab__**ulate** *varname1* *varname2*

Full syntax

**svy** [*vcetype*] [**,** *svy_options*] **:** __tab__**ulate** *varname1* *varname2* [*if*] [*in*]
[**,** *tabulate_options* *display_items* *display_options*
*statistic_options*]

Syntax to report results

**svy** [**,** *display_items* *display_options* *statistic_options*]

*vcetype* Description
-------------------------------------------------------------------------
SE
__linear__**ized** Taylor-linearized variance estimation
**bootstrap** bootstrap variance estimation; see **[SVY] svy bootstrap**
**brr** BRR variance estimation; see **[SVY] svy brr**
__jack__**knife** jackknife variance estimation; see **[SVY] svy jackknife**
**sdr** SDR variance estimation; see **[SVY] svy sdr**
-------------------------------------------------------------------------
Specifying a *vcetype* overrides the default from **svyset**.

*svy_options* Description
-------------------------------------------------------------------------
if/in
__sub__**pop(**[*varname*] [*if*]**)** identify a subpopulation

SE
*bootstrap_options* more options allowed with bootstrap variance estimation
*brr_options* more options allowed with BRR variance estimation
*jackknife_options* more options allowed with jackknife variance estimation
*sdr_options* more options allowed with SDR variance estimation
-------------------------------------------------------------------------
**svy** requires that the survey design variables be identified using **svyset**;
see **[SVY] svyset**.
See **[SVY] svy postestimation** for features available after estimation.
Warning: Using **if** or **in** restrictions will often not produce correct
variance estimates for subpopulations. To compute estimates for
subpopulations, use the **subpop()** option.

*tabulate_options* Description
-------------------------------------------------------------------------
Model
__std__**ize(***varname***)** variable identifying strata for standardization
__stdw__**eight(***varname***)** weight variable for standardization
**tab(***varname***)** variable for which to compute cell totals/proportions
__miss__**ing** treat missing values like other values
-------------------------------------------------------------------------

*display_items* Description
-------------------------------------------------------------------------
Table items
__cel__**l** cell proportions
__cou__**nt** weighted cell counts
__col__**umn** within-column proportions
**row** within-row proportions
**se** standard errors
**ci** confidence intervals
**deff** display the DEFF design effects
**deft** display the DEFT design effects
**cv** display the coefficient of variation
__srs__**subpop** report design effects assuming SRS within subpopulation
**obs** cell observations
-------------------------------------------------------------------------
When any of **se**, **ci**, **deff**, **deft**, **cv**, or **srssubpop** is specified, only one
of **cell**, **count**, **column**, or **row** can be specified. If none of **se**, **ci**,
**deff**, **deft**, **cv**, or **srssubpop** is specified, any of or all **cell**, **count**,
**column**, and **row** can be specified.

*display_options* Description
-------------------------------------------------------------------------
Reporting
__l__**evel(***#***)** set confidence level; default is **level(95)**
__prop__**ortion** display proportions; the default
__per__**cent** display percentages instead of proportions
__vert__**ical** stack confidence interval endpoints vertically
__nomarg__**inals** suppress row and column marginals
__nolab__**el** suppress displaying value labels
__notab__**le** suppress displaying the table
__cellw__**idth(***#***)** cell width
__csepw__**idth(***#***)** column-separation width
__stubw__**idth(***#***)** stub width
__for__**mat(***%fmt***)** cell format; default is **format(%6.0g)**
-------------------------------------------------------------------------
**proportion** and **notable** are not shown in the dialog box.

*statistic_options* Description
-------------------------------------------------------------------------
Test statistics
__pea__**rson** Pearson's chi-squared
**lr** likelihood ratio
__nul__**l** display null-based statistics
**wald** adjusted Wald
**llwald** adjusted log-linear Wald
__noadj__**ust** report unadjusted Wald statistics
-------------------------------------------------------------------------

__Menu__

**Statistics > Survey data analysis > Tables > Two-way tables**

__Description__

**svy: tabulate** produces two-way tabulations with tests of independence for
complex survey data. See **[SVY] svy: tabulate oneway** for one-way
tabulations for complex survey data.

__Options__

*svy_options*; see **[SVY] svy**.

+-------+
----+ Model +------------------------------------------------------------

**stdize(***varname***)** specifies that the point estimates be adjusted by direct
standardization across the strata identified by *varname*. This option
requires the **stdweight()** option.

**stdweight(***varname***)** specifies the weight variable associated with the
standard strata identified in the **stdize()** option. The
standardization weights must be constant within the standard strata.

**tab(***varname***)** specifies that counts be cell totals of this variable and
proportions (or percentages) be relative to (that is, weighted by)
this variable. For example, if this variable denotes income, the
cell "counts" are instead totals of income for each cell, and the
cell proportions are proportions of income for each cell.

**missing** specifies that missing values of *varname1* and *varname2* be treated
as another row or column category rather than be omitted from the
analysis (the default).

+-------------+
----+ Table items +------------------------------------------------------

**cell** requests that cell proportions (or percentages) be displayed. This
is the default if none of **count**, **row**, or **column** is specified.

**count** requests that weighted cell counts be displayed.

**column** or **row** requests that column or row proportions (or percentages) be
displayed.

**se** requests that the standard errors of cell proportions (the default),
weighted counts, or row or column proportions be displayed. When **se**
(or **ci**, **deff**, **deft**, or **cv**) is specified, only one of **cell**, **count**,
**row**, or **column** can be selected. The standard error computed is the
standard error of the one selected.

**ci** requests confidence intervals for cell proportions, weighted counts,
or row or column proportions. The confidence intervals are
constructed using a logit transform so that their endpoints always
lie between 0 and 1.
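
As a rough illustration of the logit-transform construction for a proportion, the sketch below works through the arithmetic by hand. The point estimate, standard error, and degrees of freedom are made-up values chosen only for illustration, and the use of a t critical value on the design degrees of freedom is an assumption about the details; see *Methods and formulas* in **[SVY] svy: tabulate twoway** for the authoritative formula.

```stata
* Illustrative inputs only; these are not results from any dataset
scalar p_hat  = 0.10
scalar se_hat = 0.02
* Assumed design degrees of freedom (sampled PSUs minus strata)
scalar df_des = 31

* Move to the logit scale; the delta-method standard error is se/(p(1-p))
scalar f_hat  = logit(p_hat)
scalar se_f   = se_hat/(p_hat*(1 - p_hat))
scalar t_crit = invttail(df_des, 0.025)

* Build the interval on the logit scale, then map the endpoints back
display "lower = " invlogit(f_hat - t_crit*se_f)
display "upper = " invlogit(f_hat + t_crit*se_f)
```

Because **invlogit()** maps any real number into the interval (0, 1), both endpoints stay inside that range no matter how large the standard error is.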

**deff** and **deft** request that the design-effect measures DEFF and DEFT be
displayed for each cell proportion, count, or row or column
proportion. See **[SVY] estat** for details. The mean generalized DEFF
is also displayed when **deff**, **deft**, or **subpop** is requested; see
*Methods and formulas* in **[SVY] svy: tabulate twoway** for an
explanation.

The **deff** and **deft** options are not allowed with estimation results
that used direct standardization or poststratification.

**cv** requests that the coefficient of variation be displayed for each cell
proportion, count, or row or column proportion. See **[SVY] estat** for
details.

**srssubpop** requests that DEFF and DEFT be computed using an estimate of
SRS (simple random sampling) variance for sampling within a
subpopulation. By default, DEFF and DEFT are computed using an
estimate of the SRS variance for sampling from the entire population.
Typically, **srssubpop** would be given when computing subpopulation
estimates by strata or by groups of strata.
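
A minimal sketch of the difference, assuming the **nhanes2b** dataset used in the examples in this entry and a subpopulation indicator **male** built the same way as there. The first call computes DEFF relative to SRS from the entire population; the second adds **srssubpop** so that the SRS variance refers to sampling within the subpopulation.

```stata
webuse nhanes2b, clear
* Subpopulation indicator: 1 for males, 0 for females, missing if sex is missing
gen male = (sex==1) if !missing(sex)

* Default: SRS variance for sampling from the entire population
svy, subpop(male): tabulate highbp sizplace, col deff

* SRS variance for sampling within the subpopulation
svy, subpop(male): tabulate highbp sizplace, col deff srssubpop
```

Only **col** is requested with **deff** because just one of **cell**, **count**, **column**, or **row** can be specified when design effects are displayed.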

**obs** requests that the number of observations for each cell be displayed.

+-----------+
----+ Reporting +--------------------------------------------------------

**level(***#***)** specifies the confidence level, as a percentage, for confidence
intervals. The default is **level(95)** or as set by **set level**.

**proportion**, the default, requests that proportions be displayed.

**percent** requests that percentages be displayed instead of proportions.

**vertical** requests that the endpoints of the confidence intervals be
stacked vertically on display.

**nomarginals** requests that row and column marginals not be displayed.

**nolabel** requests that variable labels and value labels be ignored.

**notable** prevents the header and table from being displayed in the output.
When specified, only the results of the requested test statistics are
displayed. This option may not be specified with any other option in
*display_options* except the **level()** option.

**cellwidth(***#***)**, **csepwidth(***#***)**, and **stubwidth(***#***)** specify widths of table
elements in the output; see **[P] tabdisp**. Acceptable values for the
**stubwidth()** option range from 4 to 32.

**format(***%fmt***)** specifies a format for the items in the table. The default
is **format(%6.0g)**. See **[U] 12.5 Formats: Controlling how data are**
**displayed**.

+-----------------+
----+ Test statistics +--------------------------------------------------

**pearson** requests that the Pearson chi-squared statistic be computed. By
default, this is the test of independence that is displayed. The
Pearson chi-squared statistic is corrected for the survey design with
the second-order correction of Rao and Scott (1984) and is converted
into an F statistic. One term in the correction formula can be
calculated using either observed cell proportions or proportions
under the null hypothesis (that is, the product of the marginals).
By default, observed cell proportions are used. If the **null** option
is selected, then a statistic corrected using proportions under the
null hypothesis is displayed as well.

**lr** requests that the likelihood-ratio test statistic for proportions be
computed. This statistic is not defined when there are one or more
zero cells in the table. The statistic is corrected for the survey
design by using the same correction procedure that is used with the
**pearson** statistic. Again, either observed cell proportions or
proportions under the null hypothesis can be used in the correction
formula. By default, the former is used; specifying the **null** option
gives both the former and the latter. Neither variant of this
statistic is recommended for sparse tables. For nonsparse tables,
the **lr** statistics are similar to the corresponding **pearson**
statistics.

**null** modifies the **pearson** and **lr** options only. If **null** is specified, two
corrected statistics are displayed. The statistic labeled "D-B
(null)" ("D-B" stands for design-based) uses proportions under the
null hypothesis (that is, the product of the marginals) in the Rao
and Scott (1984) correction. The statistic labeled merely
"Design-based" uses observed cell proportions. If **null** is not
specified, only the correction that uses observed proportions is
displayed.
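
The two corrections can be compared side by side from the stored results. The sketch below assumes the **nhanes2b** dataset from the examples in this entry and uses the result names listed under Stored results; it is meant only to show where each statistic ends up.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, pearson null

* Correction based on observed cell proportions (labeled Design-based)
display "Design-based F = " e(F_Pear) "   p = " e(p_Pear)

* Correction based on proportions under the null (labeled D-B (null))
display "D-B (null) F   = " e(F_Penl) "   p = " e(p_Penl)
```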

**wald** requests a Wald test of whether observed weighted counts equal the
product of the marginals (Koch, Freeman, and Freeman 1975). By
default, an adjusted F statistic is produced; an unadjusted statistic
can be produced by specifying **noadjust**. The unadjusted F statistic
can yield extremely anticonservative p-values (that is, p-values that
are too small) when the degrees of freedom of the variance estimates
(the number of sampled PSUs minus the number of strata) are small
relative to the (R-1)(C-1) degrees of freedom of the table (where R
is the number of rows and C is the number of columns). Hence, the
statistic produced by **wald** and **noadjust** should not be used for
inference unless it is essentially identical to the adjusted
statistic.

This option must be specified at run time in order to be used on
subsequent calls to **svy** to report results.

**llwald** requests a Wald test of the log-linear model of independence
(Koch, Freeman, and Freeman 1975). The statistic is not defined when
there are one or more zero cells in the table. The adjusted
statistic (the default) can produce anticonservative p-values,
especially for sparse tables, when the degrees of freedom of the
variance estimates are small relative to the degrees of freedom of
the table. Specifying **noadjust** yields a statistic with more severe
problems. Neither the adjusted nor the unadjusted statistic is
recommended for inference; the statistics are made available only for
pedagogical purposes.

**noadjust** modifies the **wald** and **llwald** options only. It requests that an
unadjusted F statistic be displayed in addition to the adjusted
statistic.
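
To judge whether the unadjusted Wald statistic is at risk, one can compare the design degrees of freedom with the degrees of freedom of the table. The sketch below assumes the **nhanes2b** dataset from the examples in this entry and relies on **e(N_psu)** and **e(N_strata)**, which are among the results documented in **[SVY] svy**.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, wald noadjust

* Degrees of freedom of the variance estimates: sampled PSUs minus strata
display "design d.f. = " e(N_psu) - e(N_strata)

* Degrees of freedom of the table: (R-1)(C-1)
display "table d.f.  = " (e(r) - 1)*(e(c) - 1)

* Adjusted and unadjusted Wald F statistics with their p-values
display "adjusted F   = " e(F_Wald)   "   p = " e(p_Wald)
display "unadjusted F = " e(Fun_Wald) "   p = " e(pun_Wald)
```

When the design degrees of freedom are large relative to the table degrees of freedom, the two F statistics should be close; a sizable gap is the situation in which the unadjusted statistic should not be used for inference.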

**svy: tabulate** uses the **tabdisp** command (see **[P] tabdisp**) to produce the
table. Only five items can be displayed in the table at one time. The **ci**
option implies two items. If too many items are selected, a warning will
appear immediately. To view more items, redisplay the table while
specifying different options.

__Examples__

**. webuse nhanes2b**
**. svy: tabulate race diabetes**
**. svy: tabulate, row**
**. svy: tabulate race diabetes, row se ci format(%7.4f)**

**. webuse svy_tabopt**
**. svyset psuid [pweight=finalwgt], strata(stratid)**
**. svy: tabulate gender race, tab(income) row**

**. webuse nhanes2b**
**. gen male = (sex==1) if !missing(sex)**
**. svy, subpop(male): tabulate highbp sizplace, col obs pearson lr null wald**

__Stored results__

In addition to the results documented in **[SVY] svy**, **svy: tabulate** stores
the following in **e()**:

Scalars
**e(r)** number of rows
**e(c)** number of columns
**e(cvgdeff)** coefficient of variation of generalized DEFF eigenvalues
**e(mgdeff)** mean generalized DEFF
**e(total)** weighted sum of **tab()** variable

**e(F_Pear)** default-corrected Pearson F
**e(F_Penl)** null-corrected Pearson F
**e(df1_Pear)** numerator d.f. for **e(F_Pear)**
**e(df2_Pear)** denominator d.f. for **e(F_Pear)**
**e(df1_Penl)** numerator d.f. for **e(F_Penl)**
**e(df2_Penl)** denominator d.f. for **e(F_Penl)**
**e(p_Pear)** p-value for **e(F_Pear)**
**e(p_Penl)** p-value for **e(F_Penl)**
**e(cun_Pear)** uncorrected Pearson chi-squared
**e(cun_Penl)** null variant uncorrected Pearson chi-squared

**e(F_LR)** default-corrected likelihood-ratio F
**e(F_LRnl)** null-corrected likelihood-ratio F
**e(df1_LR)** numerator d.f. for **e(F_LR)**
**e(df2_LR)** denominator d.f. for **e(F_LR)**
**e(df1_LRnl)** numerator d.f. for **e(F_LRnl)**
**e(df2_LRnl)** denominator d.f. for **e(F_LRnl)**
**e(p_LR)** p-value for **e(F_LR)**
**e(p_LRnl)** p-value for **e(F_LRnl)**
**e(cun_LR)** uncorrected likelihood-ratio chi-squared
**e(cun_LRnl)** null variant uncorrected likelihood-ratio chi-squared

**e(F_Wald)** adjusted "Pearson" Wald F
**e(F_LLW)** adjusted log-linear Wald F
**e(p_Wald)** p-value for **e(F_Wald)**
**e(p_LLW)** p-value for **e(F_LLW)**
**e(Fun_Wald)** unadjusted "Pearson" Wald F
**e(Fun_LLW)** unadjusted log-linear Wald F
**e(pun_Wald)** p-value for **e(Fun_Wald)**
**e(pun_LLW)** p-value for **e(Fun_LLW)**
**e(cun_Wald)** unadjusted "Pearson" Wald chi-squared
**e(cun_LLW)** unadjusted log-linear Wald chi-squared

Macros
**e(cmd)** **tabulate**
**e(tab)** **tab()** variable
**e(rowlab)** label or empty
**e(collab)** label or empty
**e(rowvlab)** row variable label
**e(colvlab)** column variable label
**e(rowvar)** *varname1*, the row variable
**e(colvar)** *varname2*, the column variable
**e(setype)** **cell**, **count**, **column**, or **row**

Matrices
**e(Prop)** matrix of cell proportions
**e(Obs)** matrix of observation counts
**e(Deff)** DEFF vector for **e(setype)** items
**e(Deft)** DEFT vector for **e(setype)** items
**e(Row)** values for row variable
**e(Col)** values for column variable
**e(V_row)** variance for row totals
**e(V_col)** variance for column totals
**e(V_srs_row)** V_srs for row totals
**e(V_srs_col)** V_srs for column totals
**e(Deff_row)** DEFF for row totals
**e(Deff_col)** DEFF for column totals
**e(Deft_row)** DEFT for row totals
**e(Deft_col)** DEFT for column totals
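
A short sketch of how these results might be inspected after estimation, assuming the **nhanes2b** dataset from the examples above. Only result names listed in this section are used.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, row se

* Table dimensions and the item for which standard errors were computed
display "rows = " e(r) ", columns = " e(c)
display "setype = `e(setype)'"

* Cell proportions and per-cell observation counts
matrix list e(Prop)
matrix list e(Obs)

* Values taken by the row and column variables
matrix list e(Row)
matrix list e(Col)
```

**ereturn list** shows everything stored in **e()**, including the results inherited from **svy**.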

__References__

Koch, G. G., D. H. Freeman Jr., and J. L. Freeman. 1975. Strategies in
the multivariate analysis of data from complex surveys.
*International Statistical Review* 43: 59-78.

Rao, J. N. K., and A. J. Scott. 1984. On chi-squared tests for multiway
contingency tables with cell proportions estimated from survey data.
*Annals of Statistics* 12: 46-60.