Stata 15 help for svy tabulate twoway


[SVY] svy: tabulate twoway -- Two-way tables for survey data


Basic syntax

svy: tabulate varname1 varname2

Full syntax

svy [vcetype] [, svy_options] : tabulate varname1 varname2 [if] [in] [, tabulate_options display_items display_options statistic_options]

Syntax to report results

svy [, display_items display_options statistic_options]

vcetype              Description
-------------------------------------------------------------------------
SE
  linearized         Taylor-linearized variance estimation
  bootstrap          bootstrap variance estimation; see [SVY] svy bootstrap
  brr                BRR variance estimation; see [SVY] svy brr
  jackknife          jackknife variance estimation; see [SVY] svy jackknife
  sdr                SDR variance estimation; see [SVY] svy sdr
-------------------------------------------------------------------------
Specifying a vcetype overrides the default from svyset.

svy_options               Description
-------------------------------------------------------------------------
if/in
  subpop([varname] [if])  identify a subpopulation

SE
  bootstrap_options       more options allowed with bootstrap variance estimation
  brr_options             more options allowed with BRR variance estimation
  jackknife_options       more options allowed with jackknife variance estimation
  sdr_options             more options allowed with SDR variance estimation
-------------------------------------------------------------------------
svy requires that the survey design variables be identified using svyset; see [SVY] svyset.
See [SVY] svy postestimation for features available after estimation.
Warning: Using if or in restrictions will often not produce correct variance estimates for subpopulations. To compute estimates for subpopulations, use the subpop() option.
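
The warning above can be illustrated with the nhanes2b data used in the examples of this entry. This is a minimal sketch: it assumes the survey design is already svyset in that dataset, and the indicator variable male is created here for illustration and is not part of the original data.

```stata
webuse nhanes2b, clear
generate male = (sex == 1) if !missing(sex)

* Preferred: every observation stays in the design for variance estimation
svy, subpop(male): tabulate highbp sizplace, col se

* Discouraged: if drops observations before variances are computed
svy: tabulate highbp sizplace if male == 1, col se
```

The point estimates from the two calls agree. The standard errors can differ, for instance when dropping observations removes entire PSUs or strata from the sample.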

tabulate_options       Description
-------------------------------------------------------------------------
Model
  stdize(varname)      variable identifying strata for standardization
  stdweight(varname)   weight variable for standardization
  tab(varname)         variable for which to compute cell totals/proportions
  missing              treat missing values like other values
-------------------------------------------------------------------------

display_items    Description
-------------------------------------------------------------------------
Table items
  cell           cell proportions
  count          weighted cell counts
  column         within-column proportions
  row            within-row proportions
  se             standard errors
  ci             confidence intervals
  deff           display the DEFF design effects
  deft           display the DEFT design effects
  cv             display the coefficient of variation
  srssubpop      report design effects assuming SRS within subpopulation
  obs            cell observations
-------------------------------------------------------------------------
When any of se, ci, deff, deft, cv, or srssubpop is specified, only one of cell, count, column, or row can be specified.
If none of se, ci, deff, deft, cv, or srssubpop is specified, any or all of cell, count, column, and row can be specified.

display_options    Description
-------------------------------------------------------------------------
Reporting
  level(#)         set confidence level; default is level(95)
  proportion       display proportions; the default
  percent          display percentages instead of proportions
  vertical         stack confidence interval endpoints vertically
  nomarginals      suppress row and column marginals
  nolabel          suppress displaying value labels
  notable          suppress displaying the table
  cellwidth(#)     cell width
  csepwidth(#)     column-separation width
  stubwidth(#)     stub width
  format(%fmt)     cell format; default is format(%6.0g)
-------------------------------------------------------------------------
proportion and notable are not shown in the dialog box.

statistic_options    Description
-------------------------------------------------------------------------
Test statistics
  pearson            Pearson's chi-squared
  lr                 likelihood ratio
  null               display null-based statistics
  wald               adjusted Wald
  llwald             adjusted log-linear Wald
  noadjust           report unadjusted Wald statistics
-------------------------------------------------------------------------


Menu

Statistics > Survey data analysis > Tables > Two-way tables


Description

svy: tabulate produces two-way tabulations with tests of independence for complex survey data. See [SVY] svy: tabulate oneway for one-way tabulations for complex survey data.


Options

svy_options; see [SVY] svy.

    +-------+
----+ Model +------------------------------------------------------------

stdize(varname) specifies that the point estimates be adjusted by direct standardization across the strata identified by varname. This option requires the stdweight() option.

stdweight(varname) specifies the weight variable associated with the standard strata identified in the stdize() option. The standardization weights must be constant within the standard strata.
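
As a minimal sketch of how stdize() and stdweight() work together, the commands below standardize a race-by-diabetes table to a hypothetical population that is half male and half female. The variable stdw and the 50/50 split are assumptions made for illustration; they are not part of the nhanes2b data or of any published standard population.

```stata
webuse nhanes2b, clear

* Hypothetical standard weights: 0.5 for each sex, constant within stratum
generate double stdw = 0.5 if !missing(sex)

* Row proportions adjusted by direct standardization across sex
svy: tabulate race diabetes, stdize(sex) stdweight(stdw) row
```

Because stdw takes one value within each level of sex, it satisfies the requirement that standardization weights be constant within the standard strata.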

tab(varname) specifies that counts be cell totals of this variable and proportions (or percentages) be relative to (that is, weighted by) this variable. For example, if this variable denotes income, the cell "counts" are instead totals of income for each cell, and the cell proportions are proportions of income for each cell.

missing specifies that missing values of varname1 and varname2 be treated as another row or column category rather than be omitted from the analysis (the default).

    +-------------+
----+ Table items +------------------------------------------------------

cell requests that cell proportions (or percentages) be displayed. This is the default if none of count, row, or column is specified.

count requests that weighted cell counts be displayed.

column or row requests that column or row proportions (or percentages) be displayed.

se requests that the standard errors of cell proportions (the default), weighted counts, or row or column proportions be displayed. When se (or ci, deff, deft, or cv) is specified, only one of cell, count, row, or column can be selected. The standard error computed is the standard error of the one selected.

ci requests confidence intervals for cell proportions, weighted counts, or row or column proportions. The confidence intervals are constructed using a logit transform so that their endpoints always lie between 0 and 1.
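
For a proportion, a logit-based interval can be sketched by hand. The values of p, se, and df below are made up for illustration, and the sketch assumes the usual delta-method standard error on the logit scale, se/(p(1-p)), with a t critical value on the design degrees of freedom. See Methods and formulas in [SVY] svy: tabulate twoway for the exact computation.

```stata
* Hypothetical inputs, not taken from any dataset
local p  = 0.10      // estimated proportion
local se = 0.02      // standard error of the proportion
local df = 31        // design degrees of freedom

local t  = invttail(`df', 0.025)         // critical value for a 95% interval
local sf = `se' / (`p' * (1 - `p'))      // delta-method SE on the logit scale
local lb = invlogit(logit(`p') - `t' * `sf')
local ub = invlogit(logit(`p') + `t' * `sf')

display "95% CI: [" %6.4f `lb' ", " %6.4f `ub' "]"
```

Because the endpoints are computed on the logit scale and then transformed back with invlogit(), they cannot fall outside the interval from 0 to 1, and the interval is not symmetric around p.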

deff and deft request that the design-effect measures DEFF and DEFT be displayed for each cell proportion, count, or row or column proportion. See [SVY] estat for details. The mean generalized DEFF is also displayed when deff, deft, or subpop is requested; see Methods and formulas in [SVY] svy: tabulate twoway for an explanation.

The deff and deft options are not allowed with estimation results that used direct standardization or poststratification.

cv requests that the coefficient of variation be displayed for each cell proportion, count, or row or column proportion. See [SVY] estat for details.

srssubpop requests that DEFF and DEFT be computed using an estimate of SRS (simple random sampling) variance for sampling within a subpopulation. By default, DEFF and DEFT are computed using an estimate of the SRS variance for sampling from the entire population. Typically, srssubpop would be given when computing subpopulation estimates by strata or by groups of strata.
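
The difference between the two SRS baselines can be seen by running the same subpopulation table twice. This is a sketch using the nhanes2b data from the examples; the indicator male is generated here for illustration.

```stata
webuse nhanes2b, clear
generate male = (sex == 1) if !missing(sex)

* Design effects relative to SRS from the entire population (the default)
svy, subpop(male): tabulate highbp sizplace, col deff deft

* Design effects relative to SRS within the subpopulation
svy, subpop(male): tabulate highbp sizplace, col deff deft srssubpop
```

Only one of cell, count, column, or row is requested in each call because deff and deft are specified.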

obs requests that the number of observations for each cell be displayed.

    +-----------+
----+ Reporting +--------------------------------------------------------

level(#) specifies the confidence level, as a percentage, for confidence intervals. The default is level(95) or as set by set level.

proportion, the default, requests that proportions be displayed.

percent requests that percentages be displayed instead of proportions.

vertical requests that the endpoints of the confidence intervals be stacked vertically on display.

nomarginals requests that row and column marginals not be displayed.

nolabel requests that variable labels and value labels be ignored.

notable prevents the header and table from being displayed in the output. When specified, only the results of the requested test statistics are displayed. This option may not be specified with any other option in display_options except the level() option.

cellwidth(#), csepwidth(#), and stubwidth(#) specify widths of table elements in the output; see [P] tabdisp. Acceptable values for the stubwidth() option range from 4 to 32.

format(%fmt) specifies a format for the items in the table. The default is format(%6.0g). See [U] 12.5 Formats: Controlling how data are displayed.
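
The reporting options can be combined in one call. The following sketch uses the nhanes2b data from the examples; the particular level, format, and stub width are arbitrary choices made for illustration.

```stata
webuse nhanes2b, clear

* Row percentages with 90% confidence intervals stacked vertically
svy: tabulate race diabetes, row ci percent vertical level(90) format(%7.2f)

* Row proportions without marginals or value labels, with a wider stub
svy: tabulate race diabetes, row nomarginals nolabel stubwidth(12)
```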

    +-----------------+
----+ Test statistics +--------------------------------------------------

pearson requests that the Pearson chi-squared statistic be computed. By default, this is the test of independence that is displayed. The Pearson chi-squared statistic is corrected for the survey design with the second-order correction of Rao and Scott (1984) and is converted into an F statistic. One term in the correction formula can be calculated using either observed cell proportions or proportions under the null hypothesis (that is, the product of the marginals). By default, observed cell proportions are used. If the null option is selected, then a statistic corrected using proportions under the null hypothesis is displayed as well.

lr requests that the likelihood-ratio test statistic for proportions be computed. This statistic is not defined when there are one or more zero cells in the table. The statistic is corrected for the survey design by using the same correction procedure that is used with the pearson statistic. Again, either observed cell proportions or proportions under the null hypothesis can be used in the correction formula. By default, the former is used; specifying the null option gives both the former and the latter. Neither variant of this statistic is recommended for sparse tables. For nonsparse tables, the lr statistics are similar to the corresponding pearson statistics.

null modifies the pearson and lr options only. If null is specified, two corrected statistics are displayed. The statistic labeled "D-B (null)" ("D-B" stands for design-based) uses proportions under the null hypothesis (that is, the product of the marginals) in the Rao and Scott (1984) correction. The statistic labeled merely "Design-based" uses observed cell proportions. If null is not specified, only the correction that uses observed proportions is displayed.

wald requests a Wald test of whether observed weighted counts equal the product of the marginals (Koch, Freeman, and Freeman 1975). By default, an adjusted F statistic is produced; an unadjusted statistic can be produced by specifying noadjust. The unadjusted F statistic can yield extremely anticonservative p-values (that is, p-values that are too small) when the degrees of freedom of the variance estimates (the number of sampled PSUs minus the number of strata) are small relative to the (R-1)(C-1) degrees of freedom of the table (where R is the number of rows and C is the number of columns). Hence, the statistic produced by wald and noadjust should not be used for inference unless it is essentially identical to the adjusted statistic.

This option must be specified at run time in order to be used on subsequent calls to svy to report results.

llwald requests a Wald test of the log-linear model of independence (Koch, Freeman, and Freeman 1975). The statistic is not defined when there are one or more zero cells in the table. The adjusted statistic (the default) can produce anticonservative p-values, especially for sparse tables, when the degrees of freedom of the variance estimates are small relative to the degrees of freedom of the table. Specifying noadjust yields a statistic with more severe problems. Neither the adjusted nor the unadjusted statistic is recommended for inference; the statistics are made available only for pedagogical purposes.

noadjust modifies the wald and llwald options only. It requests that an unadjusted F statistic be displayed in addition to the adjusted statistic.
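
As a sketch of how the test-statistic options combine, the commands below request several tests at once and then read the default-corrected Pearson results back from e(). The table of highbp by sizplace is taken from the examples in this entry; the display formats are arbitrary.

```stata
webuse nhanes2b, clear

* Pearson and likelihood-ratio tests with both corrections, plus Wald tests
svy: tabulate highbp sizplace, col pearson lr null wald noadjust

* Default-corrected Pearson F statistic and its p-value
display "F(" %5.2f e(df1_Pear) ", " %7.2f e(df2_Pear) ") = " %8.4f e(F_Pear)
display "p-value = " %6.4f e(p_Pear)

* Test results only, without the header and table
svy: tabulate highbp sizplace, notable pearson
```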

svy: tabulate uses the tabdisp command (see [P] tabdisp) to produce the table. Only five items can be displayed in the table at one time. The ci option implies two items. If too many items are selected, a warning will appear immediately. To view more items, redisplay the table while specifying different options.
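
One way to stay under the five-item limit is to estimate once and then redisplay. The sketch below follows the reporting syntax given at the top of this entry; the choice of items and the format are assumptions made for illustration.

```stata
webuse nhanes2b, clear

* Estimate once, showing three items: count, row, and obs
svy: tabulate race diabetes, count row obs

* Redisplay the same results with a different item
svy, column format(%7.4f)
```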


Examples

. webuse nhanes2b
. svy: tabulate race diabetes
. svy: tabulate, row
. svy: tabulate race diabetes, row se ci format(%7.4f)

. webuse svy_tabopt
. svyset psuid [pweight=finalwgt], strata(stratid)
. svy: tabulate gender race, tab(income) row

. webuse nhanes2b
. gen male = (sex==1) if !missing(sex)
. svy, subpop(male): tabulate highbp sizplace, col obs pearson lr null wald

Stored results

In addition to the results documented in [SVY] svy, svy: tabulate stores the following in e():

Scalars
  e(r)            number of rows
  e(c)            number of columns
  e(cvgdeff)      coefficient of variation of generalized DEFF eigenvalues
  e(mgdeff)       mean generalized DEFF
  e(total)        weighted sum of tab() variable

  e(F_Pear)       default-corrected Pearson F
  e(F_Penl)       null-corrected Pearson F
  e(df1_Pear)     numerator d.f. for e(F_Pear)
  e(df2_Pear)     denominator d.f. for e(F_Pear)
  e(df1_Penl)     numerator d.f. for e(F_Penl)
  e(df2_Penl)     denominator d.f. for e(F_Penl)
  e(p_Pear)       p-value for e(F_Pear)
  e(p_Penl)       p-value for e(F_Penl)
  e(cun_Pear)     uncorrected Pearson chi-squared
  e(cun_Penl)     null variant uncorrected Pearson chi-squared

  e(F_LR)         default-corrected likelihood-ratio F
  e(F_LRnl)       null-corrected likelihood-ratio F
  e(df1_LR)       numerator d.f. for e(F_LR)
  e(df2_LR)       denominator d.f. for e(F_LR)
  e(df1_LRnl)     numerator d.f. for e(F_LRnl)
  e(df2_LRnl)     denominator d.f. for e(F_LRnl)
  e(p_LR)         p-value for e(F_LR)
  e(p_LRnl)       p-value for e(F_LRnl)
  e(cun_LR)       uncorrected likelihood-ratio chi-squared
  e(cun_LRnl)     null variant uncorrected likelihood-ratio chi-squared

  e(F_Wald)       adjusted "Pearson" Wald F
  e(F_LLW)        adjusted log-linear Wald F
  e(p_Wald)       p-value for e(F_Wald)
  e(p_LLW)        p-value for e(F_LLW)
  e(Fun_Wald)     unadjusted "Pearson" Wald F
  e(Fun_LLW)      unadjusted log-linear Wald F
  e(pun_Wald)     p-value for e(Fun_Wald)
  e(pun_LLW)      p-value for e(Fun_LLW)
  e(cun_Wald)     unadjusted "Pearson" Wald chi-squared
  e(cun_LLW)      unadjusted log-linear Wald chi-squared

Macros
  e(cmd)          tabulate
  e(tab)          tab() variable
  e(rowlab)       label or empty
  e(collab)       label or empty
  e(rowvlab)      row variable label
  e(colvlab)      column variable label
  e(rowvar)       varname1, the row variable
  e(colvar)       varname2, the column variable
  e(setype)       cell, count, column, or row

Matrices
  e(Prop)         matrix of cell proportions
  e(Obs)          matrix of observation counts
  e(Deff)         DEFF vector for e(setype) items
  e(Deft)         DEFT vector for e(setype) items
  e(Row)          values for row variable
  e(Col)          values for column variable
  e(V_row)        variance for row totals
  e(V_col)        variance for column totals
  e(V_srs_row)    V_srs for row totals
  e(V_srs_col)    V_srs for column totals
  e(Deff_row)     DEFF for row totals
  e(Deff_col)     DEFF for column totals
  e(Deft_row)     DEFT for row totals
  e(Deft_col)     DEFT for column totals
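
The stored results can be inspected after any call. This is a brief sketch using the nhanes2b data from the examples; which optional results are present depends on the options specified.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, row se

* List everything left behind in e()
ereturn list

* Pick out individual results
display "rows = " e(r) ", columns = " e(c)
display "se type: `e(setype)'"
matrix list e(Prop)
```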


References

Koch, G. G., D. H. Freeman Jr., and J. L. Freeman. 1975. Strategies in the multivariate analysis of data from complex surveys. International Statistical Review 43: 59-78.

Rao, J. N. K., and A. J. Scott. 1984. On chi-squared tests for multiway contingency tables with cell proportions estimated from survey data. Annals of Statistics 12: 46-60.
