Title
[SVY] svy: tabulate twoway -- Two-way tables for survey data
Syntax
Basic syntax
svy: tabulate varname1 varname2
Full syntax
svy [vcetype] [, svy_options] : tabulate varname1 varname2 [if] [in]
[, tabulate_options display_items display_options
statistic_options]
Syntax to replay results
svy [, display_items display_options statistic_options]
vcetype              description
-------------------------------------------------------------------------
SE
  linearized         Taylor linearized variance estimation
  brr                BRR variance estimation; see [SVY] svy brr
  jackknife          jackknife variance estimation; see [SVY] svy
                       jackknife
-------------------------------------------------------------------------
Specifying a vcetype overrides the default from svyset.
svy_options                description
-------------------------------------------------------------------------
if/in
  subpop([varname] [if])   identify a subpopulation
SE
  brr_options              more options allowed with BRR variance
                             estimation
  jackknife_options        more options allowed with jackknife variance
                             estimation
-------------------------------------------------------------------------
svy requires that the survey design variables be identified using svyset;
see [SVY] svyset.
See [SVY] svy postestimation for features available after estimation.
Warning: using if or in restrictions will often not produce correct
variance estimates for subpopulations. To compute estimates for
subpopulations, use the subpop() option.
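The difference between the two approaches can be seen with a minimal sketch such as the following. It reuses the nhanes2b dataset and the male indicator from the Examples section; the choice of the col and se items is only for illustration.

```stata
webuse nhanes2b, clear
generate byte male = (sex == 1) if !missing(sex)

* preferred: the full design is retained for variance estimation
svy, subpop(male): tabulate highbp sizplace, col se

* for comparison only: restricting the estimation sample with if
svy: tabulate highbp sizplace if male == 1, col se
```

The estimated proportions should agree between the two runs, whereas the standard errors may not; the subpop() results are the ones to report.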
tabulate_options       description
-------------------------------------------------------------------------
Model
  stdize(varname)      variable identifying strata for standardization
  stdweight(varname)   weight variable for standardization
  tab(varname)         variable for which to compute cell
                         totals/proportions
  missing              treat missing values like other values
-------------------------------------------------------------------------
display_items        description
-------------------------------------------------------------------------
Table items
  cell               cell proportions
  count              weighted cell counts
  column             within-column proportions
  row                within-row proportions
  se                 standard errors
  ci                 confidence intervals
  deff               display the DEFF design effects
  deft               display the DEFT design effects
  srssubpop          report design effects assuming SRS within
                       subpopulation
  obs                cell observations
-------------------------------------------------------------------------
When any of se, ci, deff, deft, or srssubpop is specified, only one of
cell, count, column, or row can be specified. If none of se, ci, deff,
deft, or srssubpop is specified, any or all of cell, count, column, and
row can be specified.
display_options      description
-------------------------------------------------------------------------
Reporting
  level(#)           set confidence level; default is level(95)
+ proportion         display proportions; the default
  percent            display percentages instead of proportions
  vertical           stack confidence interval endpoints vertically
  nomarginals        suppress row and column marginals
  nolabel            suppress displaying value labels
+ notable            suppress displaying the table
  cellwidth(#)       cell width
  csepwidth(#)       column-separation width
  stubwidth(#)       stub width
  format(%fmt)       cell format; default is format(%6.0g)
-------------------------------------------------------------------------
+ proportion and notable are not shown in the dialog box.
statistic_options    description
-------------------------------------------------------------------------
Test statistics
  pearson            Pearson's chi-squared
  lr                 likelihood ratio
  null               display null-based statistics
  wald               adjusted Wald
  llwald             adjusted log-linear Wald
  noadjust           report unadjusted Wald statistics
-------------------------------------------------------------------------
Menu
Statistics > Survey data analysis > Tables > Two-way tables
Description
svy: tabulate produces two-way tabulations with tests of independence for
complex survey data. See [SVY] svy: tabulate oneway for one-way
tabulations for complex survey data.
Options
svy_options; see [SVY] svy.
+-------+
----+ Model +------------------------------------------------------------
stdize(varname) specifies that the point estimates be adjusted by direct
standardization across the strata identified by varname. This option
requires the stdweight() option.
stdweight(varname) specifies the weight variable associated with the
standard strata identified in the stdize() option. The
standardization weights must be constant within the standard strata.
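As a rough sketch of how stdize() and stdweight() fit together, the following standardizes across the two categories of sex in nhanes2b. The weight variable stdw and the shares 0.48 and 0.52 are hypothetical values made up for illustration; they are not taken from any reference population.

```stata
webuse nhanes2b, clear

* hypothetical standard-population shares, constant within each sex stratum
generate double stdw = cond(sex == 1, 0.48, 0.52) if !missing(sex)

* cell proportions adjusted by direct standardization across sex
svy: tabulate race diabetes, stdize(sex) stdweight(stdw)
```

In practice the standard weights would come from the reference population being standardized to, and the standard strata would usually be finer than a single two-level variable.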
tab(varname) specifies that counts be cell totals of this variable and
that proportions (or percentages) be relative to (i.e., weighted by)
this variable. For example, if this variable denotes income, then the
cell "counts" are instead totals of income for each cell, and the
cell proportions are proportions of income for each cell.
missing specifies that missing values of varname1 and varname2 be treated
as another row or column category rather than be omitted from the
analysis (the default).
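The tab() and missing options can be tried together with a short sketch like the one below. It uses the svy_tabopt dataset and svyset command from the Examples section; the %12.0fc format is an arbitrary choice for displaying totals.

```stata
webuse svy_tabopt, clear
svyset psuid [pweight=finalwgt], strata(stratid)

* cell "counts" are weighted totals of income rather than weighted counts
svy: tabulate gender race, tab(income) count format(%12.0fc)

* overall weighted sum of the tab() variable
display %12.0fc e(total)

* same table, with missing values of gender or race kept as categories
svy: tabulate gender race, tab(income) count missing format(%12.0fc)
```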
+-------------+
----+ Table items +------------------------------------------------------
cell requests that cell proportions (or percentages) be displayed. This
is the default if none of count, row, or column is specified.
count requests that weighted cell counts be displayed.
column or row requests that column or row proportions (or percentages) be
displayed.
se requests that the standard errors of cell proportions (the default),
weighted counts, or row or column proportions be displayed. When se
(or ci, deff, or deft) is specified, only one of cell, count, row, or
column can be selected. The standard error computed is the standard
error of the one selected.
ci requests confidence intervals for cell proportions, weighted counts,
or row or column proportions. The confidence intervals are
constructed using a logit transform so that their endpoints always
lie between 0 and 1.
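The idea behind a logit-transformed interval can be sketched by hand. The values of p, se, and df below are hypothetical inputs, not results from any dataset, and the calculation is an illustration of the general technique rather than a statement of the exact internal computation.

```stata
* hypothetical inputs
scalar p  = 0.10                 // estimated proportion
scalar se = 0.02                 // its standard error
scalar df = 31                   // design degrees of freedom

* work in the logit metric, where the interval is symmetric
scalar t   = invttail(df, 0.025) // critical value for a 95% interval
scalar f   = logit(p)
scalar sef = se/(p*(1 - p))      // delta-method standard error of logit(p)

* transform the endpoints back to the proportion metric
scalar lb = invlogit(f - t*sef)
scalar ub = invlogit(f + t*sef)
display "95% CI: [" lb ", " ub "]"
```

Because invlogit() maps any real number into the unit interval, the back-transformed endpoints cannot fall outside 0 and 1, and the interval is asymmetric around p.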
deff and deft request that the design-effect measures DEFF and DEFT be
displayed for each cell proportion, count, or row or column
proportion. See [SVY] estat for details.
The deff and deft options are not allowed with estimation results
that used direct standardization or poststratification.
srssubpop requests that DEFF and DEFT be computed using an estimate of
SRS (simple random sampling) variance for sampling within a
subpopulation. By default, DEFF and DEFT are computed using an
estimate of the SRS variance for sampling from the entire population.
Typically, srssubpop would be given when computing subpopulation
estimates by strata or by groups of strata.
obs requests that the number of observations for each cell be displayed.
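The rules for combining table items might be exercised as follows. This is a sketch using nhanes2b and the male indicator from the Examples section; the particular item combinations are chosen only to illustrate the restrictions described above.

```stata
webuse nhanes2b, clear
generate byte male = (sex == 1) if !missing(sex)

* one base item (row) together with its standard errors and intervals
svy: tabulate race diabetes, row se ci

* without se, ci, deff, deft, or srssubpop, base items can be combined
svy: tabulate race diabetes, cell count obs

* design effects measured against SRS within the subpopulation
svy, subpop(male): tabulate highbp sizplace, col deff deft srssubpop
```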
+-----------+
----+ Reporting +--------------------------------------------------------
level(#) specifies the confidence level, as a percentage, for confidence
intervals. The default is level(95) or as set by set level.
proportion, the default, requests that proportions be displayed.
percent requests that percentages be displayed instead of proportions.
vertical requests that the endpoints of the confidence intervals be
stacked vertically on display.
nomarginals requests that row and column marginals not be displayed.
nolabel requests that variable labels and value labels be ignored.
notable prevents the header and table from being displayed in the output.
When specified, only the results of the requested test statistics are
displayed. This option may not be specified with any other option in
display_options except the level() option.
cellwidth(#), csepwidth(#), and stubwidth(#) specify widths of table
elements in the output; see [P] tabdisp. Acceptable values for the
stubwidth() option range from 4 to 32.
format(%fmt) specifies a format for the items in the table. The default
is format(%6.0g). See [U] 12.5 Formats: Controlling how data are
displayed.
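A few of the reporting options are combined in the sketch below, again on nhanes2b. The widths and the %7.2f format are arbitrary values picked for illustration.

```stata
webuse nhanes2b, clear

* row percentages with vertically stacked confidence limits
svy: tabulate race diabetes, row ci percent vertical format(%7.2f)

* no marginals, wider stub and cells
svy: tabulate race diabetes, nomarginals stubwidth(12) cellwidth(10)

* test statistics only; the header and table are suppressed
svy: tabulate race diabetes, notable pearson lr
```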
+-----------------+
----+ Test statistics +--------------------------------------------------
pearson requests that the Pearson chi-squared statistic be computed. By
default, this is the test of independence that is displayed. The
Pearson chi-squared statistic is corrected for the survey design with
the second-order correction of Rao and Scott (1984) and is converted
into an F statistic. One term in the correction formula can be
calculated using either observed cell proportions or proportions
under the null hypothesis (i.e., the product of the marginals). By
default, observed cell proportions are used. If the null option is
selected, then a statistic corrected using proportions under the null
hypothesis is displayed as well.
lr requests that the likelihood-ratio test statistic for proportions be
computed. This statistic is not defined when there are one or more
zero cells in the table. The statistic is corrected for the survey
design by using the same correction procedure that is used with the
pearson statistic. Again, either observed cell proportions or
proportions under the null can be used in the correction formula. By
default, the former is used; specifying the null option gives both
the former and the latter. Neither variant of this statistic is
recommended for sparse tables. For nonsparse tables, the lr
statistics are very similar to the corresponding pearson statistics.
null modifies the pearson and lr options only. If null is specified, two
corrected statistics are displayed. The statistic labeled "D-B
(null)" ("D-B" stands for design-based) uses proportions under the
null hypothesis (i.e., the product of the marginals) in the Rao and
Scott (1984) correction. The statistic labeled merely "Design-based"
uses observed cell proportions. If null is not specified, only the
correction that uses observed proportions is displayed.
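To compare the two corrections side by side, something like the following could be used. The e() names are those listed under Saved results; the labels in the display commands are only descriptive text.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, pearson lr null

* corrected using observed cell proportions
display "Pearson F (observed) = " e(F_Pear) "  p = " e(p_Pear)
display "LR F      (observed) = " e(F_LR)   "  p = " e(p_LR)

* corrected using proportions under the null hypothesis
display "Pearson F (null)     = " e(F_Penl) "  p = " e(p_Penl)
display "LR F      (null)     = " e(F_LRnl) "  p = " e(p_LRnl)
```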
wald requests a Wald test of whether observed weighted counts equal the
product of the marginals (Koch, Freeman, and Freeman 1975). By
default, an adjusted F statistic is produced; an unadjusted statistic
can be produced by specifying noadjust. The unadjusted F statistic
can yield extremely anticonservative p-values (i.e., p-values that
are too small) when the degrees of freedom of the variance estimates
(the number of PSUs minus the number of strata) are small relative to
the (R-1)(C-1) degrees of freedom of the table (where R is the number
of rows and C is the number of columns). Hence, the statistic
produced by wald and noadjust should not be used for inference unless
it is essentially identical to the adjusted statistic.
This option must be specified when the command is first run if the
statistic is to be reported when svy later replays the results.
llwald requests a Wald test of the log-linear model of independence
(Koch, Freeman, and Freeman 1975). The statistic is not defined when
there are one or more zero cells in the table. The adjusted
statistic (the default) can produce anticonservative p-values,
especially for sparse tables, when the degrees of freedom of the
variance estimates are small relative to the degrees of freedom of
the table. Specifying noadjust yields a statistic with more severe
problems. Neither the adjusted nor the unadjusted statistic is
recommended for inference; the statistics are made available only for
pedagogical purposes.
noadjust modifies the wald and llwald options only. It requests that an
unadjusted F statistic be displayed in addition to the adjusted
statistic.
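One way to check whether the unadjusted Wald statistic is close to the adjusted one is to request both and compare the saved results, as in this sketch on nhanes2b. The e() names come from the Saved results section.

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes, wald noadjust

display "adjusted F   = " e(F_Wald)   "  p = " e(p_Wald)
display "unadjusted F = " e(Fun_Wald) "  p = " e(pun_Wald)
```

If the two F statistics differ noticeably, the adjusted statistic is the one to rely on.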
svy: tabulate uses the tabdisp command (see [P] tabdisp) to produce the
table. Only five items can be displayed in the table at one time. The ci
option implies two items. If too many items are selected, a warning will
appear immediately. To view more items, redisplay the table while
specifying different options.
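When more items are wanted than fit in one table, they can be split across two displays. The sketch below simply reruns the command with a different set of items; the split shown is arbitrary.

```stata
webuse nhanes2b, clear

* first display: row proportions, standard errors, and confidence
* intervals (ci counts as two items)
svy: tabulate race diabetes, row se ci

* second display: same table with design effects and cell observations
svy: tabulate race diabetes, row deff deft obs
```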
Examples
. webuse nhanes2b
. svy: tabulate race diabetes
. svy: tabulate, row
. svy: tabulate race diabetes, row se ci format(%7.4f)
. webuse svy_tabopt, clear
. svyset psuid [pweight=finalwgt], strata(stratid)
. svy: tabulate gender race, tab(income) row
. webuse nhanes2b, clear
. gen male = (sex==1) if !missing(sex)
. svy, subpop(male): tabulate highbp sizplace, col obs pearson lr null wald
Saved results
In addition to the results documented in [SVY] svy, svy: tabulate saves
the following in e():
Scalars
e(r) number of rows
e(c) number of columns
e(cvgdeff) c.v. of generalized DEFF eigenvalues
e(mgdeff) mean generalized DEFF
e(total) weighted sum of tab() variable
e(F_Pear) default-corrected Pearson F
e(F_Penl) null-corrected Pearson F
e(df1_Pear) numerator d.f. for e(F_Pear)
e(df2_Pear) denominator d.f. for e(F_Pear)
e(df1_Penl) numerator d.f. for e(F_Penl)
e(df2_Penl) denominator d.f. for e(F_Penl)
e(p_Pear) p-value for e(F_Pear)
e(p_Penl) p-value for e(F_Penl)
e(cun_Pear) uncorrected Pearson chi-squared
e(cun_Penl) null variant uncorrected Pearson chi-squared
e(F_LR) default-corrected likelihood-ratio F
e(F_LRnl) null-corrected likelihood-ratio F
e(df1_LR) numerator d.f. for e(F_LR)
e(df2_LR) denominator d.f. for e(F_LR)
e(df1_LRnl) numerator d.f. for e(F_LRnl)
e(df2_LRnl) denominator d.f. for e(F_LRnl)
e(p_LR) p-value for e(F_LR)
e(p_LRnl) p-value for e(F_LRnl)
e(cun_LR) uncorrected likelihood-ratio chi-squared
e(cun_LRnl) null variant uncorrected likelihood-ratio chi-squared
e(F_Wald) adjusted "Pearson" Wald F
e(F_LLW) adjusted log-linear Wald F
e(p_Wald) p-value for e(F_Wald)
e(p_LLW) p-value for e(F_LLW)
e(Fun_Wald) unadjusted "Pearson" Wald F
e(Fun_LLW) unadjusted log-linear Wald F
e(pun_Wald) p-value for e(Fun_Wald)
e(pun_LLW) p-value for e(Fun_LLW)
e(cun_Wald) unadjusted "Pearson" Wald chi-squared
e(cun_LLW) unadjusted log-linear Wald chi-squared
Macros
e(cmd) tabulate
e(tab) tab() variable
e(rowlab) label or empty
e(collab) label or empty
e(rowvlab) row variable label
e(colvlab) column variable label
e(rowvar) varname1, the row variable
e(colvar) varname2, the column variable
e(setype) cell, count, column, or row
Matrices
e(Prop) matrix of cell proportions
e(Obs) matrix of observation counts
e(Deff) DEFF vector for e(setype) items
e(Deft) DEFT vector for e(setype) items
e(Row) values for row variable
e(Col) values for column variable
e(V_row) variance for row totals
e(V_col) variance for column totals
e(V_srs_row) V_srs for row totals
e(V_srs_col) V_srs for column totals
e(Deff_row) DEFF for row totals
e(Deff_col) DEFF for column totals
e(Deft_row) DEFT for row totals
e(Deft_col) DEFT for column totals
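The saved results can be inspected after estimation with a sketch such as the one below. As a consistency check, it recomputes the p-value of the default Pearson test from the saved F statistic and degrees of freedom by using Ftail(); the recomputed value should match e(p_Pear).

```stata
webuse nhanes2b, clear
svy: tabulate race diabetes

* table dimensions and variable names
display e(r) " rows by " e(c) " columns"
display "row variable: `e(rowvar)'   column variable: `e(colvar)'"

* cell proportions and unweighted observation counts
matrix list e(Prop)
matrix list e(Obs)

* default-corrected Pearson test and a recomputed p-value
display "F = " e(F_Pear) "  p = " e(p_Pear)
display "recomputed p = " Ftail(e(df1_Pear), e(df2_Pear), e(F_Pear))
```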
References
Koch, G. G., D. H. Freeman Jr., and J. L. Freeman. 1975. Strategies in
the multivariate analysis of data from complex surveys.
International Statistical Review 43: 59-78.
Rao, J. N. K., and A. J. Scott. 1984. On chi-squared tests for multiway
contingency tables with cell proportions estimated from survey data.
Annals of Statistics 12: 46-60.
Also see
Manual: [SVY] svy: tabulate twoway
Help: [SVY] svy postestimation; [SVY] svy: tabulate oneway,
[SVY] svydescribe, [R] tabulate twoway, [R] test