Sometimes, we would like to test for the association of two categorical variables, but we do not have the raw data. We can use a summary table to calculate many tests of association using Stata's tabi command.
The command is followed by the cell counts for the first row, with the columns separated by spaces. We begin each new row with a backslash (\), followed by the cell counts for that row, and so forth.
The example below creates a 3x2 table along with the marginal counts and computes a Pearson chi-squared statistic by default.
. tabi 280 14 \ 150 16 \ 59 14
           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
         2 |       150         16 |       166
         3 |        59         14 |        73
-----------+----------------------+----------
     Total |       489         44 |       533
The output displays the table, Pearson's chi-squared statistic with two degrees of freedom (16.66), and the p-value (0.000).
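The statistic and its p-value can also be retrieved after the command runs. The lines below are a minimal sketch, assuming tabi leaves its results in r() under the names documented for tabulate twoway (r(chi2), r(r), r(c)); the scalar names pearson and dof are illustrative and not part of Stata.

```stata
* Run the test with the chi2 option spelled out, then list the stored results
tabi 280 14 \ 150 16 \ 59 14, chi2
return list

* Copy the results before another command can overwrite r()
scalar pearson = r(chi2)
scalar dof     = (r(r) - 1) * (r(c) - 1)

* Upper-tail probability of the chi-squared distribution
display "chi2(" dof ") = " %6.2f pearson "   p = " %8.6f chi2tail(dof, pearson)
```

Printing the p-value with more decimal places shows how small it is, which the rounded 0.000 in the default output hides.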
There are many options we can use to display other statistics and cell contents. For example, Pearson's chi-squared statistic is calculated using the expected count in each cell. We could calculate the expected count for cell [1,1] by multiplying the first row total (294) by the first column total (489) and dividing by the table total (533).
. display (294*489)/533
269.72983
Or we could use the expected option to calculate the expected cell count for each cell.
. tabi 280 14 \ 150 16 \ 59 14, expected
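+--------------------+
| Key                |
|--------------------|
|     frequency      |
| expected frequency |
+--------------------+

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
           |     269.7       24.3 |     294.0
-----------+----------------------+----------
         2 |       150         16 |       166
           |     152.3       13.7 |     166.0
-----------+----------------------+----------
         3 |        59         14 |        73
           |      67.0        6.0 |      73.0
-----------+----------------------+----------
     Total |       489         44 |       533
           |     489.0       44.0 |     533.0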
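The expected counts can also be reproduced by hand for every cell. The loop below is a minimal sketch that applies the same (row total * column total) / table total calculation to each pair of margins; the margins are typed in directly from the table above, and the loop macro names r and c are illustrative.

```stata
* Expected count = (row total * column total) / table total
foreach r of numlist 294 166 73 {
    foreach c of numlist 489 44 {
        display "row total `r', column total `c': " %6.1f (`r'*`c'/533)
    }
}
```

Each displayed value should agree with the second line of the corresponding cell in the table.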
The observed and expected cell counts can be used to calculate the contribution of each cell to the chi-squared statistic. The calculation for cell [1,1] is
. display ((280 - 269.7)^2)/269.7
.393363
Or we could use the cchi2 option to calculate the contribution of each cell and use the chi2 option to calculate the total chi-squared statistic.
. tabi 280 14 \ 150 16 \ 59 14, expected cchi2 chi2
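+--------------------+
| Key                |
|--------------------|
|     frequency      |
| expected frequency |
| chi2 contribution  |
+--------------------+

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
           |     269.7       24.3 |     294.0
           |       0.4        4.3 |       4.7
-----------+----------------------+----------
         2 |       150         16 |       166
           |     152.3       13.7 |     166.0
           |       0.0        0.4 |       0.4
-----------+----------------------+----------
         3 |        59         14 |        73
           |      67.0        6.0 |      73.0
           |       0.9       10.6 |      11.5
-----------+----------------------+----------
     Total |       489         44 |       533
           |     489.0       44.0 |     533.0
           |       1.4       15.3 |      16.7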
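The whole calculation, from observed counts to the total statistic, can be checked in a few lines of Mata. This is a sketch of the textbook Pearson formula rather than a description of how tabi is implemented; the names O, E, contrib, and pearson are illustrative.

```stata
mata:
// Observed counts, entered row by row as in the tabi command
O = (280, 14 \ 150, 16 \ 59, 14)

// Expected counts: product of the row and column totals over the table total
E = rowsum(O) * colsum(O) / sum(O)

// Per-cell contributions and their sum, the Pearson chi-squared statistic
contrib = (O - E):^2 :/ E
pearson = sum(contrib)

contrib
pearson

// p-value from the upper tail with (rows - 1) * (columns - 1) degrees of freedom
chi2tail((rows(O) - 1) * (cols(O) - 1), pearson)
end
```

Because E is kept at full precision here, the contribution for cell [1,1] comes out slightly below the .393363 obtained from the rounded expected count 269.7.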
There are options for computing other tests and measures of association such as Fisher's exact test (exact), the likelihood-ratio chi-squared (lrchi2), Goodman and Kruskal's gamma (gamma), Kendall's tau-b (taub), and Cramér's V (V).
. tabi 280 14 \ 150 16 \ 59 14, exact gamma lrchi2 taub V

Enumerating sample-space combinations:
stage 3: enumerations = 1
stage 2: enumerations = 16
stage 1: enumerations = 0

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
         2 |       150         16 |       166
         3 |        59         14 |        73
-----------+----------------------+----------
     Total |       489         44 |       533
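To see what two of these measures compute, the likelihood-ratio chi-squared and Cramér's V can be worked out from the observed and expected counts. The Mata sketch below uses the standard textbook formulas, G2 = 2 * sum(O * ln(O/E)) and V = sqrt(chi2 / (N * min(rows - 1, columns - 1))); it assumes no cell count is zero, and the names O, E, G2, pearson, and V are illustrative.

```stata
mata:
// Observed and expected counts for the same 3x2 table
O = (280, 14 \ 150, 16 \ 59, 14)
E = rowsum(O) * colsum(O) / sum(O)

// Likelihood-ratio chi-squared (assumes every observed count is positive)
G2 = 2 * sum(O :* ln(O :/ E))

// Cramér's V, built from the Pearson chi-squared statistic
pearson = sum((O - E):^2 :/ E)
V = sqrt(pearson / (sum(O) * min((rows(O) - 1, cols(O) - 1))))

G2
V
end
```

Both values can be compared with the figures that the lrchi2 and V options report.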
You can also report each cell's count as a percentage of the table total,
. tabi 280 14 \ 150 16 \ 59 14, cell
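+-----------------+
| Key             |
|-----------------|
|    frequency    |
| cell percentage |
+-----------------+

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
           |     52.53       2.63 |     55.16
-----------+----------------------+----------
         2 |       150         16 |       166
           |     28.14       3.00 |     31.14
-----------+----------------------+----------
         3 |        59         14 |        73
           |     11.07       2.63 |     13.70
-----------+----------------------+----------
     Total |       489         44 |       533
           |     91.74       8.26 |    100.00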
within-column relative frequencies,
. tabi 280 14 \ 150 16 \ 59 14, column
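+-------------------+
| Key               |
|-------------------|
|     frequency     |
| column percentage |
+-------------------+

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
           |     57.26      31.82 |     55.16
-----------+----------------------+----------
         2 |       150         16 |       166
           |     30.67      36.36 |     31.14
-----------+----------------------+----------
         3 |        59         14 |        73
           |     12.07      31.82 |     13.70
-----------+----------------------+----------
     Total |       489         44 |       533
           |    100.00     100.00 |    100.00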
or within-row relative frequencies,
. tabi 280 14 \ 150 16 \ 59 14, row
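+----------------+
| Key            |
|----------------|
|   frequency    |
| row percentage |
+----------------+

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |       280         14 |       294
           |     95.24       4.76 |    100.00
-----------+----------------------+----------
         2 |       150         16 |       166
           |     90.36       9.64 |    100.00
-----------+----------------------+----------
         3 |        59         14 |        73
           |     80.82      19.18 |    100.00
-----------+----------------------+----------
     Total |       489         44 |       533
           |     91.74       8.26 |    100.00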
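The three kinds of percentages differ only in the denominator: the table total, the column total, or the row total. The Mata lines below are a minimal sketch of that idea for the body of the table (the margins are not reproduced); the matrix name O is illustrative.

```stata
mata:
// Observed counts, entered row by row as in the tabi command
O = (280, 14 \ 150, 16 \ 59, 14)

100 * O / sum(O)          // cell percentages: divide by the table total
100 * O :/ colsum(O)      // column percentages: divide by each column total
100 * O :/ rowsum(O)      // row percentages: divide by each row total
end
```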
A video demonstration of these commands is available on YouTube.
Read more in the Stata Base Reference Manual; see [R] tabulate twoway.