Stata The Stata listserver

st: RE: The linguistics of -tabulate var1 var2, row-


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: The linguistics of -tabulate var1 var2, row-
Date   Mon, 22 Mar 2004 10:33:19 -0000

I can't see a strong case for disturbing long-standing 
code here. 

In essence, when you ask for -row- the row percents 
add up to 100 and the row labelled "Total" contains percents 
derived from column totals. And that seems 
consistent with established tabulation traditions. 
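
A minimal sketch of that point, assuming the gss1991 dataset used in 
the quoted message below is still reachable at that URL. The second 
-tabulate- is an illustrative check, not something from the original 
exchange: its percent column should reproduce the "Total" row of the 
row-percent table, because that row is computed from the column totals. 

```stata
* Sketch only: assumes the dataset URL from the quoted message still resolves
use http://www.stata-press.com/data/r8/gss1991.dta, clear

* Two-way table with row percentages, as in the original question
tabulate region sex, row nofreq

* One-way table of sex on the same observations (region nonmissing);
* its percentages correspond to the "Total" row of the table above
tabulate sex if !missing(region)
```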

I agree that for specific purposes one might want 
different wording, but sensible wording is hard 
to automate, and far from "All" being a satisfactory 
general alternative, it seems no less ambiguous to me. 

I suggest that this kind of personal preference problem is best 
tackled downstream, say by automating an edit of .smcl output 
from a series of tabulations. Or you could look at -tabdisp- 
and its options. 

Nick 
[email protected] 

Renzo Comolli

> This is a minor linguistic point that I bring up only because 
> it has come up so many times when people read my tables.
> The question is conceptually so obvious that it is difficult 
> even to explain where the problem arises.
> When you create a two-way tabulation and display it as row 
> percentages, of course it does not sum to 100% by column. 
> Nonetheless, perhaps out of mental laziness, when someone sees 
> the word "Total" he or she wants to see it sum to 100%.
> See the example below, in which the last line says "Total".
> 
> . use http://www.stata-press.com/data/r8/gss1991.dta
> . tab region sex, row nofreq
> 
>  region of |
> the united |   respondent's sex
>     states |      male     female |     Total
> -----------+----------------------+----------
> north east |     41.60      58.40 |    100.00
> south east |     42.82      57.18 |    100.00
>       west |     42.14      57.86 |    100.00
> -----------+----------------------+----------
>      Total |     42.09      57.91 |    100.00
> 
> Perhaps it would be better if -tabulate- said "All" in the 
> last row when the
> option -,row- is selected.
> e.g. 
> 
>  region of |
> the united |   respondent's sex
>     states |      male     female |     Total
> -----------+----------------------+----------
> north east |     41.60      58.40 |    100.00
> south east |     42.82      57.18 |    100.00
>       west |     42.14      57.86 |    100.00
> -----------+----------------------+----------
>        All |     42.09      57.91 |    100.00
> 
> This point is very minor, and I put it out more as a curiosity 
> about people's minds than as a suggestion for a change in the code.
> Personally, I have found it useful to put in my own description, 
> so I would write "U.S." in this example.
> 
> 
> Of course the same could be said for -tabulate var1 var2, 
> column-, but for inscrutable reasons having to do with the human 
> mind, people are not as confused in the -column- case.
