
st: RE: The linguistics of -tabulate var1 var2, row-


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: The linguistics of -tabulate var1 var2, row-
Date   Mon, 22 Mar 2004 10:33:19 -0000

I can't see a strong case for disturbing long-standing 
code here. 

In essence, when you ask for -row- the row percents 
add up to 100 and the row labelled "Total" contains percents 
derived from column totals. And that seems 
consistent with established tabulation traditions. 

I agree that for specific purposes one might want 
different wording, but sensible wording is hard 
to automate, and far from "All" being a satisfactory 
general alternative, it seems no less ambiguous to me. 

I suggest that this kind of personal preference problem is best 
tackled downstream, say by automating an edit of .smcl output 
from a series of tabulations. Or you could look at -tabdisp- 
and its options. 
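
As a rough sketch of what such a downstream edit might look like, the lines 
below log the table and then swap the row label. This assumes a plain-text 
log rather than .smcl (SMCL markup would need a different search pattern), 
the hypothetical file names tab_row.log and tab_row_all.log, and the 
table layout shown in the example quoted below. It is an illustration, 
not a tested recipe.

```stata
* Assumption: a plain-text log, so the table appears literally in the file.
* The file names tab_row.log and tab_row_all.log are hypothetical.
use http://www.stata-press.com/data/r8/gss1991.dta, clear
log using tab_row.log, text replace
tabulate region sex, row nofreq
log close

* Swap the row label "Total" for "All", padding to keep the same width.
* The column header "Total" is not followed by " |", so it is left alone.
filefilter tab_row.log tab_row_all.log, from("     Total |") to("       All |") replace
type tab_row_all.log
```

The search pattern depends on the width of the row-label column, so it 
would need adjusting for tables laid out differently.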

Nick 
n.j.cox@durham.ac.uk 

Renzo Comolli

> This is a minor linguistic point that I bring up only because it has
> come up so many times when people read my tables.
> The question is conceptually so obvious that it is difficult even to
> explain where the problem arises.
> When you create a two-way tabulation and display it as row percentages,
> of course the columns do not sum to 100%. Nonetheless, perhaps out of
> mental laziness, once someone sees the word "Total" he or she expects
> the figures above it to sum to 100%.
> See the example below, in which the last row is labelled "Total".
>  
> . use http://www.stata-press.com/data/r8/gss1991.dta
> . tab region sex, row nofreq
>  
>  region of |
> the united |   respondent's sex
>     states |      male     female |     Total
> -----------+----------------------+----------
> north east |     41.60      58.40 |    100.00 
> south east |     42.82      57.18 |    100.00 
>       west |     42.14      57.86 |    100.00 
> -----------+----------------------+----------
>      Total |     42.09      57.91 |    100.00
>  
> Perhaps it would be better if -tabulate- said "All" in the last row
> when the option -, row- is selected, e.g.
>  
>  region of |
> the united |   respondent's sex
>     states |      male     female |     Total
> -----------+----------------------+----------
> north east |     41.60      58.40 |    100.00 
> south east |     42.82      57.18 |    100.00 
>       west |     42.14      57.86 |    100.00 
> -----------+----------------------+----------
>        All |     42.09      57.91 |    100.00
>  
> This point is very minor, and I raise it more as a curiosity about
> people's minds than as a suggestion for a change in the code.
> Personally, I find it useful to put in my own description, so I would
> write "U.S." in this example.
> 
> Of course the same could be said for -tabulate var1 var2, column-, but
> for inscrutable reasons having to do with the human mind, people are
> not as confused in the -, column- case.
