Statalist The Stata Listserver



Re: st: Wishlist Table command with standard error


From   n j cox <n.j.cox@durham.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Wishlist Table command with standard error
Date   Thu, 24 May 2007 14:35:55 +0100

Garry's request is reasonable, and joins the list of several
hundred (guess) equally reasonable requests of the same status.

It's a little trickier than you might guess. -table- is
really a front end for other programs that do most of the
work, in this case -collapse- and -tabdisp-. As I understand it,
you would need to hit -collapse- as well as -table-. In principle,
a competent user-programmer could clone these programs and make
the changes needed. So, in principle, no one need wait for
StataCorp to agree and then get round to doing it: it could
be done today. You would only need to change a few lines of code;
finding those lines among several hundred is the real work.
In practice, another way seems more desirable.
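
To make the front-end point concrete, here is a minimal sketch of the same division of labour done by hand: -collapse- reduces the data to one observation per cell, and -tabdisp- shows the result. This is an illustration under my own assumptions, not the code inside -table-; the variable names n, mean, sd and se are just choices for the example.

```stata
* Sketch only: one observation per cell, then tabulate.
* Note that -collapse- replaces the data in memory.
sysuse auto, clear

* count, mean and sd of mpg within each foreign x rep78 cell
* (missing rep78 forms a cell of its own)
collapse (count) n=mpg (mean) mean=mpg (sd) sd=mpg, by(foreign rep78)

* standard error of the mean; missing where a cell has one observation
generate se = sd / sqrt(n)

tabdisp foreign rep78, cellvar(n mean se)
```

If the original data are still needed afterwards, wrapping this in -preserve- and -restore- is one way to get them back.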

Svend's reminder of -tabstat-'s flexibility points to one good
solution, but it won't extend so well to more complicated examples
than he gives, even two-way tables.

There is at least one other way that is much easier. Backtracking,
the problem has two parts: calculating summary measures and then
tabulating them. We can do all this with standard commands from
official Stata.

. sysuse auto, clear
(1978 Automobile Data)

The first trick is to use -egen- functions to get the summary
statistics. (Those who know -egenmore- on SSC will know that
it has an -semean()- function, but we can get there without
it.)

. egen n = count(mpg), by(foreign rep78)

. egen mean = mean(mpg), by(foreign rep78)

. egen sd = sd(mpg), by(foreign rep78)
(1 missing value generated)

. gen se = sd / sqrt(n)
(1 missing value generated)

Naturally, all these functions are smart about missing values.
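
A quick check of that claim, as a sketch: list the cells where the standard error is missing, and recompute one cell directly with -summarize- to compare against the -egen- result. The choice of the Domestic, rep78 == 3 cell is arbitrary.

```stata
sysuse auto, clear
egen n = count(mpg), by(foreign rep78)
egen sd = sd(mpg), by(foreign rep78)
generate se = sd / sqrt(n)

* se should be missing only where a cell holds a single car,
* because no sd can be computed from one observation
list foreign rep78 n sd se if missing(se)

* recompute one cell from scratch; the result should agree with se
* for Domestic cars with rep78 == 3
summarize mpg if foreign == 0 & rep78 == 3
display r(sd) / sqrt(r(N))
```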

Now we fire up -tabdisp- to do the table. -tabdisp- is billed
as a programmer's command, but it is fine for many interactive
tasks. The one thing to grasp is that all it does is tabulate,
nicely: what you want shown must be worked out upstream. We
can get two-way tables (and more):

. tabdisp foreign rep78, c(n mean se)

----------------------------------------------------------------------
          |                    Repair Record 1978
 Car type |        1         2         3         4         5         .
----------+-----------------------------------------------------------
 Domestic |        2         8        27         9         2         4
          |       21    19.125        19  18.44444        32     23.25
          |        3  1.328768  .7862783  1.528535         2  1.701715
          |
  Foreign |                            3         9         9         1
          |                     23.33333  24.88889  26.33333        14
          |                     1.452966  .9043789  3.122499
----------------------------------------------------------------------

Clearly, the number of decimal places is dopey. Add a -format()-
(even 2 d.p. is probably silly):

. tabdisp foreign rep78, c(n mean se) format(%9.2f)

----------------------------------------------------
          |           Repair Record 1978
 Car type |     1      2      3      4      5      .
----------+-----------------------------------------
 Domestic |  2.00   8.00  27.00   9.00   2.00   4.00
          | 21.00  19.13  19.00  18.44  32.00  23.25
          |  3.00   1.33   0.79   1.53   2.00   1.70
          |
  Foreign |                3.00   9.00   9.00   1.00
          |               23.33  24.89  26.33  14.00
          |                1.45   0.90   3.12
----------------------------------------------------

Now the format is wrong for the counts. There is an
easy work-around. Make the -n- variable a string; then
it will be immune to any incantations of numeric format.

. tostring n, replace
n was float now str2

. tabdisp foreign rep78, c(n mean se) format(%9.2f)

----------------------------------------------------
          |           Repair Record 1978
 Car type |     1      2      3      4      5      .
----------+-----------------------------------------
 Domestic |     2      8     27      9      2      4
          | 21.00  19.13  19.00  18.44  32.00  23.25
          |  3.00   1.33   0.79   1.53   2.00   1.70
          |
  Foreign |                   3      9      9      1
          |               23.33  24.89  26.33  14.00
          |                1.45   0.90   3.12
----------------------------------------------------

That's pretty much what -table- would give you. (In
fact, nicer.) Naturally you can leave out the column for
missing -rep78- if you wish.
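
One way to do that, sketched here from a fresh copy of the data, is an -if- qualifier on -tabdisp-, so that only cars with a recorded repair record are tabulated:

```stata
sysuse auto, clear
egen n = count(mpg), by(foreign rep78)
egen mean = mean(mpg), by(foreign rep78)
egen sd = sd(mpg), by(foreign rep78)
generate se = sd / sqrt(n)

* string counts are immune to the numeric format below
tostring n, replace

* exclude observations with missing rep78 from the table
tabdisp foreign rep78 if !missing(rep78), cellvar(n mean se) format(%9.2f)
```

The summary variables are computed before the restriction, but since each is calculated within its own foreign by rep78 cell, the remaining cells are unaffected.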

There is some discussion of -tabdisp- in my

2003. Speaking Stata: Problems with tables, Part I.
Stata Journal 3(3): 309-324.

Nick
n.j.cox@durham.ac.uk

Svend Juul
==========

Use -tabstat-; it is quite flexible:

. tabstat price, stat(n mean sem) by(foreign) col(stat)

Summary for variables: price
by categories of: foreign (Car type)

 foreign |         N      mean  se(mean)
---------+------------------------------
Domestic |        52  6072.423  429.4911
 Foreign |        22  6384.682  558.9942
---------+------------------------------
   Total |        74  6165.257  342.8719
----------------------------------------

Garry Anderson
==============

The table command does not accept se or sem (the standard error of the
mean) in the contents(clist) option. I think this would be very useful
because a summary of a variable often includes n mean and sem.