Re: st: RE: Graphing a categorical variable: simple bar chart.


From   Ronan Conroy <rconroy@rcsi.ie>
To   "statalist hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: RE: Graphing a categorical variable: simple bar chart.
Date   Tue, 18 Nov 2003 10:38:27 -0000

on 17/11/2003 18:07, Nick Cox at n.j.cox@durham.ac.uk wrote:

> Ernest Berkhout has already answered
> "histogram", and that's probably the best
> short answer. 

One of the things that Stata won't do for you, which is a terrible lesson to
anyone who tries it, is allow you to see what happens as you change
* the bin width and
* the start value
for a histogram. I give a pretty disconcerting demonstration with JMP (which
I use for EDA) in which I do each of these things in turn, with the
histogram redrawing in real time. The result is that you can make a
histogram change shape utterly and in a totally arbitrary way.
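
To make the point concrete without JMP, here is a minimal sketch in plain Python (not Stata). The data are made up for illustration, and histogram_counts is a hypothetical helper, not part of any package. It bins the same twelve values with different widths and start values and prints a crude text histogram for each choice.

```python
import math

def histogram_counts(values, width, start):
    """Count values falling in bins [start + k*width, start + (k+1)*width)."""
    counts = {}
    for v in values:
        k = math.floor((v - start) / width)  # index of the bin holding v
        counts[k] = counts.get(k, 0) + 1
    lo, hi = min(counts), max(counts)
    # (left edge, count) pairs in order, keeping empty interior bins
    return [(start + k * width, counts.get(k, 0)) for k in range(lo, hi + 1)]

# Made-up data, for illustration only
data = [3, 3, 4, 4, 5, 5, 9, 9, 10, 10, 11, 11]

# Same data, three choices of (bin width, start value)
for width, start in [(4, 0), (4, 2), (3, 3)]:
    print(f"width={width}, start={start}")
    for left, n in histogram_counts(data, width, start):
        print(f"  [{left:>2}, {left + width:>2})  {'#' * n}")
```

With these particular values, one choice of bins gives a steadily rising staircase and another gives two equal clumps with an empty bin between them, although the data never change.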

This is a real problem. We assume that features of graphical displays are
data features, and the differences between displays are differences in the
data displayed. This simply isn't so with the histogram.

Kernel smoothing is a better solution, but again you need to be able to
'tune' the smoother to view coarse and fine detail.
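
The same tuning issue can be sketched in plain Python (again not Stata). This is an illustrative Gaussian kernel estimate written from scratch; the data, the grid, the bandwidth values and the helper names gaussian_kde and count_modes are all assumptions made for the example. It counts the local maxima of the smoothed curve at a fine, a medium and a coarse bandwidth.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function giving a Gaussian kernel density estimate at x."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

def count_modes(values, bandwidth, grid):
    """Count strict local maxima of the estimate over the grid points."""
    f = gaussian_kde(values, bandwidth)
    ys = [f(x) for x in grid]
    return sum(
        1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1]
    )

# Made-up data, for illustration only
data = [3, 3, 4, 4, 5, 5, 9, 9, 10, 10, 11, 11]
grid = [i * 0.1 for i in range(141)]  # evaluation points from 0 to 14

# Fine, medium and coarse smoothing of the same data
for h in (0.2, 1.0, 5.0):
    print(f"bandwidth={h}: {count_modes(data, h, grid)} mode(s)")
```

The number of bumps depends on the bandwidth as well as on the data, which is why being able to vary it while looking at the result matters.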

Boxplots at least show you definable features of your data. Various
extensions of these which also show data density have been proposed: violin
plots (Stata version by Thomas J. Steichen) are promising, but have been
criticised for the potentially misleading results of kernel smoothing. An
alternative, box percentile plots, has been proposed by Esty and Banfield
(download it from here:
http://www.jstatsoft.org/v08/i17/BoxPercentilePlot.pdf)
which looks good to me, but I am not aware of anyone taking it up.

Finally, Nick Cox has done extensive work on univariate displays, notably on
-catplot- and -distplot-. The whole topic of how you display univariate data
depends heavily on *why* you want to display it.

But for sure, no matter what you want to do, the histogram is inferior to at
least one of the alternatives. It's kinda like that tank division: it
appears to offer a solution to all the problems of the world. Don't be
fooled.

Ronan M Conroy (rconroy@rcsi.ie)
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
+353 1 402 2431 (fax 2764)
