Re: st: Histogram with options
on 13/8/02 12:56 PM, k.gilland@Queens-Belfast.AC.UK at
> I can never seem to get the histogram syntax quite right – have experimented
> for ages with the syntax.
There is a deeper problem than that: histograms are sensitive to both the
bin width and the start of the first bin, and minor changes to either can
result in very different visual displays.
In fact, histograms make complex shapes which are unsuited to the display or
comparison of data, as you cannot tell which features are genuinely driven
by the data and which by the choice of bins and starting values.
Try this: take a variable which has values containing one decimal place. Do
a histogram using bins 2.5 units wide. You will see a nice alternation of
higher and lower bars, which can be reversed by moving the start of the
first bin by one unit. And I'm citing an example actually published in the
When a front-line medical journal publishes a histogram which shows a
pattern that is entirely artefactual, you can take it as a demonstration
that no-one reads histograms and no-one can interpret them.
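The alternating-bar artefact described above can be reproduced without any plotting. What follows is a minimal sketch in Python rather than Stata, under stated assumptions: "unit" is taken to mean the recording resolution of the variable (the smallest step between attainable values), the data are made completely flat (10 observations at each attainable value), and `bin_counts` is a hypothetical helper written for this illustration. Everything is counted in half-units so that the 2.5-unit bin width becomes the integer 5.

```python
def bin_counts(values, start, width, n_bins):
    """Count values per bin; bin k covers [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = (v - start) // width
        if 0 <= k < n_bins:          # ignore values outside the plotted range
            counts[k] += 1
    return counts

# Work in half-units so that all arithmetic is exact:
# attainable values sit on every whole unit (every 2 half-units),
# and a bin 2.5 units wide is 5 half-units wide.
WIDTH = 5
N_BINS = 8

# Assumed flat data: 10 observations at each of units 0, 1, ..., 20.
values = [2 * u for u in range(21) for _ in range(10)]

counts_a = bin_counts(values, start=0, width=WIDTH, n_bins=N_BINS)  # first bin starts at unit 0
counts_b = bin_counts(values, start=2, width=WIDTH, n_bins=N_BINS)  # first bin starts at unit 1

print("start at unit 0:", counts_a)
print("start at unit 1:", counts_b)

# Which bar does an observation recorded as unit 3 (half-unit 6) land in?
bar_a = (6 - 0) // WIDTH
bar_b = (6 - 2) // WIDTH
print("unit 3 sits in a bar of height", counts_a[bar_a], "vs", counts_b[bar_b])
```

Although the data are flat by construction, the bars alternate between 30 and 20 because the bins alternately capture three and two attainable values. Moving the start of the first bin by one unit puts the same observation (unit 3) in a low bar under one start and a high bar under the other, so the apparent pattern comes from the binning and not from the data.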
So what to use instead?
For examining and comparing, start with boxplots.
For looking at the shape of the data, check out cumulative distribution
graphs (see -ordplot- for instance) and violin plots (see -violin-).
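The reason a cumulative distribution graph avoids the problem is that it involves no bins at all: each point on it is simply the proportion of observations at or below a given value. A minimal sketch of that calculation, in Python rather than Stata, is below. The function name `ecdf` and the five data values are made up for illustration, and this is not a description of how -ordplot- itself is implemented.

```python
from bisect import bisect_right

def ecdf(data):
    """Return F, where F(x) is the proportion of observations <= x."""
    xs = sorted(data)
    n = len(xs)

    def F(x):
        # bisect_right gives the number of sorted values that are <= x
        return bisect_right(xs, x) / n

    return F

# Illustrative values recorded to one decimal place (made up for this sketch).
sample = [2.1, 3.4, 3.4, 5.0, 7.2]
F = ecdf(sample)

for x in (1.0, 3.4, 5.0, 7.2):
    print(x, F(x))
```

There is no bin width and no starting value to choose, so every step in the resulting curve corresponds to actual observations.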
As an alternative to histograms, have a look at kernel smoothers. They tend
to tune out the artefacts that bedevil histograms.
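To illustrate how a kernel smoother sidesteps the binning artefact, here is a minimal sketch of a Gaussian kernel density estimate in Python rather than Stata. The flat test data (10 observations at each of the values 0 to 20), the bandwidth of 1.5, and the function name `gaussian_kde` are all assumptions chosen for this illustration; in practice the bandwidth needs choosing with care, since it plays a role similar to the bin width.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a density function: an average of Gaussian bumps centred on each observation."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Assumed flat data: 10 observations at each of the values 0, 1, ..., 20.
data = [float(u) for u in range(21) for _ in range(10)]
density = gaussian_kde(data, bandwidth=1.5)  # bandwidth is an illustrative choice

# Evaluate at points well inside the range of the data.
for x in (5.0, 6.25, 7.5, 8.75, 10.0):
    print(x, round(density(x), 4))

# Crude check that the estimate integrates to about 1 (Riemann sum, step 0.1).
step = 0.1
total = sum(density(-10.0 + i * step) for i in range(400)) * step
print("area under curve:", round(total, 3))
```

For these flat data the estimate is essentially constant across the interior of the range, with none of the high-low alternation that 2.5-unit bins impose on the same values. There is also no starting value to move, which removes one of the two sources of artefact altogether.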
Sorry - I know you asked how to do something, and I've given one of those
awful 'try doing something else' responses, but I really believe that the
last couple of decades have produced a number of really useful alternatives
to the histogram.
Ronan M Conroy (email@example.com)
Lecturer in Biostatistics
Royal College of Surgeons
Dublin 2, Ireland
+353 1 402 2431 (fax 2329)
And now, Mr President, how about the global alliance against climate change?