Stata The Stata listserver

re: st: easy histogram


From   David Airey <david.airey@vanderbilt.edu>
To   statalist@hsphsun2.harvard.edu
Subject   re: st: easy histogram
Date   Sun, 2 Mar 2003 08:41:47 -0600

That's interesting. I've never seen such use. I would think the proper way to deal with this is to log the variable and histogram it, without the xscale(log) option! In that case you get two meaningful axes to inspect. I was thinking that twoway options were being allowed too liberally by design, and should not be, because there is a difference between a plot like a scatter, where x and y can be whatever and are independent of each other, and plots like histograms, where y is a function of x or calculated from x. Those are not really plots of two independent variables; in all of those cases, logging the xscale will break the meaning of the graph. I was thinking that in those cases, and in others if there are any, the xscale option is not useful. But these thoughts were vague (and somewhat like list) and not really that important compared to speeding up graphics and improving estimation methods. Instead I'll look forward to the time when I might need xscale(log) and not have to ask that it be added!

-Dave



> Perhaps not all daughters of the mother twoway should
> inherit certain twoway options?

This in turn touches on various tricky design issues, one
being how far statistical software designers (a) should
and (b) can decide ex cathedra which kinds of graph
are inadmissible or inappropriate, especially when what
may seem crazy in one field may turn out to have
a specific rationale in another. Excellence comes
easily to Stata Corp, but omniscience is an asymptotic
property.

I don't know a strong case for binning on the original scale,
yet showing the results with -xscale(log)-. However, blowing
up the left-hand part of the scale like this
might have some private use for examining fine structure.
For example, I have worked with glacier area data which
tend to be very heavily skewed and problematic at the lower end.
Among other issues, it can be difficult to distinguish, especially
without a field visit, between a true glacier and an inert body,
and different scientists compiling area data (usually in some
national agency office) tacitly show different degrees
of scepticism in distinguishing glaciers and non-glaciers.
For such a problem, graphs of the kind discussed might have some
private value, as there is often merit in a scale which uses the
units familiar to researchers. I wouldn't publish such
a histogram myself, but it might be of some use.
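The difference between the two approaches discussed in this thread can be illustrated with a small sketch. It is written in Python rather than Stata, and everything in it is an assumption made for illustration: the data are made up (not real glacier areas), and the helper names `bin_counts` and `equal_edges` are hypothetical. It bins the same skewed values (a) on the original scale and (b) after taking logs, and also computes how wide each original-scale bin would appear on a log10 axis.

```python
import math

def bin_counts(values, edges):
    """Count values falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def equal_edges(lo, hi, k):
    """Return k+1 equally spaced bin edges from lo to hi."""
    return [lo + (hi - lo) * i / k for i in range(k + 1)]

# Made-up, heavily right-skewed data (an assumption, not real areas):
# 20 values spread evenly on a log10 scale, from about 0.013 to about 750.
areas = [10 ** (-1.875 + 0.25 * i) for i in range(20)]

# (a) Bin on the original scale: 5 bins of width 150.
raw_edges = equal_edges(0.01, 750.01, 5)
raw_counts = bin_counts(areas, raw_edges)

# Apparent width of each original-scale bin on a log10 x axis.
log_widths = [math.log10(raw_edges[i + 1] / raw_edges[i]) for i in range(5)]

# (b) Log the variable first, then bin: 5 bins of width 1 in log10 units.
log_edges = equal_edges(-2.0, 3.0, 5)
log_counts = bin_counts([math.log10(a) for a in areas], log_edges)

print(raw_counts)  # [17, 1, 1, 0, 1]
print(log_counts)  # [4, 4, 4, 4, 4]
print(log_widths)  # the first bin is far wider on a log axis than the rest
```

With these made-up numbers, the first original-scale bin holds 17 of the 20 values and takes up most of a log axis, which is the "blowing up the left-hand part" effect; any fine structure would only show if the original-scale bins were much narrower. Binning the logged values instead spreads the same data evenly over bins of equal visual width.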

More generally, the principle that the area under the histogram
should integrate to 1 -- or to the number of values --
is clearly a good one. However, it is not the only criterion.
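The area principle can be checked numerically. The following is a minimal Python sketch, with hypothetical function names and made-up counts, of the two scalings mentioned: bar heights chosen so that the total area is 1 (density), or so that it equals the number of values (frequency per unit of x). Unequal bin widths are used deliberately, since that is where height and area stop being interchangeable.

```python
import math

def density_heights(counts, edges):
    """Bar heights such that the total bar area is 1."""
    n = sum(counts)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]

def frequency_heights(counts, edges):
    """Bar heights such that the total bar area equals the number of values."""
    return [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]

def total_area(heights, edges):
    """Sum of height times width over all bars."""
    return sum(h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights))

# Made-up example with unequal bin widths (1, 1, 2, 4) and 10 values.
edges = [0, 1, 2, 4, 8]
counts = [4, 3, 2, 1]

dens = density_heights(counts, edges)
freq = frequency_heights(counts, edges)

print(math.isclose(total_area(dens, edges), 1.0))          # True
print(math.isclose(total_area(freq, edges), sum(counts)))  # True
```

Note that the widths here are measured on the original scale. If the same bars are drawn on a logged x axis, their apparent widths change while the heights do not, so the drawn area no longer matches either total.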

Plotting log frequency vs log magnitude is
common in sedimentology. R.A. Bagnold did this
in his classic book on wind-blown sand in 1941
and appropriate hyperbolic distributions have since been
investigated by O. Barndorff-Nielsen and others. Those
ideas appear to be drifting into other areas such as
financial modelling.
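As a rough sketch of what plotting log frequency against log magnitude involves, the Python fragment below turns binned counts into (log magnitude, log frequency) pairs. The function name, the bin edges, and the counts are all made up for illustration, and the exact quantities plotted in the sedimentological literature may be defined differently (for example, as a density per unit of log size); this is not a reconstruction of any particular author's procedure.

```python
import math

def log_log_points(counts, edges):
    """Return (log10 bin midpoint, log10 count) pairs, skipping empty bins.

    Assumes all bin edges are positive. Empty bins are dropped because
    the log of zero is undefined.
    """
    points = []
    for i, c in enumerate(counts):
        if c > 0:
            mid = math.sqrt(edges[i] * edges[i + 1])  # geometric midpoint
            points.append((math.log10(mid), math.log10(c)))
    return points

# Made-up bins whose edges double each time, so they are equal in log width.
edges = [1, 2, 4, 8, 16]
counts = [40, 20, 0, 5]

points = log_log_points(counts, edges)
for x, y in points:
    print(round(x, 3), round(y, 3))
```

One design choice worth flagging: the empty bin simply disappears from the plot, which is a limitation of any log frequency scale and one reason such displays need care in the tails.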

Nick
n.j.cox@durham.ac.uk