st: RE: Multiple (overlaid) Histogram
J. Michael Oakes
> Using Stata 8, I want to produce a single histogram for
> MULTIPLE x-point
> Likert-scale variables Y1 Y2... Yn (n << 5 for clarity).
> That is, I'd like
> to compare discrete distributions side by side with
> something like this
> (hopefully not mangled example) for two variables Y1 and Y2
> over a 3+ point
> Likert scale...
> [ASCII sketch: percent on the vertical axis; at each scale point
> 1, 2, 3 on the horizontal axis, a bar for Y1 and a bar for Y2
> drawn side by side]
> My data is a conventional structure where respondents are
> indexed by rows
> and outcome variables are in columns, such as:
> _n Y1 Y2
> ----- ---- -----
> 1 1 2
> 2 2 1
> 3 1 3
> . . .
> Given such data, my sense is that the desired graph is
> technically two
> histograms overlaid (like in <twoway>) on each other. But
> since <histogram>
> is not a <twoway> plot such a histogram is not possible: currently
> <histogram> can only plot one variable at a time (I think).
> While I can imagine transforming my data through some
> complicated collapse
> and append commands, and then using <twoway bar>, this is
> simply too much
> work. Relatedly, I could use Excel and other such programs
> to produce the
> plot easily with summarized data. But I really want to
> avoid dumping data to
> another program, especially Excel.
> Am I again missing something, or does <histogram> need some
> improvement?
-histogram- needs no improvement. It is perfect. (No, I didn't write
it.)
More seriously, this touches upon some issues flagged on Statalist
earlier this year.
Part of the issue may be terminological, as in a concurrent thread.
1. I take a histogram, strict sense, to refer to a display of
frequencies (fractions, densities) of a continuous variable divided
into classes. The hallmark of a histogram as produced by proper
statistical software is that adjacent bars touch. (If this isn't
true, you haven't got a histogram, or you haven't got proper
statistical software.)
Whatever my terminology, my guess is that
(a) it is pretty standard statistically
(b) (more important here) this is the problem for which -histogram- in
Stata 8 is optimised. Away from this problem, you have to coax it to
do what you
want. If you want bars not to touch, you have to insist on that. If
you want bars to be given value labels, ditto. However, -histogram-
won't do what you want, or so I believe.
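For a single variable, the coaxing mentioned in (b) might look like
the sketch below. It assumes -y1- is one of the Likert variables,
coded 1 to 3 with value labels attached; the gap size of 40 is an
arbitrary choice. This still shows only one variable at a time.

```stata
* one bar per distinct value, heights as percents
* gap(40) shrinks each bar so that adjacent bars no longer touch
* the valuelabel suboption shows value labels instead of 1 2 3
histogram y1, discrete percent gap(40) xlabel(1 2 3, valuelabel)
```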
2. What you want is, depending on how strict one is about terminology,
either two superimposed histograms of categorical variables or
a bar chart showing the percents of two categorical variables.
The latter is available in principle as an application of -graph bar-
(-twoway bar- is not, I guess, the way to go) but you have to do
some preparation yourself.
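As a rough sketch of that preparation, assuming two variables -y1- and
-y2- and percents calculated within each variable (other bases are
possible): count each response for each variable, convert the counts
to percents, and put the percents side by side. The names -id-,
-which-, -n-, -total- and -pct- are made up for illustration.

```stata
preserve
gen id = _n
reshape long y, i(id) j(which)     // which = 1 for y1, 2 for y2
drop if missing(y)
contract which y, freq(n)          // one row per variable-by-response cell
bysort which: gen total = sum(n)   // running sum of counts within variable
by which: replace total = total[_N]
gen pct = 100 * n / total
drop n total
reshape wide pct, i(y) j(which)    // pct1 and pct2, one row per response
graph bar (asis) pct1 pct2, over(y) ytitle("percent") ///
    legend(label(1 "Y1") label(2 "Y2"))
restore
```

The -preserve- and -restore- pair puts the original data back once the
graph has been drawn.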
An alternative is to use -catplot- from SSC which is not purrfect but
seems close to your problem.
I did this given two variables -y1- and -y2-:

    gen id = _n
    reshape long y, i(id)
    catplot bar _j y, percent(_j) asyvars

1. You may need to do some renaming or labelling in your problem.
2. -asyvars- is there to get touching bars. If you don't want
them, don't specify it.
3. I am not clear how you want your percents calculated but
-catplot- offers a handle to specify it.
4. -catplot hbar- and -catplot dot- are also available.