
st: RE: Multiple (overlaid) Histogram


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Multiple (overlaid) Histogram
Date   Wed, 28 May 2003 16:37:45 +0100

J. Michael Oakes

> Using Stata 8, I want to produce a single histogram for MULTIPLE
> x-point Likert-scale variables Y1 Y2... Yn (n << 5 for clarity).
> That is, I'd like to compare discrete distributions side by side
> with something like this (hopefully not mangled example) for two
> variables Y1 and Y2 over a 3+ point Likert scale...
>
>
>        |
>        |             ______
>        |             |    |           ______
>     p  |             |    |           |    |
>     e  |      ______ |    |           |    |
>     r  |      |    | |    |           |    |
>     c  |      |    | |    |           |    |
>     e  | _____| Y2 | |    |      _____| Y2 |
>     n  | |    |    | |    |_____ |    |    |
>     t  | | Y1 |    | | Y1 |    | | Y1 |    |
>        | |    |    | |    | Y2 | |    |    |
>        | |    |    | |    |    | |    |    |
>        -----+-------------+-----------+---------->>
>             1             2           3
>
>
> My data is a conventional structure where respondents are indexed by
> rows and outcome variables are in columns, such as:
>
>
>      _n     Y1     Y2
>     ----   ----   ----
>       1      1      2
>       2      2      1
>       3      1      3
>       .      .      .
>
>
> Given such data, my sense is that the desired graph is technically two
> histograms overlaid (like in <twoway>) on each other. But since
> <histogram> is not a <twoway> plot such a histogram is not possible:
> currently <histogram> can only plot one variable at a time (I think).
>
> While I can imagine transforming my data through some complicated
> collapse and append commands, and then using <twoway bar>, this is
> simply too much work. Relatedly, I could use Excel and other such
> programs to produce the plot easily with summarized data. But I really
> want to avoid dumping data to another program, especially Excel.
>
> Am I again missing something, or does <histogram> need some
> improvement?

-histogram- needs no improvement. It is perfect. (No, I didn't write
it.)

More seriously, this touches upon some issues flagged on Statalist
earlier this year.

Part of the issue may be terminological, as in a concurrent thread.

1. I take a histogram, strict sense, to refer to a display of
frequencies (fractions, densities) of a continuous variable divided
into classes (bins). The hallmark of a histogram as produced by proper
statistical software is that adjacent bars touch. (If this isn't true,
you haven't got a proper histogram, or you haven't got proper
statistical software.) Whatever my terminology, my guess is that
(a) it is pretty standard statistically
and
(b) (more important here) this is the problem for which -histogram- in
Stata 8 is optimised. Away from this problem, you have to coax it to do
what you want. If you want bars not to touch, you have to insist on
that. If you want bars to be given value labels, ditto. However,
-histogram- won't do what you want, or so I believe.

2. What you want is, depending on how strict one is about terminology,
either two superimposed histograms of categorical variables or
a bar chart showing the percents of two categorical variables.
The latter is available in principle as an application of -graph bar-
(-twoway bar- is not, I guess, the way to go) but you have to do
some preparation yourself.
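
A minimal sketch of both points, under stated assumptions: two variables
-y1- and -y2- coded 1, 2, 3 (the 3-point coding is assumed for
illustration), and the names -id-, -which-, -total- and -pct- are made up
here. The first command shows the kind of coaxing -histogram- needs for
a single discrete variable; the rest is one possible version of the
preparation for -graph bar-, not necessarily the shortest.

```stata
* Point 1: coaxing -histogram- for ONE discrete variable.
* discrete: one bar per distinct value; gap(): bars no longer touch;
* valuelabel: show value labels (if any are attached) on the x axis.
histogram y1, discrete percent gap(30) xlabel(1(1)3, valuelabel)

* Point 2: preparation for -graph bar- with both variables.
preserve
gen id = _n
* stack y1 and y2 into y; -which- is 1 for y1 and 2 for y2
reshape long y, i(id) j(which)
* one row per observed combination of -which- and -y-, count in _freq
contract which y if !missing(y)
* total count within each original variable
bysort which (y): gen total = sum(_freq)
by which: replace total = total[_N]
gen pct = 100 * _freq / total
* one observation per bar, so the default (mean) just returns pct;
* asyvars makes the categories of -which- adjacent, separately coloured bars
graph bar pct, over(which) over(y) asyvars ytitle("percent")
restore
```

Combinations that never occur get no row from -contract- and so no bar;
its -zero- option would include them with a frequency of zero.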

An alternative is to use -catplot- from SSC which is not purrfect but
seems close to your problem.

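I did this given two variables -y1- and -y2-:

preserve
* give each respondent an identifier so that -reshape- can track rows
gen id = _n
* stack y1 and y2 into one variable y; _j records which one (1 or 2)
reshape long y, i(id)
* for each value of y, adjacent bars for _j = 1 and 2;
* percents are calculated within each _j
catplot bar _j y, percent(_j) asyvars
restore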

Notes:

1. You may need to do some renaming or labelling in your problem.

2. -asyvars- is there to get touching bars. If you don't want
them, don't specify it.

3. I am not clear how you want your percents calculated but
-catplot- offers a handle to specify it.

4. -catplot hbar- and -catplot dot- are also available.
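
Notes 3 and 4 could be tried out along the following lines. This is a
sketch only: it reuses the reshaped data and the command form shown
above, and later releases of -catplot- on SSC may use a different
syntax, so check -help catplot- for the installed version.

```stata
preserve
gen id = _n
reshape long y, i(id)

* percents calculated within each original variable, as above
catplot bar _j y, percent(_j) asyvars

* percents of the overall total instead
catplot bar _j y, percent asyvars

* the same comparison as horizontal bars or as a dot chart
catplot hbar _j y, percent(_j) asyvars
catplot dot _j y, percent(_j) asyvars
restore
```

Each command draws a new graph that replaces the previous one, so run
them one at a time to compare.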


Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


