Stata The Stata listserver

st: RE: RE: Making bar charts readable in a grayscale photocopy.

From   Lee Sieswerda <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   st: RE: RE: Making bar charts readable in a grayscale photocopy.
Date   Mon, 27 Oct 2003 15:27:13 -0500

Nick already answered your question about getting a CI on a barchart. He's
quick! Of course, he's probably also recipient number one on the listserver,
so I think he gets postings about 40 minutes before I do.

Anyway, you have the worst kind of data to graph: sparse and categorical and
you want to show all of it. But there are things you can try.

1. It sounds like you have categories that sum to 100%. This is an advantage
because, if so, you can omit the largest category and instead concentrate on
the smaller numbers. This is basically what Nick suggested earlier when he
wrote about putting the data on different scales.

2. Since no one responded to category 4, you can easily combine it with one
of the other categories and not lose any information. (Either that or you've
just made a coding error and there is no category 4).

3. Given #1 above, you can use a stacked bar chart. This is often a
deprecated graph type, for good reason, but I think it could be useful here
(desperate data calls for desperate chart types). Another often deprecated
feature that you should exploit because of your missing data is line
connectors between the stacked categories (often considered chart junk, but
necessary here).  Stack the categories so that they add up to (100% -
category0). Use a gray gradient so that the highest level (black) is at the
right-hand side. The line connectors should make it clear where the missing
data is. And finally, the order of the countries will be critical to making
the graph readable. I don't know what pattern you are trying to show, so I
can't say exactly what I'd do, but here is a first stab: try putting country
number 7 first and put text labels on top of each of the bar segments to
establish the categories. Arrange the rest of the countries below in such an
order that the line connectors make clear any pattern you are trying to show.

4. I would find this challenging to do in Stata, although I have no doubt it
can be done. To save myself many hours of fiddling, I'd probably use Excel
(sshhhh... don't tell anyone).

5. If there is no clear pattern that you are trying to show, then forget
everything I've said above, and instead put the data in a table in an
appendix at the back! A graph is for showing a pattern, not for merely
displaying raw data. That's what tables are for.

This will make an ugly graph, and it probably wouldn't be suitable for
publication in a journal. If you can't get anything that looks clear, put it
in a table instead.
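
As a rough, package-neutral illustration of points 1 to 3, the data
preparation for such a stacked chart might be sketched in Python as below.
Everything specific here is an assumption made for the example: the country
names and percentages are invented (they are not Dan's data), the categories
are taken to be coded 0 to 5, gray is expressed on a 0 (black) to 1 (white)
scale, and sorting by total bar length is only one possible ordering.

```python
# Hypothetical percentages per country for categories 0-5 (each row sums to 100).
# Invented for illustration only; these are not the original poster's data.
shares = {
    "Country 2": [80, 9, 6, 3, 0, 2],
    "Country 7": [70, 10, 8, 7, 0, 5],
    "Country 11": [90, 6, 4, 0, 0, 0],
}

DROP = 0   # dominant category, left off the chart (point 1)
EMPTY = 4  # category nobody chose, so there is nothing to plot (point 2)


def gray_levels(n):
    """Evenly spaced grays from light (0.85) down to black (0.0), darkest last."""
    if n == 1:
        return [0.0]
    return [round(0.85 * (1 - i / (n - 1)), 3) for i in range(n)]


def stack_segments(row, keep):
    """Return (category, left, width) for each kept category, stacked left to right."""
    segments, left = [], 0
    for cat in keep:
        segments.append((cat, left, row[cat]))
        left += row[cat]
    return segments


# Categories that remain on the chart, each mapped to a shade of gray.
keep = [c for c in range(6) if c not in (DROP, EMPTY)]
shades = dict(zip(keep, gray_levels(len(keep))))

# Assumed ordering: longest stacked bar first, so the labelled bar sits on top.
order = sorted(shares, key=lambda k: 100 - shares[k][DROP], reverse=True)

# Segment positions per country; each bar's total length is 100 - category 0.
layout = {country: stack_segments(shares[country], keep) for country in order}

for country in order:
    print(country, layout[country])
```

The `left` and `width` values give the endpoints of each segment, which are
also the points the line connectors would join between neighbouring bars.
The resulting shares can then be handed to whatever package draws the chart.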


Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
[email protected]

> -----Original Message-----
> From: Daniel R Sabath [mailto:[email protected]] 
> Sent: Monday, October 27, 2003 1:50 PM
> To: [email protected]
> Subject: st: RE: Making bar charts readable in a grayscale photocopy.
> Lee,
> Thank you for your answer and several good points.
> I suppose I should have obfuscated the data some other way 
> than just stripping off the data labels. 
> What that bargraph represents is a summary of data about 
> similar but mutually exclusive information over 14 different 
> and wide spread geographic regions. You could think of it as 
> level of education completed (middle school, high school, 
> undergrad, masters, PhD) by country, expressed as a 
> percentage of the country's population, and not be too far off. 
> You mentioned CI and ORs. I'd like to know how to get 
> confidence intervals onto a barchart.
> Many thanks,
> Dan
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of 
> Lee Sieswerda
> Sent: Monday, October 27, 2003 7:29 AM
> To: '[email protected]'
> Subject: st: RE: RE: RE: Making bar charts readable in a 
> grayscale photocopy.
> Daniel,
> I see your problem: the graph you posted is pretty awful. 
> The problem you are trying to solve is a common one in the 
> presentation of survey results. The graph you show looks 
> similar to a series of fourteen related questions with 
> responses coded on a scale of 1 to 6. Having been faced with 
> this situation many times, I think the issue is not one of 
> technology, but rather of making choices about what data to 
> present and what to suppress. Thus, it would be easier to 
> help if it were clear what the categories actually are, 
> rather than just providing numeric codes.  
> Some general points to consider are:
> 1. Can you combine any of the 6 categories? Your first 
> category is clearly dominant. The general idea here is to 
> reduce the amount of data "clutter" and focus on showing the 
> dominant pattern in your data. All those little bars may just 
> be needless detail (depending on what they represent). If you 
> need to provide all of that detail, then perhaps you should 
> provide a table rather than a graph. Remember that graphs are 
> meant to summarize data, and need not provide all of the gory details.
> 2. Do all of the fourteen "over" groups need to be represented 
> on a single graph? Do they fall into natural groups that 
> could be split out over a series of figures?
> 3. With regard to my earlier point about using a different 
> type of graph: is it possible to show these results as a 
> series of effects? For example, could you reduce the amount 
> of data by graphing, say, odds ratios and confidence 
> intervals rather than the individual data points? If you can 
> do this, then you will be showing effects directly rather 
> than leaving your reader to infer the effect from the raw data points.
> 4. There is a type of graph that represents the contents of 
> an r x c table as a rectangular array of variously sized 
> boxes (the bigger the cell number, the larger the box). The 
> idea here is to directly translate tabular data into a 
> graphical form. As far as I know, this is not implemented in 
> Stata, but Nick Cox may know better. It is implemented in R 
> and I searched for about 20 minutes to find it, but can't 
> seem to put my finger on it. Perhaps someone else knows the 
> name of the command. EpiInfo implements this graph type for 2 
> x 2 tables, but strictly, I think, as an aid to the 
> numerically-challenged trying to solve an outbreak. It 
> probably has some better use in larger r x c tables, but I'm 
> hesitant to even mention it because it is probably not really 
> a solution. The fact that such graphs are rarely published or 
> even implemented is perhaps an indication that they lack 
> resonance in the scientific community.
> That's the best I can do for you without knowing what your 
> data actually represent.
> Regards,
> Lee
> Lee Sieswerda, Epidemiologist
> Thunder Bay District Health Unit
> [email protected]
> <snip>
