

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Advanced features for bar chart and histogram in Stata
Date: Sun, 21 Aug 2011 13:53:31 +0100

FIRST My guess is that you would get closer to what you want by -reshape long- followed by use of -over()- options.

SECOND The key here is that you can have different colours so long as you have different variables. I have in preparation a Stata Tip on this topic which I append below.

Highlighting specific bars

A frequent need when drawing a bar or dot chart is to highlight a subset of observations while keeping the overall sort order. The stipulation of keeping the overall sort order is what provides the challenge here, as otherwise we could just add subdivision by another variable to the command, as when distinguishing foreign cars among those with the best repair record:

. sysuse auto, clear
. graph hbar (asis) mpg if rep78 == 5, over(make, sort(1) descending)
. graph hbar (asis) mpg if rep78 == 5, over(make, sort(1) descending) over(foreign) nofill

Figure 1 shows graphs for these two commands. In Figure 1(a), the ordering is within all the observations specified. In Figure 1(b), the extra option -over(foreign)- subdivides observations according to the further variable -foreign-. Note also the crucial detail of -nofill-. This can be a useful kind of graph, but it is not what is wanted here.

Let us suppose we have data on basin (catchment or watershed) areas for various large rivers in the world, and we want to show where the Mississippi comes in the rank order for the very largest. Some example data from Allen (1997) are included with the media for this issue.

. use rivers

Figure 2 as a first graph shows that the Mississippi ranks third on area of basin in this dataset, after the Amazon and Nile.

. graph hbar (asis) area if area >= 1000, over(name, sort(1) descending)

Highlighting a particular bar means giving it a different color.
Some acquaintance with the bar chart commands shows that they are willing to combine bars for different variables, which will be assigned different colors, so the need is simply to put data for two subsets, the Mississippi and the others, into two different variables. -separate- is a command designed for precisely this purpose. For other graphical applications of -separate-, see Cox (2005). It is naturally also possible to use -generate- directly.

. separate area, by(name == "Mississippi")

In this example, the equality supplied to -by()- is either false or true, numerically 0 or 1, and so -separate- creates two new variables, -area0- and -area1-.

. graph hbar (asis) area0 area1 if area >= 1000, nofill ///
  over(name, sort(area) descending) legend(off) ytitle("`: var label area'")

We are plotting bars for values that are non-missing on -area0- and missing on -area1-, or vice versa. But -graph- plots no bars when values are missing. This is easy to fix: -nofill- gets us the intended effect.

In this case, we suppressed the legend, imagining that, depending on the purpose, we could add a title for a presentation, as say title(Mississippi ranks third in catchment area) or underline the message of the graph in informative text supplied in a text or word processor. As two response variables are being shown on the same graph, we have to step in to provide an informative y axis title, in this case by automating use of the variable label for -area-. Nothing stops us just providing a title explicitly, as when no such variable label has been defined.

In principle, using -stack- should have the same effect as using -nofill-. In practice, there can be small complications if there are other missing values in the data, which are fixable with an appropriate -if- exclusion.

The main problem now being solved, we could clearly heighten the contrast, as by adding -bar(1, bfcolor(none))-. Figure 3 shows the graph after that tweak.

Similar needs are met by variations on this theme.
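
For readers without the rivers file, the same steps can be tried on data shipped with Stata. What follows is only a sketch under stated assumptions: the auto dataset stands in for the rivers data, and foreign cars (rather than a single named observation) are taken as the subset to highlight. Neither choice comes from the tip itself.

```stata
* Sketch only: -auto- stands in for the rivers data,
* and -foreign- (0 or 1) defines the subset to highlight
sysuse auto, clear
keep if rep78 == 5

* -separate- creates mpg0 (domestic) and mpg1 (foreign);
* each observation is non-missing on exactly one of them
separate mpg, by(foreign)

* One overall sort order on mpg, two bar colors
graph hbar (asis) mpg0 mpg1, nofill ///
    over(make, sort(mpg) descending) legend(off) ///
    ytitle("`: var label mpg'")
```

Because -foreign- is already 0 or 1, -by(foreign)- plays the same role here as the true-or-false expression given to -by()- in the rivers example.
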
In our example, the subset to be highlighted is a single observation, but nothing depends on that being true. Equally, three or more subsets could be distinguished. For a more elaborate subdivision we might want a legend, although there is a trade-off: the more complicated and elaborate the design, for which a legend becomes necessary, the less the impact of the graph is likely to be.

The examples all are based on showing values -asis-. If graphs of this kind are needed, but for means or other summary statistics, it is often easiest to -collapse- or -contract- the dataset first, and then use -separate- and -graph hbar (asis)-.

The same device can be used with -graph bar-, -graph dot- or various subcommands of -twoway- such as -twoway bar-. In practice, when we want this, the individual observations include names that are informative, so horizontal alignment makes those names more readable. If -graph dot- were to be used, we should consider heightening the contrast, as by adding -marker(2, msize(*3))-.

Allen, P.A. 1997. Earth Surface Processes. Oxford: Blackwell Science.

Cox, N.J. 2005. Stata tip 27: Classifying data points on scatter plots. Stata Journal 5(4): 604-606.

On Sun, Aug 21, 2011 at 12:22 PM, Fredrik Norström <fredrik.norstrom@epiph.umu.se> wrote:
> Dear Statalist users,
>
> I am struggling with generating histograms like I want them to be. To avoid a lot of manual extra work I would rather like to solve it so that Stata generates it for me. I am doubtful if that is possible but hope that at least someone can verify that for me if so.
>
> FIRST PROBLEM:
> In my questionnaire I have asked about symptoms before and after disease diagnosis. I want to generate a graph that includes 3 different symptoms (heartburn, nausea and vomiting) before and after diagnosis and the proportion of users with major problem for them.
> I have figured out that I can use
> "graph bar (mean) Heartburn_before Heartburn_after Nausea_before Nausea_after Vomiting_before Vomiting_after" to generate such a graph but that graph does not look like I want it to be.
>
> I want it to be:
> 1) First two bars side-by-side for heartburn (to left before diagnosis and right after diagnosis) then a gap and those two for nausea side-by-side, a gap and those two for vomiting side-by-side. I have tried to use bargap but then all bars will have a difference between each other which is not what I am interested in.
> 2) Below bars for heartburn I want to have label "Heartburn" and similarly for nausea and vomiting. No idea how to do this right now without manually doing it in paint or another simple graphical program where I risk losing picture quality as well as having to redo everything if changes are necessary.
> 3) In upper right at graph I would like to have the legend with the labels "Before diagnosis" and "After diagnosis" with the colour of each of these bars. For every symptom I will have same colour for before and after diagnosis. This issue I know how I easily could solve by editing graph in Stata but would be nice to learn the code for how to specify these offsets.
>
> I hope that I despite the lack of an illustration have managed to explain what I am interested in.
>
> SECOND PROBLEM:
> The other graph I am preparing for my paper looks at relation between diagnosis for two diseases. The graph should illustrate what disease that causes the other one. Also this problem I have solved it with histogram option in Stata. However, I am interested in making the graph more advanced. I would like it to be light color if one disease occurs before the other and a dark color if other disease occurs first. Is it possible to have a legend for a histogram where colors are different for different values?
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
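
The FIRST suggestion in the reply, -reshape long- followed by -over()-, might look something like the sketch below. The variable names are taken from the quoted -graph bar- command; the four observations, the -id- variable, and the coding (1 = major problem, 0 = not) are hypothetical, made up only so that the commands can be run.

```stata
* Hypothetical toy data: 1 = major problem, 0 = not
clear
input id Heartburn_before Heartburn_after Nausea_before Nausea_after Vomiting_before Vomiting_after
1 1 0 1 0 0 0
2 1 1 0 0 1 0
3 0 0 1 1 0 0
4 1 0 1 0 1 1
end

* Put the symptom name last so that "before" and "after" are the stubs
foreach s in Heartburn Nausea Vomiting {
    rename `s'_before before`s'
    rename `s'_after  after`s'
}

* One observation per person and symptom; symptom is a string variable
reshape long before after, i(id) j(symptom) string

* Two bars per symptom, symptom names below each pair,
* legend inside the plot region at upper right
graph bar (mean) before after, over(symptom) ///
    legend(label(1 "Before diagnosis") label(2 "After diagnosis") ///
           ring(0) position(1) cols(1)) ///
    ytitle("Proportion with major problem")
```

With two variables and one -over()- group, the before and after bars for each symptom should sit side by side with gaps only between symptoms, and the two variables get two different colours, which is the same principle as in the tip.
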
