The Stata listserver

st: RE: Re: Bar labels in stacked bar chart

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Re: Bar labels in stacked bar chart
Date   Mon, 12 Apr 2004 18:17:20 +0100

Yes, you can do this, but not on the fly, so 
far as I am aware. 

The re-ordering you achieve with -graph bar- can be achieved 
(only?) by generating indicator variables and then supplying 
them in the desired order. 

The analogous procedure needed for -catplot- is to provide 
a variable giving the desired sort order, say by 

. recode rep78 3=1 4=2 5=3 1=4 2=5, gen(Rep78) 
(69 differences between rep78 and Rep78)

Then you need something like 

. catplot bar rep78 foreign, percent(foreign) asyvars stack 
	oversubopts(sort(Rep78)) legend(order(3 4 5 1 2)) 

-oversubopts()- is an undocumented option. Also, make 
sure that you have the latest public domain -catplot-, 1.1.2. 

To put this in context, let me refer you to the original 
Statalist posting on 18 February 2003. Here it is, in part: 

Thanks to Kit Baum, a new package -catplot- 
has been posted on SSC. This is for plots 
of categorical data in Stata 8, specifically 
for bar or dot charts of the same showing 
frequencies, or fractions, or percents. 

(For Stata 7 or earlier there are other 
user-written programs available in the same
territory, such as -fbar-, -tabhbar-, -vbar-.) 

Those who have looked at Stata 8's new 
graphics may well ask: Surely all that is very 
well done in Stata 8, with -graph bar-, -graph 
hbar- and -graph dot- offering a great range of 

The answer is "Yes indeed", and that is 
what I am building on, the aim being to add 
a convenience command in one particular 

I work a lot with students and others who want bar 
charts of categorical data, for example, of counts 
of categories from one-way, two-way or even three-way 
tables from questionnaires and other survey data. 
In addition, many of these users want to tell me 
for some reason that it's very easy in Excel, so 
I really want to be able to say to them that it's 
also very easy in Stata. 

How does Stata size up on this task? 

1. -histogram- is optimised for histograms, 
naturally. It can be used for this purpose by 
invoking options like 

, discrete xla(, valuelabel ang(45)) gap(50) 

for a one-way table or 

, discrete xla(, valuelabel ang(45)) gap(50) 
by(myvar, rows(1)) 

for a two-way table. Typing this -- or issuing 
the equivalent through a dialog -- is a 
little more complicated than some Stata beginners
might expect for this task. In any case, 
some problems then frequently arise: 

a. It doesn't take much for value labels to become 
unreadable or to require what I call giraffe graphics, 
in which the graphic necessitates a great deal of neck 
movement. (That's why I have "ang(45)" in the examples 
above.) 

b. The number of cells you can show easily and effectively 
appears to be ~20, given that you will want value 
labels shown to indicate the categories. Any long 
value labels make this problem worse. 

c. Representing a 3-way table seems impossible, except by 
producing and then combining separate histograms. 
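
For concreteness, the -histogram- option strings under point 1 
might be written out as complete commands along the following 
lines. This is only a sketch: the auto dataset, the choice of 
-rep78- and -foreign-, the -frequency- option and the explicit 
label rule 1/5 are all assumptions added for illustration, not 
part of the original examples. It is meant to be run from a 
do-file, since it uses /// line continuation.

```stata
* Sketch only: dataset, variables and the 1/5 label rule are assumed.
sysuse auto, clear

* One-way table: bar heights are frequencies of each rep78 category.
histogram rep78, discrete frequency ///
    xlabel(1/5, valuelabel angle(45)) gap(50)

* Two-way table: the same display, one panel per category of foreign.
histogram rep78, discrete frequency ///
    xlabel(1/5, valuelabel angle(45)) gap(50) ///
    by(foreign, rows(1))
```

-rep78- carries no value labels in the auto data, so valuelabel 
has no visible effect here; with a labelled variable the angled 
labels are where problems a and b start to bite. 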

2. -graph hbar- etc. is good _if_ the 
frequencies come predefined as a variable, because
then you can just sum the frequencies. But 
if you want Stata to do the counting for you, 
this seems to require you to set up something 
to count. In particular, 

. graph hbar (count) rep78 

doesn't give you the frequencies of the 
categories of -rep78-. Roughly, we want -graph- 
here to -contract-, not -collapse-. 

The way to do it is to calculate something in 
advance, as in 

. gen freq = 1 
. graph hbar (count) freq, over(rep78) 

but arguably we shouldn't have to do that. 
And as for percents, catching missings, 
and working with -if- and -in-: it 
really needs a program. 
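
The remark about wanting -contract- rather than -collapse- can 
be made concrete with a short sketch. This is not a description 
of what -catplot- does internally; it only shows, assuming the 
auto dataset, one way to let -contract- do the counting and then 
plot the stored frequencies with (asis). The -nomiss- option and 
the axis title are choices made here for illustration.

```stata
* Sketch only: count categories with -contract-, then plot the counts.
sysuse auto, clear

preserve
* One observation per nonmissing category of rep78; counts go in _freq.
contract rep78, nomiss

* (asis) tells -graph hbar- to plot _freq as it stands, not summarize it.
graph hbar (asis) _freq, over(rep78) ytitle("Frequency")
restore
```

The -preserve- and -restore- pair is there because -contract- 
replaces the data in memory. Handling percents, missing 
categories and -if- or -in- restrictions this way would need 
further steps each time, which is the case for a dedicated 
program. 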


That is, -catplot- started out as the 
simplest program I could devise for bar 
charts of frequencies of categorical data -- 
given also that people do want two or more variables, 
percent calculations, etc., etc. 

Since birth, -catplot- has accumulated 
extra features both because users reasonably 
suggested them and because I found myself wanting
them. This creates a small worry for the author, 
as the original intent was to keep it very simple.
So -oversubopts()- was kept undocumented partly 
as an experiment to see how often it is needed, 
although some smart users looked at the code and
discovered that it was there. 

[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Friedrich
> Huebler
> Sent: 12 April 2004 05:31
> To: [email protected]
> Subject: st: Re: Bar labels in stacked bar chart
> Nick,
> Thank you, -catplot- is easier to use than -graph bar- because
> several steps can be skipped. Another advantage is that -catplot-
> works with numeric values and strings.
> However, with -graph bar yvars, over(varname) stack- it is possible
> to stack the bars in a specific order by varying the order in which
> the yvars are listed.
> . sysuse auto
> . tab rep78, gen(rep)
> . graph bar rep3 rep4 rep5 rep1 rep2, over(foreign) stack percent
> Can the same be accomplished with -catplot-?
> Thank you.
> Friedrich Huebler
> --- Nick Cox <[email protected]> wrote:
> > Another way to do it is (in total) 
> > 
> > . catplot bar rep78 for , percent(for)  asyvars stack 
> > 	blabel(bar, pos(center) format(%3.2f))
> > 
> > where I've added a format control. Here -catplot- can be 
> > installed from SSC.
