
st: -catplot- available for download from SSC

From   "Nick Cox"
Subject   st: -catplot- available for download from SSC
Date   Fri, 21 Feb 2003 12:56:38 -0000

Thanks to Kit Baum, a new package -catplot- 
has been posted on SSC. This is for plots 
of categorical data in Stata 8, specifically 
for bar or dot charts of the same showing 
frequencies, or fractions, or percents. 

(For Stata 7 or earlier there are other 
user-written programs available in the same
territory, such as -fbar-, -tabhbar-, -vbar-.) 

Those who have looked at Stata 8's new 
graphics may well ask: Surely all that is very 
well done in Stata 8, with -graph bar-, -graph 
hbar- and -graph dot- offering a great range of 
possibilities?

The answer is "Yes indeed", and that is 
what I am building on, the aim being to add 
a convenience command in one particular 
area.

I work a lot with students and others who want bar 
charts of categorical data, for example, of counts 
of categories from one-way, two-way or even three-way 
tables from questionnaires and other survey data. 
In addition, many of these users want to tell me 
for some reason that it's very easy in Excel, so 
I really want to be able to say to them that it's 
also very easy in Stata. 

How does Stata size up on this task? 

1. -histogram- is optimised for histograms, 
naturally. It can be used for this purpose by 
invoking options like 

, discrete xla(, valuelabel ang(45)) gap(50) 

for a one-way table or 

, discrete xla(, valuelabel ang(45)) gap(50) 
by(myvar, rows(1)) 

for a two-way table. Typing this -- or issuing 
the equivalent through a dialog -- is a 
little more complicated than some Stata beginners
might expect for this task. In any case, 
some problems then frequently arise: 

a. It doesn't take much for value labels to become 
unreadable or to require what I call giraffe graphics, 
in which the graphic necessitates a great deal of neck 
movement. (That's why I have "ang(45)" in the examples 
above.)

b. The number of cells you can show easily and effectively 
appears to be ~20, given that you will want value 
labels shown to indicate the categories. Any long 
value labels make this problem worse. 

c. Representing a 3-way table seems impossible, except by 
producing and then combining separate histograms. 
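
As a concrete sketch of the two -histogram- calls described under 1, 
assuming the auto data shipped with Stata (the dataset and the 
variables -rep78- and -foreign- are illustrative stand-ins, and 
-frequency- is added so that bars show counts rather than densities): 

. sysuse auto, clear 
. histogram rep78, discrete frequency xla(1/5, ang(45)) gap(50) 
. histogram rep78, discrete frequency xla(1/5, ang(45)) gap(50) by(foreign, rows(1)) 

-rep78- carries no value labels in that dataset, so -valuelabel- is 
left out here; put it back for a variable that has them. 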

2. -graph hbar- etc. is good _if_ the 
frequencies come predefined as a variable, because
then you can just sum the frequencies. But 
if you want Stata to do the counting for you, 
this seems to require you to set up something 
to count. In particular, 

. graph hbar (count) rep78 

doesn't give you the frequencies of the 
categories of -rep78-: it draws a single bar 
for the count of nonmissing values. Roughly, 
we want -graph- here to -contract-, not -collapse-. 

The way to do it is to calculate something in 
advance, as in 

. gen freq = 1 
. graph hbar (count) freq, over(rep78) 

but arguably we shouldn't have to do that. 
And as for percents, catching missings, 
and working with -if- and -in-: it 
really needs a program. 
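
The -contract- route can be sketched as follows, again assuming 
the auto data purely as an illustration. -contract- replaces the 
data in memory with one observation per category plus a frequency 
variable (named freq here, an arbitrary choice), and -graph hbar- 
then sums that variable: 

. sysuse auto, clear 
. contract rep78, freq(freq) 
. graph hbar (sum) freq, over(rep78) 

Because -contract- overwrites the data in memory, you would 
normally wrap this in -preserve- and -restore-, which is one 
more step that a convenience command can hide. 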

So that's the rationale for -catplot-. What it 
actually does can be seen by reading the help

. ssc type catplot.hlp 

and then if interested you can install 

. ssc inst catplot 

P.S. choosing good names is not always 
easy. Perhaps this one is down partly 
to the fact that I like cats. 