The Stata listserver

RE: st: RE: Graph Bar: help!

From   "Nick Cox" <>
To   <>
Subject   RE: st: RE: Graph Bar: help!
Date   Wed, 28 Sep 2005 10:21:08 +0100

First, to repeat a point made gently by both 
Scott Merryman and myself, and explained 
prominently in the Statalist FAQ, please 
do _not_ use anything other than plain text 
to send posts to the list. No HTML, etc. 

My original reply to your question was 
longer and more detailed, but Scott's was 
more direct and nearer what you want. 

(1) Cf. Scott's solution, tweaked to show percents


. graph bar acexist, over(year) yla(0 .2 "20" .4 "40" .6 "60" .8 "80") 
	ytitle(percent with audit committee) 

This works because it is short-hand for 

. graph bar (mean) acexist, over(year) yla(0 .2 "20" .4 "40" .6 "60" .8 "80") 
	ytitle(percent with audit committee) 
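To see why the mean does the job: for a variable coded 0 and 1, the mean is the proportion of 1s, and missing values are left out of both numerator and denominator. A minimal sketch on made-up toy data (the four observations are an assumption for illustration, not the data in this thread):

```stata
* Toy data, invented for illustration: one year, four firms.
clear
input year acexist
2000 0
2000 1
2000 1
2000 1
end
* The mean of a 0/1 variable is the proportion of 1s: here 3/4.
summarize acexist
* Scale the proportion to a percent.
display 100 * r(mean)
```

Multiplying by 100, or relabelling the axis as in the -yla()- call above, turns the proportion into a percent.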

(2) Cf. my solution 

I made up a dataset with exactly the frequencies 
given in your post. 

I guess the reason my code produces the wrong results
for you is that you have missing values you didn't 
tell us about and which my code assumed not to exist. 

More careful code is 

egen pc = sum(acexist), by(year)
egen total = sum(acexist < .), by(year)
replace pc = 100 * pc / total

but a more direct approach would be 

egen PC = mean(100 * acexist), by(year) 

and that doesn't need any tweaking to
ensure the right answer if missings are 
present. 

After that it is 

graph bar PC, over(year) ... 
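The wrong percents quoted below are consistent with that guess: 12.09302 is 100 * 26 / 215 and 28.50467 is 100 * 61 / 214, so the denominators appear to count roughly 214 observations per year rather than only those with nonmissing -acexist-. A minimal sketch of the difference, on made-up toy data with one missing value (all variable names other than -acexist-, -year- and -PC- are hypothetical; in Stata 8, -egen, sum()- stands in for -egen, total()-):

```stata
* Toy data, invented for illustration: one missing value in 2000.
clear
input year acexist
2000 1
2000 1
2000 0
2000 .
end
* Numerator: number of 1s (egen total() treats missing as 0).
egen num = total(acexist), by(year)
* Denominator counting every observation, missing or not.
egen total_all = total(1), by(year)
* Denominator counting only nonmissing acexist.
egen total_ok = total(acexist < .), by(year)
* 100 * 2 / 4: too low, because the missing value is counted.
gen pc_all = 100 * num / total_all
* 100 * 2 / 3: the percent among nonmissing observations.
gen pc_ok = 100 * num / total_ok
* The direct approach: mean() ignores missing values.
egen PC = mean(100 * acexist), by(year)
list, noobs
```

Here -pc_ok- and -PC- agree, while -pc_all- is pulled down by the observation with missing -acexist-.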

Katarina Sikavica

dear nick cox and others:

I have just tried to do as suggested in your e-mail below; that is, since I am using stata 8 I typed:

egen pc = sum(acexist), by (year)
egen total = sum(1), by(year)
replace pc = 100 * pc / total

I have also tried:

bysort year: gen pc = sum(acexist)
bysort year: replace pc = 100 * pc[_N] / _N

.... but unfortunately I get twice the same wrong results:

tabdisp year, cell(pc) shows:

year pc

2000 12.09302
2001 28.50467
2002 62.61682
2003 71.02804
2004 74.4186

am I doing something wrong??? help!!

Nick Cox

Katarina's data are like this: 

. tab year acexist

          |        acexist
     year |         0          1 |     Total
     2000 |        29         26 |        55 
     2001 |        24         61 |        85 
     2002 |        69        134 |       203 
     2003 |        56        152 |       208 
     2004 |        50        160 |       210 
    Total |       228        533 |       761 
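A dataset with exactly these frequencies can be rebuilt from the table, which is handy for experimenting. This is a sketch of one way to do it; the variable name -freq- is made up for the purpose, and the result has no missing values, unlike what the real data may contain.

```stata
* One line per cell of the year x acexist table above.
clear
input year acexist freq
2000 0  29
2000 1  26
2001 0  24
2001 1  61
2002 0  69
2002 1 134
2003 0  56
2003 1 152
2004 0  50
2004 1 160
end
* Turn each cell into freq identical observations: 761 in all.
expand freq
drop freq
* This should reproduce the table above.
tab year acexist
* Percent with an audit committee, by year.
egen PC = mean(100 * acexist), by(year)
tabdisp year, cell(PC)
```

The percents from this sketch should match those Katarina asked for: 47.27, 71.76, 66.01, 73.08 and 76.19.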

-graph bar- won't deliver the reduction she wants, at least not without
some preparation. The reason is a little technical. -graph bar- is based
mainly on a temporary reduction of the data using -collapse-, and
-collapse- doesn't offer that reduction.  (It is nearer the territory of
-contract-, but that is a different story.) 
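For a feel of the kind of reduction involved, here is a rough sketch of collapsing to one observation per year by hand. It uses made-up toy data and the mean, which -collapse- does offer; it is an illustration only, not a description of what -graph bar- does internally.

```stata
* Toy data, invented for illustration.
clear
input year acexist
2000 0
2000 1
2001 1
2001 1
end
* collapse replaces the data in memory with one observation per year.
collapse (mean) acexist, by(year)
* The mean of a 0/1 variable is a proportion; scale it to a percent.
gen pc = 100 * acexist
list year pc, noobs
```

Since -collapse- overwrites the data in memory, -preserve- beforehand and -restore- afterwards would be needed to get the original data back.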

There are various solutions to the problem. A first solution is to
generate your own percent variable and then plot that directly. Each
percent is, we recall, a numerator divided by a total, multiplied by
100. 

One easy way to get the total is using -egen, total()-. In Stata 8 and
earlier, the function here was -egen, sum()-, not -egen, total()-. 

. egen pc = total(acexist), by(year) 

. egen total = total(1), by(year) 

(what's 1 + 1 + 1 + ... + 1? the total number of observations)

. replace pc = 100 * pc / total 

Stata diehards would scoff at this as namby-pamby and do it with -by:-. 

. bysort year: gen pc = sum(acexist) 

. by year: replace pc = 100 * pc[_N] / _N 
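The -by:- version is terse, so here is a sketch that keeps the intermediate step visible. The toy data and the variable name -running- are made up for illustration. Like the -egen- version just given, it divides by all observations in each year, so it assumes -acexist- is never missing.

```stata
* Toy data, invented for illustration.
clear
input year acexist
2000 0
2000 1
2000 1
2001 1
end
* With generate, sum() is a running sum, restarted for each year.
bysort year: gen running = sum(acexist)
* Under by:, running[_N] is the last running sum in the year, i.e. the
* year total, and _N is the number of observations in that year.
by year: gen pc = 100 * running[_N] / _N
list, sepby(year)
```

Every observation in a year ends up carrying the same percent, which is what -graph bar (mean)- or -tabdisp- then picks up.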

Either way, we can check that we are on the right lines by 

. tabdisp year, cell(pc) 

    year |         pc
    2000 |   47.27273
    2001 |   71.76471
    2002 |   66.00985
    2003 |   73.07692
    2004 |   76.19048

Then the graph is a line away: 

. graph bar (mean) pc, over(year) ytitle(percent with audit committee) 
	yla(, ang(h)) 

or 

. twoway bar pc year, ytitle(percent with audit committee) 
	yla(, ang(h)) barw(0.5) 

Another solution employs a user-written program -catplot- from SSC. You
can install that by 

. ssc install catplot 

-catplot- is just a wrapper for -graph bar- (or -graph hbar- or -graph
dot-). It merely grinds through some reductions not quite trivial
otherwise and then fires up a -graph- command. 

You can get a graph in one line with -catplot- without any prior
calculation, although in practice I get there through a sequence of
small experiments: 

. catplot bar acexist year, percent(year) stack asyvars yla(, ang(h)) 
	yti(percent without and with audit committee) 
	legend(order(1 "without" 2 "with")) 

A graph I like more follows a reversal of coding: 

. gen acexist2 = 1 - acexist

. catplot bar acexist2 year, percent(year) stack asyvars yla(, ang(h)) 
	yti(percent with audit committee) legend(off) bar(2, bcolor(none)) 

The original announcement of -catplot- contains some related comment.


Katarina Sikavica (edited, mainly to ASCII from HTML) 

I have just started with Stata graphics and have the following problem
with -graph bar-:

I have a dataset that contains data on the existence of audit committees
-acexist-. In total there are 761 companies, 533 of them having an
audit committee, 228 not. I would like to draw a -graph bar- that shows
the increase in audit committee incidence over -year-.  Drawing a -graph
bar- on the increase in the number of audit committees works fine;
however, as the data from 2000 and 2001 are of poor quality I would like
to have percentages of audit committee incidence over the years
2000-2004 (that is: 47.27% (2000); 71.76% (2001); 66.01% (2002); 73.08%
(2003); 76.19% (2004)). Neither of the following commands leads to the
desired results:

. graph bar (sum) acexist, over (year) percent
. graph bar (sum) acexist, over (year) asyvar percentages
