# Re: st: how to make an area graph showing distribution?

From: "Martin Weiss"; Subject: Re: st: how to make an area graph showing distribution?; Date: Sun, 30 Nov 2008 14:46:43 +0100

No need to say "sorry"; the list is there, in particular, for insidious problems. Could you post an example dataset, together with a description of what you are trying to accomplish? The dotplot was a guess in the absence of an example...
```
HTH
Martin
_______________________
```
```
----- Original Message -----
From: "Gisella Young" <gisellayoung@yahoo.com>
To: <statalist@hsphsun2.harvard.edu>
Sent: Sunday, November 30, 2008 2:43 PM
Subject: Re: st: how to make an area graph showing distribution?
```
Thank you for your replies. I am sorry to come back, but I am still stuck and wonder whether I could bother people for further advice. On Maarten Buis's suggestion, I am not sure why I would need a regression; I gather from his email that it is basically for smoothing? Since I want to plot the actual data (though I realise this needs some smoothing), I would prefer to plot income in, for example, percentiles on the x-axis, showing for each percentile its composition in terms of occupation on the y-axis.

I guess one way would be a histogram or bar chart, but what I really want is a continuous area plot with percentiles of income on the x-axis and the percentage in each occupation (for each income percentile) on the y-axis.

I'm not sure whether a dot plot, as Martin Weiss suggests, will really do this. I've looked into it, but it seems quite different, unless I am misunderstanding? Also, I am struggling with even the first step: creating another variable with the proportions of each occupation for each income group (e.g. percentile). I have tried commands such as sumdist, pctile, and xtile (downloaded from SSC), but they do not divide the population into equally sized percentile groups. For instance, I have tried -sumdist income if date==2007 [fw=weight], n(100) qgp(test)-, but the groups are not of the same size.

I'm hopeful that there is a simple way to do this, in part because in Excel it can be done in a few minutes (but Excel of course can't handle large survey data of the kind I am dealing with). Sorry to bother the list with these follow-up enquiries.
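
One possible route to the percentile-based area chart described above, offered as a rough sketch rather than a definitive answer. It assumes the official -xtile- command with frequency weights is acceptable for forming the income groups; because tied incomes always land in the same group, the groups will generally be only approximately equal in size. The nlsw88 data, the variable names `wt`, `wgrp`, `ind_gr`, `n`, `tot` and `cum`, the choice of 20 groups, and the labels for categories 2 and 6 are all assumptions made for illustration.

```stata
// Sketch only: nlsw88 stands in for the survey data, wage for income,
// collapsed industry for occupation, and wt (all 1s) for the weight.
sysuse nlsw88, clear
gen byte wt = 1
recode industry (1/5=1) (6=2) (7=3) (8/10=4) (11=5) (12=6), gen(ind_gr)

// 20 weighted quantile groups of wage; tied wages share a group,
// so group sizes are only approximately equal
xtile wgrp = wage [fw=wt], nq(20)
drop if missing(ind_gr, wgrp)

// weighted count in every group-by-category cell, empty cells set to 0
collapse (sum) n = wt, by(wgrp ind_gr)
fillin wgrp ind_gr
replace n = 0 if _fillin

// cumulative shares of the categories within each quantile group
egen double tot = total(n), by(wgrp)
bysort wgrp (ind_gr): gen double cum = sum(n) / tot

// one row per quantile group, one column per cumulative share
keep wgrp ind_gr cum
reshape wide cum, i(wgrp) j(ind_gr)
gen zero = 0

// stacked bands: each one runs from the previous cumulative share to the next
sort wgrp
twoway rarea zero cum1 wgrp || ///
       rarea cum1 cum2 wgrp || ///
       rarea cum2 cum3 wgrp || ///
       rarea cum3 cum4 wgrp || ///
       rarea cum4 cum5 wgrp || ///
       rarea cum5 cum6 wgrp,   ///
       legend(order( 6 "public administration"  ///
                     5 "professional services"  ///
                     4 "other services"         ///
                     3 "finance"                ///
                     2 "wholesale/retail trade" ///
                     1 "manual" ))              ///
       xtitle("wage group (20 quantiles)") ytitle("cumulative share")
```

With the real survey data one would presumably replace `wage` by `income`, `wt` by `weight`, add the `if date==2007` restriction, and use `nq(100)` for percentiles; with many groups and few cases per cell the bands will look jagged, which is where smoothing comes in.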
```
best,
Gisella

--- On Sun, 11/30/08, Maarten buis <maartenbuis@yahoo.co.uk> wrote:

```
```
From: Maarten buis <maartenbuis@yahoo.co.uk>
Subject: Re: st: how to make an area graph showing distribution?
To: statalist@hsphsun2.harvard.edu
Date: Sunday, November 30, 2008, 10:52 AM
--- Gisella Young wrote:
> I am trying to make a chart showing the distribution of income by
> occupation. On the x-axis I would like the distribution of income
> from 0 to the highest. Then on the y-axis I want to show the
> proportion of people in different occupations. I have a variable
> (occup) with 6 different occupational categories. In other words, I
> want to show how the different occupations fit into income
> distribution, by showing how the occupational breakdown of income
> changes moving up the income spectrum. I thought an area chart
> (summing to 100) would be the best way to do this, although there
> might be better ways which I would be open to suggestions. I have
> tried the twoway area function with different variations, but it
> doesn't seem to be right (just gives a crazy chart with lines all
> over) and I'm not sure how to do it.

You'll probably need to smooth the proportions, as you won't have
enough cases within each occupational category at each wage in your
data to estimate the proportions reliably. In the example below I have
done so by estimating an -mlogit- predicting occupational category
from wage represented as a restricted cubic spline (see -help
mkspline-). I treat the predicted probabilities as the smoothed
proportions.

For the graph I created the variables zero, one, and l1 to l5. The
logic is that on the y-axis the first band should range from 0 (zero)
to the first proportion (l1), the second band should start at the
first proportion and end at the first + the second proportion, etc.
Two things are worth noting: 1) you need to sort first on wage (or use
the sort option) to avoid creating modern art, and 2) I reversed the
order in the legend (going from 6 to 1) so that the order in which
they appear in the legend corresponds with the order in which they
appear in the graph (1 at the bottom and 6 at the top).

*--------------- begin example -----------------------
// prepare the example data
sysuse nlsw88, clear
gen ind_gr = industry
recode ind_gr 1/5=1 6=2 7=3 8/10=4 11=5 12=6
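// NOTE: the labels for 2 and 6 are missing from the archived post; the
// ones below follow the nlsw88 industry labels and are an assumption
label define ind_gr 1 "manual"                 ///
                    2 "wholesale/retail trade" ///
                    3 "finance"                ///
                    4 "other services"         ///
                    5 "professional services"  ///
                    6 "public administration"
label value ind_gr ind_gr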

// smooth the proportions
mkspline s_w=wage, cubic nknots(5)
mlogit ind_gr s_w*
predict pr*

// create the graph
gen zero = 0
gen one = 1
gen l1 = pr1
gen l2 = pr1 + pr2
gen l3 = pr1 + pr2 + pr3
gen l4 = pr1 + pr2 + pr3 + pr4
gen l5 = pr1 + pr2 + pr3 + pr4 + pr5

sort wage
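// NOTE: the archived post lost the legend(order( opener and the entries
// for 6 and 2; they are restored here as an assumption
twoway rarea zero l1 wage || ///
       rarea l1 l2 wage   || ///
       rarea l2 l3 wage   || ///
       rarea l3 l4 wage   || ///
       rarea l4 l5 wage   || ///
       rarea l5 one wage,    ///
       legend(order( 6 "public administration"  ///
                     5 "professional services"  ///
                     4 "other services"         ///
                     3 "finance"                ///
                     2 "wholesale/retail trade" ///
                     1 "manual" ))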
*---------------------- end example -----------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
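
The stacking logic in Maarten's example (each band runs from the previous cumulative proportion up to the next) can also be written as a loop, which may be convenient when the number of categories changes. This is a minimal sketch on the same nlsw88 setup, not part of the original post; `l0`, the local macro `bands`, and the use of `gen double` are assumptions made for illustration, and the legend keeps Stata's default key text with only the order reversed.

```stata
// Sketch only: same data preparation and smoothing as in the example above
sysuse nlsw88, clear
recode industry (1/5=1) (6=2) (7=3) (8/10=4) (11=5) (12=6), gen(ind_gr)
mkspline s_w = wage, cubic nknots(5)
mlogit ind_gr s_w*
predict pr*

// l0 is the floor; each l`k' adds the k-th smoothed proportion to l`j'
gen double l0 = 0
local bands
forvalues k = 1/6 {
    local j = `k' - 1
    gen double l`k' = l`j' + pr`k'
    // collect one rarea band per category, sorted on wage
    local bands `bands' (rarea l`j' l`k' wage, sort)
}

// draw all bands; legend order reversed so it matches the stacking
twoway `bands', legend(order(6 5 4 3 2 1)) ytitle("cumulative proportion")
```

Here the top band ends at l6, which equals one up to rounding, so no separate `one` variable is needed.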