# Re: st: how to make an area graph showing distribution?

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to make an area graph showing distribution?
Date: Sun, 30 Nov 2008 15:37:09 +0000 (GMT)

```
--- Gisella Young <gisellayoung@yahoo.com> wrote:
> On Maarten Buis's suggestion, I am not sure why I would really need
> a regression - I get from his email that this is basically for
> smoothing?

Yes, as income in the example dataset (and I assume in your dataset as
well) is a continuous variable, there just aren't enough cases for each
income value to estimate the proportions.

> Since I actually want to plot the actual data (but realise
> that this needs smoothing),

You have to choose one or the other, and if you choose to smooth then
my use of -mlogit- is probably the easiest method that will ensure that
the smoothed proportions will add up to one.

> what I would prefer to do would be to have income plotted in for
> example percentiles (on the x-axis) showing for each percentile the
> composition of that income percentile in terms of occupation on the
> y-axis.

If I understand you correctly, all that is different from my example is
that you want to do that on a transformed metric of income. The way to
do the percentile rank transformation is discussed here:
http://www.stata.com/support/faqs/stat/pcrank.html

This has been implemented in the example below (and just because I felt
like it, I replaced the legend with a second y-axis)

*--------------- begin example -----------------------
// prepare the example data
sysuse nlsw88, clear
gen ind_gr = industry
recode ind_gr 1/5=1 6=2 7=3 8/10=4 11=5 12=6
label define ind_gr 1 "manual"                ///
                    2 "wholesale/retail"      ///
                    3 "finance"               ///
                    4 "other services"        ///
                    5 "professional services" ///
                    6 "public administration"
label value ind_gr ind_gr

// compute percentile ranks
egen n = count(wage)
egen i = rank(wage)
gen hazen = (i - 0.5) / n * 100
label variable hazen "percentile rank of income"
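
// (added note, not part of the original example) the formula above
// places rank i at the midpoint of its 1/n slice: with n = 4 untied
// values, ranks 1, 2, 3, 4 map to 12.5, 37.5, 62.5 and 87.5. Tied wages
// share their average rank, so hazen always lies strictly inside
// (0, 100). A quick sanity check of that property:
assert hazen > 0 & hazen < 100 if !missing(hazen)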

// smooth the proportions
mkspline s_w=hazen, cubic nknots(5)
mlogit ind_gr s_w*
predict pr*
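
// (added sketch, not part of the original example) check that the
// smoothed proportions add up to one for every observation; check_sum
// is a hypothetical helper variable introduced only for this check,
// and the tolerance allows for -predict- storing floats
egen double check_sum = rowtotal(pr1 pr2 pr3 pr4 pr5 pr6)
assert abs(check_sum - 1) < 1e-5 if !missing(pr1)
drop check_sum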

// create the graph
gen zero = 0
gen one = 100
gen l1 = (pr1)*100
gen l2 = (pr1 + pr2)*100
gen l3 = (pr1 + pr2 + pr3)*100
gen l4 = (pr1 + pr2 + pr3 + pr4)*100
gen l5 = (pr1 + pr2 + pr3 + pr4 + pr5)*100

sort hazen

// collect the labels for the second y-axis
local mid = l1[_N]/2
local yaxis `"`mid' "manual""'

local mid = (l2[_N]-l1[_N])/2 + l1[_N]
local yaxis `"`yaxis' `mid' "wholesale/retail""'

local mid = (l3[_N]-l2[_N])/2 + l2[_N]
local yaxis `"`yaxis' `mid' "finance""'

local mid = (l4[_N]-l3[_N])/2 + l3[_N]
local yaxis `"`yaxis' `mid' "other services""'

local mid = (l5[_N]-l4[_N])/2 + l4[_N]
local yaxis `"`yaxis' `mid' "professional services""'

local mid = (100-l5[_N])/2 + l5[_N]
local yaxis `"`yaxis' `mid' "public administration""'

twoway rarea zero l1 hazen, yaxis(1) || ///
rarea l1 l2 hazen, yaxis(2)   || ///
rarea l2 l3 hazen   ||           ///
rarea l3 l4 hazen   ||           ///
rarea l4 l5 hazen   ||           ///
rarea l5 one hazen,              ///
ytitle("percentage")             ///
ylab(`yaxis', axis(2))           ///
yscale(range(0 100) axis(1))     ///
yscale(range(0 100) axis(2))     ///
ytitle("", axis(2))              ///
plotregion(margin(zero))         ///
aspect(1)                        ///
legend(off)
*---------------------- end example -----------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------
```