# st: RE: Saving percentage distribution

 From "Nick Cox" To Subject st: RE: Saving percentage distribution Date Tue, 3 Dec 2002 18:34:53 -0000

```
Zun
>
> I have two vars ind (52 categories) and occ (7 categories), and I want
> the percentage distribution of ind for each category of occ. Note that
> not every ind category has cases. For instance:
>
> Occ=1
> ind     pct
> 1       .0309522
> 2       .0334331
> 3       0
> 4       .0356777
> 5       .3402772
> 6       .0294558
> .       .
> .       .
> 52      .3151532
>
> Occ=2
> ind     pct
> 1       .0036623
> 2       .0006301
> 3       0
> 4       .0064976
> 5       0
> 6       .0455619
> .       .
> .       .
> 52      .0953769
>
> As shown above, ind=3 appears in neither occ=1 nor occ=2, while ind=5
> appears in occ=1 but not in occ=2.
>
> My questions are:
>
> First, if I use tabulate to get the percentage distribution of any
> categorical variable, how can I save the percentages in a new dataset
> that looks like one of the tables above?
>
> Second, in the specific example above, is there a way I can create a
> new dataset that looks like this:
>
> ind     pctocc1         pctocc2
> 1       .0309522        .0036623
> 2       .0334331        .0006301
> 3       0               0
> 4       .0356777        .0064976
> 5       .3402772        0
> 6       .0294558        .0455619
> .       .               .
> .       .               .
> 52      .3151532        .0953769
>

I guess that you have at most 52 * 7 observations.
Forget -tabulate-: a direct calculation is better.

Typing

. findit percent

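points to lots of things, but the pertinent one is -egen-. Note that
-pc()- rescales whatever it is given, so it needs a frequency
variable, not ind itself. With one observation per occ-ind pair
and frequencies in a variable (say, freq),

. bysort occ : egen pctocc = pc(freq), prop

followed by a -reshape- may help. You may need
to -replace- any missings by 0.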

Nick
n.j.cox@durham.ac.uk
```
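
The direct calculation suggested in the reply might be sketched as follows. This is only one possible route, not necessarily what was intended: it assumes the data hold one observation per case with numeric variables ind and occ, and the names n, pctocc and the file name pctdist are made up for illustration. It uses -contract- to get frequencies, -egen, pc()- with the -prop- option to turn them into proportions within each occ, and -reshape wide- to put the occ categories side by side.

```stata
* Sketch only: assumes one observation per case, numeric ind and occ.
* n, pctocc and pctdist are hypothetical names chosen for this example.

* One row per occ-ind pair with its frequency; -zero- keeps empty pairs.
contract occ ind, freq(n) zero

* Proportion of each ind category within each occ category.
bysort occ : egen pctocc = pc(n), prop

* One row per ind, with columns pctocc1, pctocc2, ... (one per occ value).
drop n
reshape wide pctocc, i(ind) j(occ)

* Keep the result as a new dataset.
save pctdist, replace
```

Because -contract, zero- fills in the empty occ-ind pairs before the proportions are computed, no missings should need replacing after the -reshape-. An ind category that has no cases in any occ category never appears in the data, so it will not show up as a row unless it is added by hand.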