Re: st: Increase observations by group and then calculate percentages


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Increase observations by group and then calculate percentages
Date   Thu, 3 Nov 2011 08:01:39 +0000

I've got to say that I think this is a bad idea. It's a spreadsheet
idea imported into a statistical software context. Group
characteristics are best stored as group characteristics. If you do
this, then you have to remember to exclude the observations with
summaries from every analysis thereafter. It seems unlikely that you
would really prefer to do that.

-egen- in conjunction with -by:- or -by()- is a handy tool to generate
group summaries. There is also a -tag()- function for a common need to
look at each group summary just once.
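
As a minimal sketch of that approach (not a definitive solution), the code below stores the percentage as a group-level variable rather than as an extra observation. It uses the figures from the desired output in the question quoted below; the names gx_*, tot_*, pctX_* and first are made up for illustration.

```stata
* Toy data based on the example in the question (two areas, two variables)
clear
input AreaID str8 Ethnicity Sex1Age1 Sex2Age1
1 "GroupX"  18  16
1 "Total"   57  21
2 "GroupX"  33  81
2 "Total"  528 147
end

* Spread each area's GroupX count and Total count to every observation
* in that area, then compute the percentage from the two
foreach v of varlist Sex1Age1 Sex2Age1 {
    egen double gx_`v'  = total(cond(Ethnicity == "GroupX", `v', .)), by(AreaID)
    egen double tot_`v' = total(cond(Ethnicity == "Total",  `v', .)), by(AreaID)
    generate double pctX_`v' = 100 * gx_`v' / tot_`v'
}

* tag() marks exactly one observation per area, so each summary is listed once
egen byte first = tag(AreaID)
list AreaID pctX_* if first, noobs
```

With the real data the -foreach- list would need to name all 16 Sex/Age variables. Because the percentages live in their own variables, no observations have to be excluded from later analyses.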

The main trick to do what you want is to use -expand-, as a thread
started a few hours ago by Fernando Luco shows.
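
If the extra observations are wanted regardless, one possible sketch with -expand- follows. It again uses the figures from the desired output in the question quoted below, and assumes each area has exactly one GroupX and one Total observation; the helper variables dup and row are hypothetical names.

```stata
* Toy data based on the example in the question (two areas, two variables)
clear
input AreaID str8 Ethnicity Sex1Age1 Sex2Age1
1 "GroupX"  18  16
1 "Total"   57  21
2 "GroupX"  33  81
2 "Total"  528 147
end

* Duplicate each Total observation once; dup is 1 for the added copy, 0 otherwise
expand 2 if Ethnicity == "Total", generate(dup)
replace Ethnicity = "PercentX" if dup

* Fix the order within each area: GroupX, then Total, then PercentX
generate byte row = cond(Ethnicity == "GroupX", 1, cond(Ethnicity == "Total", 2, 3))
sort AreaID row

* Under by:, subscripts are relative to the group, so [1] is the GroupX
* observation and [2] is the Total observation of the current area
foreach v of varlist Sex1Age1 Sex2Age1 {
    by AreaID: replace `v' = 100 * `v'[1] / `v'[2] if dup
}

drop dup row
list, sepby(AreaID) noobs
```

The PercentX observations produced this way should agree, up to rounding, with the values in the desired output. The caveat above still applies: they must be excluded from every later analysis.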

Nick

On Thu, Nov 3, 2011 at 2:14 AM, Catherine Tisch
<catherine.tisch@canterbury.ac.nz> wrote:
> Hi all
>
> I'm hoping someone can help me with my problem - I'd like to insert an
> observation after every group and then create percentages by group in my
> panel dataset.
>
> My dataset looks something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 2       GroupX     33        81
> 2       Total      528       147
> and so on....
>
> Ethnicity is stored as a string, all other variables are stored as
> float.  I have approximately 2000 AreaIDs and 16 Sex/Age groups.  I am
> using Stata 11.1.
>
> I'm hoping my final dataset might look something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 1       PercentX   31.6      76.2
> 2       GroupX     33        81
> 2       Total      528       147
> 2       PercentX   6.3       55.1
> and so on.....
>
> Where PercentX is GroupX/Total for each AreaID by the Sex/Age variables.
>
> I think the first stage I need to do is increase the number of
> observations in my dataset by using -set obs-.  I tried
>
> set obs `=_N+1', by AreaID
>
> but got the error message 'options not allowed'.  How can I get around
> this?
>
> Once I get that sorted, the second stage is calculating percentages. I
> think I need to use the -by- function, and any suggestions would be most
> welcome.
>
