

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Increase observations by group and then calculate percentages
Date: Thu, 3 Nov 2011 08:01:39 +0000

I've got to say that I think this is a bad idea. It's a spreadsheet idea imported into a statistical software context. Group characteristics are best stored as group characteristics. If you do this, then you have to remember to exclude the observations with summaries from every analysis thereafter. It seems unlikely that you would really prefer to do that.

-egen- in conjunction with -by:- or -by()- is a handy tool to generate group summaries. There is also a -tag()- function for the common need to look at each group summary just once.

The main trick to do what you want is to use -expand-, as a thread started a few hours ago by Fernando Luco shows.

Nick

On Thu, Nov 3, 2011 at 2:14 AM, Catherine Tisch <catherine.tisch@canterbury.ac.nz> wrote:
> Hi all
>
> I'm hoping someone can help me with my problem - I'd like to insert an
> observation after every group and then create percentages by group in my
> panel dataset.
>
> My dataset looks something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 2       GroupX     33        81
> 2       Total      528       147
> and so on....
>
> Ethnicity is stored as a string, all other variables are stored as
> float. I have approximately 2000 AreaIDs and 16 Sex/Age groups. I am
> using Stata 11.1.
>
> I'm hoping my final dataset might look something like this:
>
> AreaID  Ethnicity  Sex1Age1  Sex2Age1  Sex1Age2  Sex2Age2  and so on....
> 1       GroupX     18        16
> 1       Total      57        21
> 1       PercentX   31.6      76.2
> 2       GroupX     33        81
> 2       Total      528       147
> 2       PercentX   6.3       55.1
> and so on.....
>
> Where PercentX is GroupX/Total for each AreaID by the Sex/Age variables.
>
> I think the first stage I need to do is increase the number of
> observations in my dataset by using -set obs-. I tried
>
> set obs `=_N+1', by AreaID
>
> but got the error message 'options not allowed'. How can I get around
> this?
>
> Once I get that sorted, the second stage is calculating percentages and
> I think I need to use -by- function and any suggestions would be most
> welcome.
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
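
The two routes mentioned in the reply can be sketched roughly as follows. This is only an illustration, not code from the thread: the four-row dataset is typed in from the question, only two of the 16 Sex/Age variables are used, and the names grp_*, tot_*, pct_*, first, isnew and rowtype are made up for the example.

```stata
* Toy data copied from the question (two Sex/Age variables only)
clear
input AreaID str8 Ethnicity Sex1Age1 Sex2Age1
1 "GroupX"  18  16
1 "Total"   57  21
2 "GroupX"  33  81
2 "Total"  528 147
end

* Route 1: store the percentage as a variable that is constant within area.
* egen's total() accepts an expression, so cond() picks out the one row
* wanted in each area and by() spreads the result to every row of that area.
foreach v of varlist Sex1Age1 Sex2Age1 {
    egen double grp_`v' = total(cond(Ethnicity == "GroupX", `v', .)), by(AreaID)
    egen double tot_`v' = total(cond(Ethnicity == "Total",  `v', .)), by(AreaID)
    gen double pct_`v' = 100 * grp_`v' / tot_`v'
}

* tag() marks exactly one observation per area, so each summary shows once
egen first = tag(AreaID)
list AreaID pct_* if first, noobs

* Route 2: the layout asked for, with an extra PercentX row per area.
* -expand- appends its copies at the end of the dataset, so anything
* beyond the original _N is a new observation.
local N = _N
expand 2 if Ethnicity == "Total"
gen byte isnew = _n > `N'
replace Ethnicity = "PercentX" if isnew
foreach v of varlist Sex1Age1 Sex2Age1 {
    replace `v' = pct_`v' if isnew
}

* Explicit row order, since sorting on Ethnicity alone would be alphabetical
gen byte rowtype = cond(Ethnicity == "GroupX", 1, cond(Ethnicity == "Total", 2, 3))
sort AreaID rowtype
list AreaID Ethnicity Sex1Age1 Sex2Age1, sepby(AreaID) noobs
```

Route 1 leaves the number of observations unchanged, so later analyses need no exclusions. After Route 2, every subsequent command on the counts would need something like -if Ethnicity != "PercentX"- (and probably -if Ethnicity != "Total"- as well), which is the maintenance cost the reply warns about.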
