
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: Number cases into groups based on a shared value
Date: Mon, 14 Mar 2005 18:39:01 -0000

-egen, group()- is a wrapper around this main idea:

    bysort SomeNum : gen GroupNum = _n == 1
    replace GroupNum = sum(GroupNum)

I have forgotten all the SPSS syntax I ever knew, which was
very little and a long time ago, so I can't translate the
other way. And -by:- is pretty Stataish. It may not be
very translatable.

In more words,

0. -sort-ing on SomeNum is needed. (-egen- does that
quietly, if needed, and then undoes it. With DIY,
you must DIY.) You see that.

1. Once you have

    SomeNum
    10
    10
    ...
    11
    11
    ...
    ...
    16
    16
    ...

then you just tag the first in each block
with a 1 and assign 0 to the others:

    SomeNum   GroupNum
    10        1
    10        0
    ...
    11        1
    11        0
    ...
    ...
    16        1
    16        0
    ...

2. Finally, what you want is the cumulative sum,
given by -sum()-.

Another way to do it is

    sort SomeNum
    gen GroupNum = _n == 1
    replace GroupNum = cond(SomeNum != SomeNum[_n-1], GroupNum[_n-1] + 1, GroupNum[_n-1]) in 2/l

which is closer in spirit to the code you have,
but not the approved way to do this.

Nick
[email protected]

Mike Lacy

> I'm wanting to learn about a "do it yourself" way to do what is
> accomplished by the -group- function in the -egen- command in
> the following:
>
> set obs 100
> gen SomeNum = 10 + int(7 * uniform())
> * Attach a sequential group number to all the
> * cases with the same value for "SomeNum"
> egen GroupNum = group(SomeNum)
>
> This works fine at accomplishing the task. My interest in
> the DIY approach is that the kind of algorithm I am accustomed
> to using for this task does not fit with the inner nature <grin>
> of Stata. I'm accustomed (in SPSS or lower level languages) to
> something like:
>
> sort SomeNum
> gen MyGroup = 1 if _n ==1
> gen Same = (Somenum = Somenum[_n-1])
> gen MyGroup = MyGroup[_n-1] if Same
> gen MyGroup = 1+ MyGroup[_n-1] if ! Same
>
> This doesn't fit with how Stata does -if-, as near as I
> understand. So, what would the Stata DIY approach to this
> kind of algorithm be?
> All I could come up with was to put SomeNum into a matrix so that I
> could loop through it, but that hardly seems like a desirable
> way to do things.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
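
Editor's sketch, not part of the original post: the main idea in the reply (flag the first observation in each block of SomeNum, then take the running sum of the flags) can be checked against -egen, group()- on a small dataset. The deterministic data below (built with mod() instead of the random draw in the question) and the variable name Check are assumptions made only so that the result is reproducible.

```stata
* Illustrative data: 100 observations with SomeNum cycling through 10..16.
* (Assumption: mod() replaces the random draw so results are reproducible.)
clear
set obs 100
gen SomeNum = 10 + mod(_n, 7)

* Steps 0 and 1: sort on SomeNum and flag the first observation in each block.
bysort SomeNum : gen GroupNum = _n == 1

* Step 2: the running sum of the 0/1 flags numbers the blocks 1, 2, ...
replace GroupNum = sum(GroupNum)

* Check (hypothetical variable name): -egen, group()- should agree.
egen Check = group(SomeNum)
assert GroupNum == Check
```

If the final -assert- passes silently, the two numberings agree on every observation. Note that the DIY version leaves the data sorted by SomeNum, whereas -egen- restores the original order.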

