st: RE: Number cases into groups based on a shared value

From: "Nick Cox"
Subject: st: RE: Number cases into groups based on a shared value
Date: Mon, 14 Mar 2005 18:39:01 -0000

```
-egen, group()- is a wrapper around this
main idea:

bysort SomeNum : gen GroupNum = _n == 1
replace GroupNum = sum(GroupNum)

I have forgotten all the SPSS syntax
I ever knew, which was very little and a
long time ago, so I can't translate the
other way. And -by:- is pretty Stataish.
It may not be very translatable.

In more words,

0. -sort-ing on SomeNum is needed. (-egen-
does that quietly, if needed, and then undoes
it. With DIY, you must DIY.) You see that.

1. Once you have

10
10
...
11
11
...

...
16
16
...

then you just tag the first observation in
each block with a 1 and assign
0 to the others:

10                1
10                0
...
11                1
11                0
...

...
16                1
16                0
...

2. Finally, what you want is the
cumulative sum, given by -sum()-.

Another way to do it is

gen GroupNum = _n == 1
replace GroupNum = GroupNum[_n-1] + (SomeNum != SomeNum[_n-1]) in 2/l

which is closer in spirit to the code you have, but not the
approved way to do this.

Nick
n.j.cox@durham.ac.uk

Mike Lacy wrote:

> I'm wanting to learn about a "do it yourself" way to do what is
> accomplished by the -group- function in the -egen- command in
> the following:
>
> set obs 100
> gen SomeNum = 10 + int(7 * uniform())
> * Attach a sequential group number to all the
> * cases with the same value for "SomeNum"
> egen GroupNum = group(SomeNum)
>
> This works fine at accomplishing the task. My interest in the DIY
> approach is that the kind of algorithm I am accustomed to using for
> this does not fit with the inner nature <grin> of Stata. I'm
> accustomed (in SPSS or lower-level languages) to something like:
>
> gen MyGroup = 1 if _n ==1
> gen MyGroup = MyGroup[_n-1] if Same
> gen MyGroup = 1+ MyGroup[_n-1] if ! Same
>
> This doesn't fit with how Stata does -if-, as near as I
> understand. So, what would the Stata DIY approach to this
> kind of algorithm be? All I could come up with was to put
> SomeNum into a matrix so that I could loop through it, but
> that hardly seems like a desirable way to do things.

```
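
The two DIY routes discussed in the reply can be checked against -egen, group()- on simulated data. What follows is a minimal sketch, not code from the thread: the variable names viaEgen and viaLag and the seed are made up for illustration, and runiform() is assumed as the modern replacement for the older uniform() used in the question. The recursive version relies on -replace- working through observations in order, so that each observation sees the already updated value in the one before it.

```stata
* Sketch only: viaEgen, viaLag and the seed are illustrative assumptions
clear
set seed 12345
set obs 100
gen SomeNum = 10 + int(7 * runiform())

* Reference result from the built-in function
egen viaEgen = group(SomeNum)

* by: approach: flag the first observation in each block, then cumulate
bysort SomeNum : gen GroupNum = _n == 1
replace GroupNum = sum(GroupNum)

* Explicit recursion on data sorted by SomeNum:
* carry the previous group number forward, adding 1 when SomeNum changes
sort SomeNum
gen viaLag = 1 in 1
replace viaLag = viaLag[_n-1] + (SomeNum != SomeNum[_n-1]) in 2/l

* Both DIY versions should agree with -egen, group()-
assert GroupNum == viaEgen
assert viaLag == viaEgen
```

If either -assert- fails, Stata stops with an error, so a silent run means the three variables match observation by observation. The by: version is the one to prefer in practice, since it does not depend on the data having been sorted beforehand.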