# st: RE: is ordering with -bysort- unique?

From: "Nick Cox" | Subject: st: RE: is ordering with -bysort- unique? | Date: Tue, 10 Jun 2003 22:54:10 +0100

```
Radu Ban wrote:

> i'm cleaning a dataset and i encounter repeated ids. i want to keep them
> unique, but the problem is that for some repeated ids the variables differ.
> i want to keep just one of the repeated ids. so i'm using:
>
> bysort id: keep if _n == 1
>
> now i would like to know if this will keep the same id whenever the program
> is run. or does the ordering change?

The same -id-s will remain in the dataset. The real
issue, as you know, is what happens to values of other variables.

Suppose Stata -sort-s on -id-:

1
1
2
2
3
3
3

Suppose it did it a different way:

1
1
2
2
3
3
3

In terms of -id- alone, the answer is the same. Stata, and you, are both
indifferent to which of these solutions (of the 2! 2! 3! = 24 possibilities)
is preferred.

Can you tell the difference? The answer is clearly no. When you

bysort id : keep if _n == 1

the answer is, again, the same as far as you are concerned,
in terms of -id-,

id
1
2
3

Now suppose you have other variables:

1    pat
1    jean marie
2    lisa
2    teresa
3    eva marie
3    monica
3    marsha

As you know, the result after -bysort id: keep if _n == 1- could be

1    pat
2    lisa
3    eva marie

or it could be

1    jean marie
2    teresa
3    monica

and indeed any one of the 2 x 2 x 3 = 12 possible results, depending on
which of the 24 orderings Stata happens to produce.

Will the answer be the same? In general, I doubt it.
At least some of the time Stata appears to randomize
the order a little before -sort-ing, although I can't
remember why I think I know that; anyway, I doubt that the answer
is reproducible. I wouldn't depend on it.

Nick
n.j.cox@durham.ac.uk

```
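One way to stop depending on how ties are ordered is to break the ties explicitly. The sketch below is not from the original post: it rebuilds the toy data above with a hypothetical second variable called -name-, and it assumes you are content to let -name- decide which duplicate survives. In real data you would put whatever variable(s) express your preference inside the parentheses.

```stata
* Illustrative sketch only; -name- is a made-up variable for this example
clear
input id str12 name
1 "pat"
1 "jean marie"
2 "lisa"
2 "teresa"
3 "eva marie"
3 "monica"
3 "marsha"
end

* See how many ids are repeated before dropping anything
duplicates report id

* Sort on id, then on name within id, so that _n == 1 is well defined
bysort id (name): keep if _n == 1

* -isid- exits with an error if id does not uniquely identify observations
isid id
list
```

With -name- in parentheses, the observation kept for each -id- is the one whose -name- sorts first, so the result no longer depends on how -sort- orders ties on -id-. If -name- can itself be tied within -id-, the remaining variables may still differ from run to run; adding more variables inside the parentheses, or using -sort id, stable- to preserve the existing order among ties, would be the corresponding fix.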