
st: Sorting


From   Doug Owens <[email protected]>
To   [email protected]
Subject   st: Sorting
Date   Mon, 14 Mar 2005 15:38:35 -0500

How does Stata's sort algorithm work?
I was looking at the stability of the sort order. See the following 2
examples. But first note that I am not using the stable option in the
sort command.

Example 1 (changing sort order):
clear
set obs 100
g x = _n
g herbal = _n>25 & _n<=75
sort herbal
l x in 1/10, clean noo

Example 2 (stable sort order):
clear
set obs 100
g x = _n
g herbal = _n<51
sort herbal
assert x == _N+1-_n

Example 1 is what I've come to expect from experience. Each time it is
run the data is sorted differently.
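
A rough way to check this, sketched under the assumption that re-sorting on the unique variable x restores the original order: record each observation's position after one sort, redo the sort, and count how many positions changed. The variable names pos1 and pos2 are introduced here only for illustration.

```stata
clear
set obs 100
generate x = _n
generate herbal = _n > 25 & _n <= 75

* first sort: record where each observation lands
sort herbal
generate pos1 = _n

* x is unique, so this restores the original order exactly
sort x

* second sort on the same non-unique key
sort herbal
generate pos2 = _n

* number of observations that landed in a different position
count if pos1 != pos2
```

A nonzero count means the two sorts broke ties differently. A count of zero on one run does not show that the order is always reproducible.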

The stability of the sort order in example 2 was surprising.  Quoting from
the Stata manual entry on sort, "Without the stable option, the
ordering of observations with equal values of varlist is randomized."
I thought that if the data were such that "herbal>=herbal[_n-1] if
_n>1", then a "sort herbal" command would not need to change the order
of the data, and thus the resulting sort order would not vary with
multiple executions of the code.  But that is not the case here.  What
other conditions can lead to a (non-unique) sort command producing the
same dataset each time?
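
For comparison, the documented way to get a reproducible order is the stable option, which keeps tied observations in their previous relative order. Below is a minimal sketch on the example 2 data. The expected order in the assert is worked out by hand for this particular dataset: the 50 observations with herbal==0 (x = 51..100) come first, followed by x = 1..50.

```stata
clear
set obs 100
generate x = _n
generate herbal = _n < 51

* stable: ties keep their pre-sort relative order
sort herbal, stable

* herbal==0 block (x = 51..100) first, then herbal==1 block (x = 1..50)
assert x == cond(_n <= 50, _n + 50, _n - 50)
```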