The Stata listserver

Re: st: RE: Sorting huge Datasets

From   Stas Kolenikov
Subject   Re: st: RE: Sorting huge Datasets
Date   Thu, 4 Nov 2004 15:55:56 -0500

I also remember somebody posting comparisons of the time it takes
to sort a large file under different scenarios, and the bottom line was
that if the file is already sorted, -sort- is very quick. So if you need
to verify whether the data are sorted, and re-sort them if they are not,
you can just run -sort- on them without major sacrifice in computer
time.
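
If you do want an explicit check, here is a minimal sketch. The dataset
and the key variables (foreign and mpg from the auto data shipped with
Stata) are only illustrative assumptions. It reads the current sort order
with the extended macro function -sortedby- and calls -sort- only when the
keys are not a leading part of that order. Given that -sort- is already
quick on sorted data, this is mostly useful if you want to log or report
the order; run it from a do-file.

```stata
* Minimal sketch: sort only if the data are not already sorted by the keys.
* The dataset and key variables are illustrative assumptions.
sysuse auto, clear

local keys foreign mpg          // hypothetical sort keys
local current : sortedby        // variables the data are currently sorted by

* The data count as sorted if the key list is a leading part of the
* current sort list. The trailing spaces keep a name such as mpg from
* matching a longer name such as mpg2.
if strpos("`current' ", "`keys' ") != 1 {
    display "not sorted by the keys; sorting"
    sort `keys'
}
else {
    display "already sorted by the keys"
}
```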

If you suspect any particular command of changing the sort order, you
can check whether it has -sortpreserve- in the -program- statement at
the top of the ado-file. It might provide slightly faster sorting
through StataCorp's internal code, rather than through the -sort-
command that you would have to issue otherwise.
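
To show what to look for, here is a sketch of a hypothetical program
(the name showlast and its body are made up for illustration, not taken
from any official ado-file). The -sortpreserve- option on the -program-
line asks Stata to restore the caller's sort order when the program
exits, even though the program sorts the data internally.

```stata
* Hypothetical program, for illustration only.
* sortpreserve restores the original sort order when the program ends.
program define showlast, sortpreserve
    version 8
    syntax varname
    sort `varlist'              // changes the sort order inside the program
    display "last value of `varlist' after sorting: " `varlist'[_N]
end

* Example use: the data are sorted by foreign before and after the call.
sysuse auto, clear
showlast mpg
describe, short                 // the "Sorted by" line should still show foreign
```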

Other than that, you cannot do much with the sorting, and it can be
quite time consuming indeed :((.


On Thu, 4 Nov 2004 14:54:25 -0000, Nick Cox wrote:
> help sort
> -sort- doesn't sort if the observations
> are already sorted according to the order
> specified.
> help missing
> Nick
> Barbara Hofmann
> > are there any commands which change the actual sort sequence
> > of a dataset?
> >
> > We have the problem of having some big datasets that must
> > be sorted every time by the same variables before executing
> > commands on them. It takes up a lot of time to sort them, so
> > we have two questions.
> > First, are there certain commands that change the sort
> > sequence and others that keep it? And second, is there any
> > command (e.g. assert) to check the sort sequence and
> > sort the data only if they are unsorted?
> >
> > ps: what is the internal value of a missing?

Stas Kolenikov