Stata The Stata listserver

st: RE: contract command

From   "Nick Cox" <>
To   <>
Subject   st: RE: contract command
Date   Wed, 26 Jun 2002 11:49:58 +0100


> I am using the -contract- command to make a smaller version of a very very
> large file on a unix system.  The file is so large that I may need to use
> virtual memory. The large file is also sorted by most of the variables
> that I will -contract- on, hence, cases with the same values are clustered
> together.  My question is whether there is a way to make -contract- take
> advantage of this clustering.  I anticipate that if this is possible only
> one pass of the data will be needed, whereas if it is not possible, I am
> not sure how many passes will be needed.  As the file is over 7GB,
> contains more than 10 million cases, and may necessitate the use of
> virtual memory, any such savings would be substantial. Any assistance is
> greatly appreciated.

-contract- is really quite a simple command. To
understand this and any other answers better, you
should type

which contract

to find out where contract.ado is on your system
and then use a text editor (Stata's own -doedit-
will do fine) to look at the code.


1. At the heart of -contract- is a -sort- on
the varlist supplied, and to the extent that
the data are already sorted, that will go faster,
but I doubt that memory use is affected.

2. There aren't any special tweakable options
to -contract- to affect memory use.

3. If you didn't need some of the features
of -contract-, you could write your own
slimmed down version, but my guess is that
the effect on memory will be slight.
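
A minimal sketch of what such a slimmed-down version might
look like, not the actual code in contract.ado. The dataset
names (bigfile, smallfile) and the variables a, b and c are
hypothetical stand-ins for your own; _freq is the name that
-contract- gives its frequency variable by default.

```stata
* Sketch only: bigfile, smallfile, a, b and c are hypothetical names.
use bigfile, clear

* Sort on the variables to contract on. Data that are already
* mostly in order should sort faster, as in point 1.
sort a b c

* Count the observations in each distinct combination of a, b, c.
by a b c: generate long _freq = _N

* Keep one observation per combination, and only the needed variables.
by a b c: keep if _n == 1
keep a b c _freq

save smallfile, replace
```

Note that the whole dataset still has to be read into memory
before the -sort-, so this bears out the guess in point 3:
nothing here reduces peak memory use.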

To say more would, I guess, need more knowledge of
Stata's handling of memory with very large
files than I possess.

