Thanks Re: st: RE: contract command
Thanks for your helpful response.
On Wed, 26 Jun 2002, Nick Cox wrote:
> > I am using the -contract- command to make a smaller version of a very very
> > large file on a unix system. The file is so large that I may need to use
> > virtual memory. The large file is also sorted by most of the variables
> > that I will -contract- on, hence, cases with the same values are clustered
> > together. My question is whether there is a way to make -contract- take
> > advantage of this clustering. I anticipate that if this is possible only
> > one pass of the data will be needed, whereas if it is not possible, I am
> > not sure how many passes will be needed. As the file is over 7GB,
> > contains more than 10 million cases, and may necessitate the use of
> > virtual memory, any such savings would be substantial. Any assistance is
> > greatly appreciated.
> -contract- is really quite a simple command. To
> understand this and any other answers better, you
> should type
> which contract
> to find out where contract.ado is on your system
> and then use a text editor (Stata's own -doedit-
> will do fine) to look at the code.
> 1. At the heart of -contract- is a -sort- on
> the varlist supplied, and to the extent that
> the data are already sorted, that will go faster,
> but I doubt that memory use is affected.
> 2. There aren't any special tweakable options
> to -contract- to affect memory use.
> 3. If you didn't need some of the features
> of -contract-, you could write your own
> slimmed down version, but my guess is that
> the effect on memory will be slight.
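> As a rough illustration of point 3, here is a minimal sketch
> of a slimmed-down version. It is not the code in contract.ado.
> The variable names a, b and c (standing in for the varlist)
> and _freq (the count variable) are hypothetical, and none of
> the options or error checking of -contract- is reproduced.
>     * drop everything except the variables to contract on
>     keep a b c
>     * the sort remains the main cost
>     sort a b c
>     * number of observations in each group
>     by a b c: gen long _freq = _N
>     * keep one observation per group
>     by a b c: keep if _n == 1
> The -sort- is still there, so this is a simplification of
> the code rather than a way around the work it does.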
> To say more would, I guess, need more knowledge of
> Stata's handling of memory with very large
> files than I possess.
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/