Thanks to Kit Baum, there is an updated version of the package -gzsave-
available from the SSC repository. The update corrects the filename
handling: after opening a dataset with gzuse, commands such as
-describe- now report the filename of the compressed file rather than
the name of the temporary file created during decompression. Thanks go
to my colleague Morten Andersen for pointing this out.
From the description of the package:
*Description*
gzsave stores the dataset currently in memory on disk under the name
filename. If filename is specified without an extension, .dta.gz is used.
gzuse loads a Stata-format dataset previously saved by gzsave into
memory. If filename is specified without an extension, .dta.gz is
assumed.
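As a minimal sketch of how the two commands fit together, the following do-file fragment installs the package, saves a dataset in compressed form, and reads it back. The file name mydata is hypothetical, and the replace and clear options are assumed to behave as they do for Stata's own save and use; check the package help file for the exact syntax.

```stata
* Install or update the package from the SSC repository
ssc install gzsave, replace

* Load an example dataset shipped with Stata
sysuse auto, clear

* No extension is given, so the file should be written as mydata.dta.gz
* ("mydata" is a hypothetical name chosen for this example)
gzsave mydata, replace

* Read the compressed dataset back; .dta.gz is assumed
gzuse mydata, clear

* With the updated package, the header should name the compressed file
* rather than a temporary file
describe, short
```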
*Remarks*
These commands are useful for two purposes:
First, they reduce the disk space used by a dataset, which may be
important when storing very large datasets.
Second, they may help reduce network load when using a distributed disk
system such as NFS. The commands transfer only the compressed dataset
over the network; the uncompressed dataset is stored only as a
temporary data file, which typically resides on the local disk (where
local is relative to the running instance of Stata).
The price paid for saving disk space (and network load) is the CPU time
used by gzip. Please test for yourself whether compression is actually
advantageous in your specific set-up.
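One rough way to run such a test is to time plain and compressed saves and loads with Stata's timer command and then compare the file sizes. This is only a sketch under stated assumptions: the file names plain_copy and gz_copy are hypothetical, the auto dataset is inflated with expand purely to have something measurable, and results on your own data and disks may differ considerably.

```stata
* Build a moderately large test dataset from the shipped example data
sysuse auto, clear
expand 1000

timer clear

* Timer 1: plain save
timer on 1
save plain_copy, replace
timer off 1

* Timer 2: compressed save (hypothetical file name gz_copy)
timer on 2
gzsave gz_copy, replace
timer off 2

* Timer 3: plain load
timer on 3
use plain_copy, clear
timer off 3

* Timer 4: compressed load
timer on 4
gzuse gz_copy, clear
timer off 4

* Elapsed seconds for each of the four steps
timer list

* File sizes on disk for comparison
dir plain_copy.dta
dir gz_copy.dta.gz
```

Running the fragment several times, and on the disk or network share you actually use, gives a fairer picture than a single run.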