[D] compress -- Compress data in memory
compress [varlist] [, nocoalesce]
Data > Data utilities > Optimize variable storage
compress attempts to reduce the amount of memory used by your data.
nocoalesce specifies that compress not try to find duplicate values
within strL variables in an attempt to save memory. If nocoalesce is
not specified, compress must sort the data by each strL variable,
which can be time consuming in large datasets.
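As a minimal sketch of the syntax above, the option can be combined with or without a varlist. The toy dataset and the variable names id and note are hypothetical, and the commands are meant to be run from a do-file, where // comments are allowed:

```stata
clear
set obs 100
generate long id = _n                   // values 1..100 stored as long
generate strL note = "repeated text"    // identical strL value in every observation

* Consider only id for compression; note is left untouched
compress id

* Consider every variable, but skip the search for duplicate strL values
compress, nocoalesce
```

Skipping the duplicate search avoids the sort on each strL variable, which is the step the option description flags as time consuming.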
compress reduces the size of your dataset by considering two things.
First, it considers demoting

    doubles to longs, ints, or bytes
    floats to ints or bytes
    longs to ints or bytes
    ints to bytes
    str#s to shorter str#s
    strLs to str#s

See [D] data types for an explanation of these storage types.
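The demotion rules above can be tried on a small dataset built in memory. This is only a sketch: the variables x and s are hypothetical, and the code assumes it is run from a do-file. Because x holds small integers and s holds three characters, compress would be expected to store them in narrower types without changing any value.

```stata
clear
set obs 5
generate double x = _n          // integers 1 through 5, stored as double
generate str20 s = "abc"        // three characters, stored as str20
describe                        // storage types before: double and str20
compress
describe                        // x and s should now use narrower storage types
```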
Second, it considers coalescing strLs within each strL variable. That is
to say, if a strL variable takes on the same value in multiple
observations, compress can link those values to a single memory location
to save memory. To check for this, compress must sort the data on each
strL variable. You can use the nocoalesce option to tell compress not to
take the time to perform this check. If compress does check whether it
can coalesce strL values, it will do whichever saves more memory --
coalescing strL values or demoting a strL to a str# -- or it will do
nothing if it cannot save memory by changing a strL.
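A rough way to watch this choice being made is to build a strL variable whose value repeats in every observation and compare memory usage before and after. The variable name note and the dataset are hypothetical, and which action compress takes (coalescing, demoting to a str#, or nothing) depends on the data, so no particular outcome is assumed here:

```stata
clear
set obs 1000
generate strL note = "the same long comment appears in every observation"
memory          // memory usage report before compressing
compress        // checks note for duplicate values unless nocoalesce is given
memory          // memory usage report after compressing
```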
compress leaves your data logically unchanged but (probably) appreciably
smaller. compress never makes a mistake: it never causes a loss of
precision and never truncates strings.
. webuse compxmp2
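One plausible way to continue this example, sketched here as a do-file under the assumption that the dataset can be downloaded, is to inspect the storage types before and after compressing:

```stata
webuse compxmp2     // same example dataset as above
describe            // storage types before compressing
compress
describe            // storage types after compressing
```

Comparing the two describe listings shows which variables, if any, were given a smaller storage type.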