For whatever reason, Stata manuals do not give you the inside scoop on
how each operating system handles memory. I wrote a simple C program
that uses the C library function malloc to find the maximum amount of
memory the operating system will give to a program. It is pretty clear
that Stata itself calls malloc in a very similar way, and does so
independently for memory, matsize and maxvar.
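The probing idea can be sketched in C roughly as follows. This is a
minimal sketch and not the original test program: the names
can_alloc_mb and largest_block_mb and the cap_mb parameter are made up
for illustration, and it assumes that if malloc refuses a block it will
also refuse every larger one. It binary-searches, in megabyte steps,
for the largest single block malloc will return, freeing each probe
right away. Keep in mind that a successful malloc only shows that the
address space was granted; some systems hand out more than they can
actually back with RAM and swap.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MB ((size_t)1024 * 1024)

/* Return 1 if malloc hands out `mb` megabytes as one block, else 0.
   The block is freed immediately and never written to. */
static int can_alloc_mb(size_t mb)
{
    void *p;

    if (mb > SIZE_MAX / MB)      /* request would overflow size_t */
        return 0;
    p = malloc(mb * MB);
    if (p == NULL)
        return 0;
    free(p);
    return 1;
}

/* Binary search for the largest single block (in MB) that malloc
   grants, never probing above cap_mb (a hypothetical upper bound
   chosen by the caller). Assumes success is monotonic in size. */
size_t largest_block_mb(size_t cap_mb)
{
    size_t lo = 0, hi = cap_mb;

    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;   /* round up, so mid >= 1 */
        if (can_alloc_mb(mid))
            lo = mid;        /* mid MB worked, try larger */
        else
            hi = mid - 1;    /* mid MB refused, try smaller */
    }
    return lo;
}
```

Calling largest_block_mb(4096) from main and printing the result gives
the figure in megabytes. The cap keeps the search bounded on 64-bit
systems, where the answer may simply come back as the cap itself.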
Mac OS X - It can address up to around 3.5G per application, but it is
not all contiguous. The largest contiguous chunk you can get is 2.25G
(which you would give to memory), then 990M (probably to matsize), and
finally you can get a third chunk of about 200M (probably for maxvar).
Actually, there is program overhead, resulting in a maximum "set
memory" size of 2100M. Note that the Mac has a pretty advanced virtual
memory system, as you can address all of this memory regardless of the
physical memory in your system. If you are running a computationally
intensive batch job that doesn't swap data around too much, you
probably could load it all in and not get a large performance hit. I
ran some test jobs and they behaved themselves well in the background,
especially when I ran them under "nice" from the command line. Right now I have
9G of virtual memory being used.
Stata itself takes all you can give it. Like I said, I was able to set
memory equal to 2100M with no problems.
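The successive chunk sizes quoted above (a largest chunk, then a
second, then a third) come from holding on to each block while probing
for the next one. A rough sketch of that procedure, again not the
original test program: grab_largest_mb, probe_chunks_mb, MAX_CHUNKS and
cap_mb are hypothetical names, and the code assumes malloc success is
monotonic in the requested size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MB ((size_t)1024 * 1024)
#define MAX_CHUNKS 16            /* arbitrary limit for this sketch */

/* Find the largest block (in MB, at most cap_mb) that malloc grants
   right now and keep it allocated in *out. Returns its size in MB,
   or 0 with *out == NULL if nothing could be obtained. */
static size_t grab_largest_mb(size_t cap_mb, void **out)
{
    size_t lo = 0, hi = cap_mb;

    *out = NULL;
    if (hi > SIZE_MAX / MB)      /* avoid overflow in mid * MB */
        hi = SIZE_MAX / MB;
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;
        void *p = malloc(mid * MB);
        if (p != NULL) {
            free(p);
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    if (lo > 0)
        *out = malloc(lo * MB);  /* re-acquire the winner and hold it */
    return (*out != NULL) ? lo : 0;
}

/* Record up to n successive largest chunks in sizes_mb[], holding each
   one while searching for the next. All blocks are freed before
   returning. Returns the number of chunks obtained. */
size_t probe_chunks_mb(size_t cap_mb, size_t *sizes_mb, size_t n)
{
    void *held[MAX_CHUNKS];
    size_t count = 0, i;

    if (n > MAX_CHUNKS)
        n = MAX_CHUNKS;
    while (count < n) {
        size_t got = grab_largest_mb(cap_mb, &held[count]);
        if (got == 0)
            break;
        sizes_mb[count++] = got;
    }
    for (i = 0; i < count; i++)
        free(held[i]);
    return count;
}
```

On a 32-bit system, probe_chunks_mb(4096, sizes, 3) would be the call
that corresponds to the three figures reported for each OS. On a 64-bit
system each chunk will most likely just equal the cap.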
I don't have Stata for Linux or Windows, so what follows is speculation
based on my C test programs.
On Windows XP (tested with the Cygwin gcc compiler), the amount of
virtual memory is set on the Advanced tab of the System Properties
dialog box (right-click on My Computer and select Properties). The
maximum you can set it to is 4096M. The FAQ on stata.com says that
Windows loads DLLs into memory at random locations, so even if you have
2GB of physical RAM, you might only be able to use 1.4G for set memory
in Stata. In my tests using gcc, the largest contiguous chunk I could
get was around 990M. After that I got chunks of 400M and 300M. The
amount of physical memory didn't affect those figures. If about 1G is
the largest data set Windows XP can handle, I would rate that as
unacceptable. Furthermore,
news reports say Windows XP SP1 introduced memory handling bugs. I am
not impressed with Windows XP's memory handling capabilities.
Linux is almost as good as the Mac. There is a 3G limit per
application, of which 2G appears to be contiguous. So you can feed
2G to Stata for memory, and leave 1G for matsize and maxvar. However,
you have to plan for this when you partition your drive when installing
Linux, as Linux uses a dedicated swap partition on your hard drive for
virtual memory. Linux installation programs will not recommend setting
aside 3 or 4G worth of hard drive space for swap, unless you have
around that amount in physical RAM. I am not sure how well Linux's
virtual memory works in practice: a few test jobs with large arrays
So, for memory handling on desktop operating systems, I give the Mac a
"B", Linux a "C+", and Windows a "D".
I give Solaris an "A-", as you get a contiguous 4GB of memory in
32-bit mode and unlimited amounts of memory in 64-bit mode. Solaris doesn't get
a perfect score because, like Linux, it uses a separate swap hard drive
The conclusion to this is that the Mac wins if you have to work with
large datasets and cannot afford a Sun (or SGI or Alpha). Steer clear