The Stata listserver

st: Memory Truths by OS


From   Jeremy Fox <jeremytfox@mac.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Memory Truths by OS
Date   Thu, 3 Apr 2003 04:11:55 -0800

For whatever reason, the Stata manuals do not give you the inside scoop on how each operating system handles memory. I wrote a simple C program that uses the C function malloc to find the maximum amount of memory the operating system will give to a program. It is pretty clear that Stata itself calls malloc in a very similar way, and does so independently for memory, matsize and maxvar.
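
The test program itself is not reproduced in this post. A minimal sketch of this kind of probe might look like the following. It is an illustration only, not the original code: the names largest_block_mb and probe_chunks and the cap_mb parameter are hypothetical, and the binary search assumes that if malloc refuses one size it also refuses every larger size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MB ((size_t)1024 * 1024)

/* Largest n in [0, cap_mb] for which malloc(n MB) succeeds, found by
   binary search. Assumes failure at one size implies failure at all
   larger sizes (a simplification, not a guarantee). */
size_t largest_block_mb(size_t cap_mb)
{
    /* Clamp the cap so that mid * MB cannot overflow size_t. */
    if (cap_mb > SIZE_MAX / MB)
        cap_mb = SIZE_MAX / MB;

    size_t lo = 0, hi = cap_mb;
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;  /* round up so the loop ends */
        void *p = malloc(mid * MB);
        if (p != NULL) {
            free(p);          /* this size works; try something larger */
            lo = mid;
        } else {
            hi = mid - 1;     /* too big; try something smaller */
        }
    }
    return lo;
}

/* Find up to max_chunks blocks in sequence, holding each block while
   searching for the next one. Stores the sizes (in MB) in sizes_mb and
   returns how many blocks were found. All blocks are freed on return. */
size_t probe_chunks(size_t cap_mb, size_t *sizes_mb, size_t max_chunks)
{
    void **held = calloc(max_chunks, sizeof *held);
    size_t n = 0;

    if (held == NULL)
        return 0;

    while (n < max_chunks) {
        size_t mb = largest_block_mb(cap_mb);
        if (mb == 0)
            break;                    /* nothing left to hand out */
        held[n] = malloc(mb * MB);
        if (held[n] == NULL)
            break;                    /* block vanished between probe and hold */
        sizes_mb[n++] = mb;
    }

    for (size_t i = 0; i < n; i++)
        free(held[i]);
    free(held);
    return n;
}
```

Calling probe_chunks with a cap of, say, 4095 and room for three sizes from a small main function, then printing the array, would give a list of successive chunk sizes in MB. Note that the sketch never writes to the memory it obtains, so on systems that overcommit, a successful malloc does not prove the memory is actually usable.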

Mac OS X - It can address up to around 3.5G per application, but it is not all contiguous. The largest contiguous chunk you can get is 2.25G (which you would give to memory), then 990M (probably to matsize), and finally you can get a third chunk of about 200M (probably for maxvar). Actually, there is program overhead, resulting in a maximum "set memory" size of 2100M. Note that the Mac has a pretty advanced virtual memory system, as you can address all of this memory regardless of the physical memory in your system. If you are running a computationally intensive batch job that doesn't swap data around too much, you probably could load it all in and not get a large performance hit. I ran some test jobs and they behaved themselves well in the background, especially when I ran them under "nice" from the command line. Right now I have 9G of virtual memory being used.

Stata itself takes all you can give it. Like I said, I was able to set memory equal to 2100M with no problems.

I don't have Stata for Linux or Windows, so this is all speculation based upon my C test programs.

Windows XP (tested using the Cygwin gcc compiler) - You control the amount of virtual memory on the Advanced tab of the System Properties dialog box (right-click on My Computer and select Properties). The maximum you can set it to is 4096M. The FAQ on stata.com says that Windows loads DLLs at random locations in memory, so even if you have 2GB of physical RAM, you might only be able to use 1.4G for set memory in Stata. In my tests using gcc, the largest contiguous chunk I could get was around 990M. After that I got chunks of 400M and 300M. The amount of physical memory didn't affect those figures. If about 1G is the largest data set Windows XP can handle, I would rate that as unacceptable. Furthermore, news reports say Windows XP SP1 introduced memory handling bugs. I am not impressed with Windows XP's memory handling capabilities.

Linux is almost as good as the Mac. There is a 3G limit per application, of which 2G appears to be contiguous. So you can feed 2G to Stata for memory, and leave 1G for matsize and maxvar. However, you have to plan for this when you partition your drive when installing Linux, as Linux uses a dedicated swap partition on your hard drive for virtual memory. Linux installation programs will not recommend setting aside 3 or 4G worth of hard drive space for swap, unless you have around that amount in physical RAM. I am not sure how well Linux's virtual memory works in practice: a few test jobs with large arrays crashed.

So, for memory handling abilities among desktop operating systems, I give the Mac a "B", Linux a "C+", and Windows a "D".

I give Solaris an "A-", as you get a contiguous 4GB of memory in 32-bit mode and effectively unlimited amounts of memory in 64-bit mode. Solaris doesn't get a perfect score because, like Linux, it uses a separate swap partition on the hard drive.

The conclusion to this is that the Mac wins if you have to work with large datasets and cannot afford a Sun (or SGI or Alpha). Steer clear of Windows.

Jeremy




