
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


re: st: Reading large data sets in Stata

From   Christopher Baum
Subject   re: st: Reading large data sets in Stata
Date   Mon, 22 Feb 2010 15:19:07 -0500

Stas said

That's 13 Gb of data, right? If you really want to put everything into memory, then you would probably need a computer with 24 Gb of RAM. I don't really know if you can buy anything like that in the desktop format, and what kind of OS you would need to look at, although I am sure there are clusters with much larger memory capacities. If you only need subsets of that data set, then you could
    use <list of the variables that you REALLY need> if <subsetting to the conditions you REALLY want to analyze> using <this huge data set name>
That way, you may have a data set of a more realistic 2 Gb size that you can work with on a 4 Gb RAM machine.
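A minimal sketch of the subsetting form of -use- that Stas describes, written as a do-file fragment. The auto dataset shipped with Stata stands in for the large file here; the variable list and the condition are illustrative assumptions, not the original poster's data.

```stata
* Run as a do-file. The auto dataset is a stand-in for the huge file.
sysuse auto, clear
tempfile big
save `big'

* Read only the variables and observations that are really needed.
* Nothing else from the file is loaded into memory.
use make price mpg foreign if foreign == 1 using `big', clear

* Check what was actually loaded.
describe
count
```

With a real data set, the temporary file would be replaced by the name of the large .dta file after -using-.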

That's not necessarily 13 Gb of data. Using the interactive calculator on the FAQ, if you assume all 37 variables can be held in 4 bytes each, it's under 7 Gb. If on average they only need 3 bytes each, it's under 6 Gb. Stat/Transfer can optimize the dataset as it converts it to Stata format. Stas' suggestions are well taken, but one more is important--if any of these variables are 0/1 indicators, or integers taking on values 1..5, etc., they can be stored as 1-byte byte variables and need not chew up nearly as much memory. I don't know if you can get it down to a 2 Gb size, though. To use more than 2 Gb, you need a 64-bit machine (almost all machines are these days), and Stata 11 will automatically install the 64-bit version on such a machine.
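The arithmetic behind those figures, and the effect of storage types, can be sketched in Stata itself. This is only an illustration: the number of observations below is a made-up assumption (the thread does not state it), and the auto dataset shipped with Stata stands in for the real data. Only the count of 37 variables comes from the post.

```stata
* Rough size = observations x variables x bytes per variable.
* ASSUMPTION: 45 million observations is hypothetical, chosen for illustration.
local N = 45000000
local k = 37
foreach bytes in 8 4 3 {
    * Divide by 1024^3 to express the byte count in gigabytes.
    display "`bytes' bytes per variable: " %5.1f ((`N' * `k' * `bytes') / 1024^3) " Gb"
}

* Storage types matter: force some small integers into 8-byte doubles,
* then let -compress- demote each variable to the smallest type that
* holds its values without loss.
sysuse auto, clear
recast double price mpg rep78 foreign
compress
describe price mpg rep78 foreign
```

-compress- reports each variable it demotes, so running it on a freshly converted data set shows how many bytes per observation were being wasted.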

Kit Baum   |   Boston College Economics and DIW Berlin
An Introduction to Stata Programming
An Introduction to Modern Econometrics Using Stata

