How do I load large datasets (>1 GB) under 32-bit Windows? I receive an
error r(909) saying “op. sys. refuses to provide memory”.
Title: Large datasets under Windows
Author: Kevin S. Turner, StataCorp
Date: October 2001; updated August 2010; minor revisions October 2012
First, make sure you have installed enough memory or allowed for enough
virtual memory. If you have and are still getting this error, continue
reading.
Under all current 32-bit Windows operating systems (Windows 8, 7, Vista, XP,
2000, NT, ME, 98, 95), the total available address space for any application
is 2.1 GB. If you have a dataset larger than 2.1 GB, you will not be able to
load it into Stata for Windows. This is simply a limitation of the operating
system.
Unfortunately, even if your dataset is under the 2.1-GB limit, you may run
into difficulty when loading it into Stata. The fault again lies with how
Windows manages the 2.1-GB address space. When a typical application loads,
there are usually several libraries (or DLLs) that are loaded as well.
These libraries are usually loaded at the upper end of the 2.1-GB space but
not in any deterministic order. Microsoft has assured us that there is no way
to prevent these libraries from loading at arbitrary addresses, which
fragments the available space. When Stata tries to load a dataset, it
requests from Windows the largest contiguous space in the 2.1-GB
range. Depending on where Windows loaded the initial libraries, this may be
1.8 GB, 1.3 GB, or even less. You may be surprised to find that a 1.4-GB
dataset loaded fine one time but failed to load later. This is
simply an unfortunate side effect of Windows memory management.
As of Stata 11.1, some of the dependencies on external DLLs were removed,
reducing memory fragmentation and increasing the amount of memory
available to Stata. If you are using 32-bit Windows XP and you are still
having trouble allocating memory, you should read “Memory allocation in
Windows XP”.
By now, you may be wondering what your alternatives are.
Since July 2007, several operating system alternatives with 64-bit
support have become available. See our list of operating systems compatible
with Stata. The 64-bit platform will
enable you to work with large datasets. Depending on your operating
system, you should be able to allocate as much memory as you have on the
machine, minus the system requirements. To take advantage of this
technology, you will need 64-bit–compatible hardware, a 64-bit operating
system, and, of course, a 64-bit version of Stata.
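If you are unsure which flavor of Stata is installed, a quick check along the following lines may help. This is only a sketch: it relies on the system values c(bit) and c(os), and the text displayed will depend on your installation.

```stata
* Sketch: report whether the running Stata is a 32-bit or 64-bit build
display c(bit)

* Report the operating system Stata was built for
display c(os)
```

Keep in mind that this describes the Stata executable only; a 32-bit Stata can be running on 64-bit hardware and a 64-bit operating system, in which case the 32-bit address-space limit still applies.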
As a last resort, you may consider trimming any unnecessary data from your
dataset or dividing the dataset into two files. You may want to use the
second syntax of the use command to read in just the
observations/variables you want. For example:
. describe using auto.dta

Contains data                                 1978 Automobile Data
  obs:            74                          26 Mar 2007 09:52
 vars:            12
 size:         3,478
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
make            str18  %-18s                  Make and Model
price           int    %8.0gc                 Price
mpg             int    %8.0g                  Mileage (mpg)
rep78           int    %8.0g                  Repair Record 1978
headroom        float  %6.1f                  Headroom (in.)
trunk           int    %8.0g                  Trunk space (cu. ft.)
weight          int    %8.0gc                 Weight (lbs.)
length          int    %8.0g                  Length (in.)
turn            int    %8.0g                  Turn Circle (ft.)
displacement    int    %8.0g                  Displacement (cu. in.)
gear_ratio      float  %6.2f                  Gear Ratio
foreign         byte   %8.0g       origin     Car type
-------------------------------------------------------------------------------
Sorted by:  foreign

. use mpg price for using auto.dta in 1/50, clear
(1978 Automobile Data)

. describe

Contains data from auto.dta
  obs:            50                          1978 Automobile Data
 vars:             3                          26 Mar 2007 09:52
 size:           450 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
price           int    %8.0gc                 Price
mpg             int    %8.0g                  Mileage (mpg)
foreign         byte   %8.0g       origin     Car type
-------------------------------------------------------------------------------
Sorted by:  foreign
Depending on your data and analysis, this may not be feasible and is
offered only as a suggestion.
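The other suggestion, dividing the dataset into two files, can be sketched with the same syntax of use. The example below assumes auto.dta (74 observations) is in the current directory; the output filenames auto_part1.dta and auto_part2.dta are hypothetical, and the split point of 37 is arbitrary.

```stata
* Sketch: split auto.dta into two files by observation range.
* auto_part1.dta and auto_part2.dta are hypothetical filenames.

* Load only observations 1 through 37 and save them as the first file
use in 1/37 using auto.dta, clear
save auto_part1.dta, replace

* Load only observations 38 through 74 and save them as the second file
use in 38/74 using auto.dta, clear
save auto_part2.dta, replace
```

Because use reads only the requested observations, neither step needs the full dataset to fit in memory at once. Whether an analysis can then be run on the pieces separately depends on the analysis.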