Statalist The Stata Listserver



Re: st: memory required for -merge-


From   Fred Wolfe <fwolfe@arthritis-research.org>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: memory required for -merge-
Date   Thu, 09 Mar 2006 05:56:38 -0600

It seems possible that, if there is no match at all, the merged data set would be much larger than you expect. One way to examine what's happening would be to run the merge on only part of the data set.
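
A rough way to check this, sketched under some assumptions: the using file would add macci (byte, 1 byte) and place_name (str39, 39 bytes), and -merge- creates _merge, assumed here to be stored as a byte. That is about 41 extra bytes for each of the 20,873,141 master observations, or roughly 856 million bytes, against about 662 million bytes free. The 100,000-observation slice below is an arbitrary choice, and the syntax is the older -merge varlist using- form; newer Stata versions would want -merge m:1 placefip using ...- instead.

```stata
* Width check: extra bytes the merge would add to the master.
* Assumes _merge is stored as a byte (1 byte per observation).
local N = 20873141
* macci (byte) + place_name (str39) + _merge (byte)
local added = 1 + 39 + 1
display "extra bytes needed: " %15.0fc `N' * `added'
display "bytes free now:     " %15.0fc 662176944

* Trial merge on a slice of the master (slice size is arbitrary).
use in 1/100000 using ../data/geog/geog_pooled, clear
sort placefip
merge placefip using ../data/geog/geog_cc
tabulate _merge
describe, short
memory
```

If the trial merge runs, -describe- and -memory- show how much wider each observation has become, which can then be scaled up to the full data set.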

Fred

At 09:39 PM 3/8/2006, you wrote:

Hi all,

Is there a rule for the amount of memory required for -merge-? I keep
getting a "no room to add more variables due to width" error on the
same merge. I am attempting to -merge- on only one variable
(placefip), and 67% of memory is free before I attempt the -merge-
(i.e., while the "master" dataset is loaded). The "using" dataset is
small, and I've got the memory set quite high (900m). So, I can't
understand why I am running into memory problems.

This is the info on the "master" dataset:

. des

Contains data from ../data/geog/geog_pooled.dta
  obs:    20,873,141
 vars:             4                          8 Mar 2006 21:32
 size:   333,970,256 (66.5% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
cityres         long   %05.0f                 * City of residence
msarfip         int    %04.0f                 * PMSA/MSA of residence (FIPS)
year            int    %4.0g
placefip        long   %07.0f                 * Place (city) of residence (FIPS)
                                              * indicated variables have notes
-------------------------------------------------------------------------------

And this is the info on the "using" dataset:

. des using ../data/geog/geog_cc

Contains data
  obs:        25,150                          8 Mar 2006 22:21
 vars:             3
 size:     1,207,200
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
macci           byte   %1.0g                  Center city
placefip        long   %07.0f
place_name      str39  %39s
-------------------------------------------------------------------------------
Sorted by:  placefip

This is the memory usage I see after the error, when the "master"
dataset is loaded:

. memory
                                                  bytes
--------------------------------------------------------------------
Details of set memory usage
    overhead (pointers)                      83,492,564        8.38%
    data                                    250,477,692       25.14%
                                        ----------------------------
    data + overhead                         333,970,256       33.53%
    free                                    662,176,944       66.47%
                                        ----------------------------
    Total allocated                         996,147,200      100.00%
--------------------------------------------------------------------
Other memory usage
    set maxvar usage                          1,848,666
    set matsize usage                         1,315,200
    programs, saved results, etc.                41,128
                                        ---------------
    Total                                     3,204,994
-------------------------------------------------------
Grand total                                 999,352,194

Any ideas?

--
Danielle H Ferry


*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/

Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel (316) 263-2125     Fax (316) 263-0761
fwolfe@arthritis-research.org

