Statalist The Stata Listserver



st: Re: memory problem where over 50% of memory are free


From   "Sergiy Radyakin" <[email protected]>
To   <[email protected]>
Subject   st: Re: memory problem where over 50% of memory are free
Date   Tue, 10 Apr 2007 12:00:21 +0200

Hello Stephan,

1. Try to increase the memory size even further (if possible) and see if it helps.

2. Try to create a new int variable without egen, to see if it fits
(say, all missing values, stored as int):
gen int maxyear2=.

3. If it does, then you can probably fill it in manually via a loop over
all observations. This is straightforward and guarantees that no extra
memory will be required, but may be rather slow.

4. -compress- your data. Even if you have tried -compress- already, that does not
mean you can't improve on it. Compress is rather simple-minded. It will take -int-
to store a dummy with two values if they are 1 and 1000. But obviously recoding
your data to 0 and 1 solves the problem (you can store those in byte). Compress
can't do that on its own, but you can. Notice that your years are of type -int-.
A byte holds integers from -127 to 100, and I seriously doubt that you have data
spanning more than 100 years or so, so by subtracting the min year you can recode
your data to smaller values of year (you can label them afterwards if you wish),
so your new variable will not be "years AD", but rather "years after NNNN", where
NNNN is the base year. This will save you one byte per observation on each of
year1, year2 and hence maxyear2. Depending on how you use your data, it might be
possible to compress it even further. If you have up to 19 years of data
(19*12=228 months) you can just store "months relative to a base month" in one
byte, provided the base month is chosen so that the values run from -127 to 100.
(Unfortunately, we don't have bit storage types in Stata so far. Storing 8 dummies
in one byte would be very handy, but the need to store missings spoils all the fun.)

5. Check if you can run your statement with just id and year2 and then merge-in (by id)
the other variables.

6. Perhaps other options exist too.
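
Points 2 to 4 above might be sketched roughly as follows. This is only a sketch under assumptions: the variable names persnr, year1 and year2 are taken from the -describe- output in the question, the local macros base and t are made up for illustration, and the explicit loop over observations is replaced here by a -replace- that relies on Stata evaluating observations in order. Whether it avoids the "no room" error on this particular dataset is untested.

```stata
* Sketch only: persnr, year1 and year2 come from the question;
* the local macros base and t are hypothetical names.

* Point 4: recode years relative to a base year, then let -compress-
* demote them to byte if every value fits in -127..100.
summarize year1, meanonly
local base = r(min)
summarize year2, meanonly
local base = min(`base', r(min))
replace year1 = year1 - `base'
replace year2 = year2 - `base'
label variable year1 "years after `base'"
label variable year2 "years after `base'"
compress year1 year2

* Point 2: create the new variable directly, with the same storage
* type as year2, instead of letting -egen- create it.
local t : type year2
generate `t' maxyear2 = .

* Point 3: fill it in without -egen-.
* First pass: running maximum within persnr. -replace- works through the
* observations in order, so maxyear2[_n-1] is already updated, and max()
* ignores missing values.
bysort persnr: replace maxyear2 = max(year2, maxyear2[_n-1])
* Second pass: the last observation of each group now holds the group
* maximum, so copy it to every observation of the group.
by persnr: replace maxyear2 = maxyear2[_N]
```

The two -replace- passes write into a variable that already exists, so no further variable needs to be added while they run, which is the point of step 3.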

Best regards,
Sergiy



----- Original Message ----- From: "Stephan Brunow" <[email protected]>
To: <[email protected]>
Sent: Tuesday, April 10, 2007 11:10 AM
Subject: st: memory problem where over 50% of memory are free



Dear Statalisters,

I have a problem concerning memory. I have a quite large
dataset. Even if I use just 6 variables,


  obs:    21,041,596
 vars:             6
 size:   336,665,536 (56.8% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
persnr          long   %12.0g
year1           int    %8.0g
month1          byte   %8.0g
year2           int    %8.0g
month2          byte   %8.0g
util            int    %8.0g
-------------------------------------------------------------------------------

I set the memory quite large:


. memory
                                                     bytes
--------------------------------------------------------------------
Details of set memory usage
    overhead (pointers)                         84,166,384   10.79%
    data                                       252,499,152   32.37%
                                            ----------------------------
    data + overhead                            336,665,536   43.15%
    free                                       443,475,000   56.85%
                                            ----------------------------
    Total allocated                            780,140,536  100.00%
--------------------------------------------------------------------
Other memory usage
    set maxvar usage                             1,816,666
    set matsize usage                            1,315,200
    programs, saved results, etc.                    2,585
                                               ---------------
    Total                                        3,134,451
-------------------------------------------------------
    Grand total                                783,274,987


So over 50% of the allocated memory is free. There should be enough room
to generate 2 or 3 integer variables. However, if I do the following I
receive the error message that there is no room to add a variable due to
width. I can neither compress the data further nor drop variables, since it is
already compressed and I need these 6 variables.
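
As a rough check of the figures above, the arithmetic can be reproduced with a few -display- commands. This is only a sketch: it assumes the usual storage widths (long = 4 bytes, int = 2, byte = 1) and takes the 4 bytes of pointer overhead per observation from the -memory- output itself. It covers only the final int variable, not any temporary variables that -egen- may create while it runs.

```stata
* Width of one observation: persnr year1 month1 year2 month2 util
display 4 + 2 + 1 + 2 + 1 + 2

* Data: 21,041,596 observations times 12 bytes
* (compare with the "data" line, 252,499,152)
display %15.0fc 21041596 * 12

* Pointer overhead at 4 bytes per observation
* (compare with the "overhead (pointers)" line, 84,166,384)
display %15.0fc 21041596 * 4

* Extra space needed for one more int variable
display %15.0fc 21041596 * 2
```

One more int comes to 42,083,192 bytes, far less than the 443,475,000 bytes reported as free.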

Here is the command:

. by persnr, sort: egen int maxyear2=max(year2)

What might be the problem, and what should I do?

With many thanks,

Stephan

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*