
Re: st: Memory during a merge

From	Fred Wolfe <[email protected]>
To	[email protected], [email protected]
Subject	Re: st: Memory during a merge
Date	Fri, 03 Nov 2006 13:47:27 -0600

Thank you, that's just what I wanted to know.

Fred

At 01:19 PM 11/3/2006, William Gould, Stata wrote:

Fred Wolfe <[email protected]> replied to my description of
how -merge, keep()- works,

> So if I understand you correctly, the observations are brought in and then
> the non keep variables are deleted? If that is the case the maximum memory
> use with keep() would be no different than from the memory use without keep.

No.

In Stata, you think of bringing in the entire dataset. Inside the C
code, we have the ability to bring in the dataset an observation at a time,
and to throw those observations away as we go.

Consider a dataset of 10,000 observations, each 5,000 bytes long. The dataset
is then 10,000x5,000 = 50,000,000 bytes. Let's say that's the dataset on disk
when we -merge- and that we -keep()- only one of the variables: a 2-byte one.

We do not bring in the 50,000,000 bytes and then reduce that to 2x10,000 =
20,000 bytes.

We bring in 5,000 bytes. We then decide whether that observation merges.
If it does, we copy 2 bytes from the 5,000 bytes to the appropriate place, and
then throw the 5,000 bytes away. Then we do it again. And again.
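
The loop described above can be sketched outside Stata. What follows is only an illustration in Python, not Stata's actual C code: the record layout (a 4-byte key at offset 0, the 2-byte kept variable at offset 4) and the names read_observations and streaming_merge are assumptions made up for the example. It shows that only one 5,000-byte observation is held at a time, while the kept data grows by 2 bytes per observation that merges.

```python
import struct

OBS_BYTES = 5_000   # width of one observation on disk (from the example above)
N_OBS = 10_000      # number of observations in the dataset on disk
KEY_OFFSET = 0      # hypothetical layout: 4-byte merge key at offset 0
KEEP_OFFSET = 4     # hypothetical layout: the 2-byte kept variable at offset 4
KEEP_BYTES = 2


def read_observations(n_obs=N_OBS, obs_bytes=OBS_BYTES):
    """Simulate reading the on-disk dataset one observation at a time."""
    for key in range(n_obs):
        record = bytearray(obs_bytes)
        struct.pack_into("<i", record, KEY_OFFSET, key)
        struct.pack_into("<h", record, KEEP_OFFSET, key % 100)
        yield bytes(record)


def streaming_merge(master_keys, using_records):
    """Keep 2 bytes from each observation whose key is in master_keys.

    Returns (kept, peak_buffer_bytes): kept maps key -> the 2 kept bytes,
    and peak_buffer_bytes is the largest single record held at once.
    """
    kept = {}
    peak_buffer_bytes = 0
    for record in using_records:
        # Only this one observation is in the working buffer.
        peak_buffer_bytes = max(peak_buffer_bytes, len(record))
        key = struct.unpack_from("<i", record, KEY_OFFSET)[0]
        if key in master_keys:
            # Copy the 2 bytes to their place; the rest is never stored.
            kept[key] = record[KEEP_OFFSET:KEEP_OFFSET + KEEP_BYTES]
        # The record is discarded when the loop moves to the next one.
    return kept, peak_buffer_bytes


if __name__ == "__main__":
    kept, peak = streaming_merge(set(range(N_OBS)), read_observations())
    print("bytes kept:", sum(len(v) for v in kept.values()))
    print("largest buffer held at once:", peak)
```

With every observation merging, the kept data totals 2x10,000 bytes and the working buffer never exceeds one 5,000-byte observation; the 50,000,000 bytes are never held together.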

Fred mentioned -merge-'s -keep()- option (keep a subset of the variables), but
he did not mention the -nokeep- option (keep only observations that merge).
Perhaps Fred wants to specify both options.

-- Bill
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/


Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel +1 316 263 2125
[email protected]



