
Re: st: Re: memory size 7 data merging

From	Joseph Coveney <[email protected]>
To	Statalist <[email protected]>
Subject	Re: st: Re: memory size 7 data merging
Date	Mon, 24 Sep 2007 21:44:46 -0700

Michael Blasnik wrote:

The real question is -- what do you want as a result of the merge?  Do you
want 1,300,000 observations or do you just want 100,000 observations with
matched info from the larger file?  If the latter, then use the -nokeep-
option for -merge- and you should be OK.  But if you want the former, then it
seems like the resulting dataset won't fit in your allocatable Stata memory
and you will need to figure out how to make it smaller by encoding strings,
dropping variables, etc.

--------------------------------------------------------------------------------

Good suggestions.

With -nokeep-, would you need to give up checks such as -assert _merge == 3-
or -assert inlist(_merge, 1, 3)-?  I would dread not being able to take
advantage of -assert- after a -merge- with the datasets that I get handed.

Also, -nokeep- might still not be enough if multiple observations in the
1.3-million-observation (40-megabyte) dataset match each observation in the
100,000-observation (53-megabyte) dataset.  I've no idea what the original
poster's two datasets are like, but if, say, an average of 13 observations
in the larger file match each observation in the smaller one (if the smaller
one is a look-up table, for instance), then the resulting dataset will be in
the neighborhood of three-quarters of a gigabyte even with the -nokeep-
option.
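
The three-quarters-of-a-gigabyte figure can be checked with a rough
back-of-envelope calculation.  The sketch below is in Python rather than
Stata and is only an illustration.  It assumes that the file sizes quoted
above approximate the in-memory sizes, that every observation in the larger
file matches exactly one observation in the smaller file, and that the
merged row is about as wide as the two source rows combined (the key
variables are counted twice, so it slightly overestimates).  The function
name and its arguments are made up for this example.

```python
MB = 1024 ** 2  # bytes per megabyte


def estimate_merged_mb(big_mb, big_n, small_mb, small_n):
    """Rough size in MB of a many-to-one merge that keeps every big-file row."""
    big_width = big_mb * MB / big_n        # average bytes per observation, larger file
    small_width = small_mb * MB / small_n  # average bytes per observation, smaller file
    # Each of the big_n result rows carries its own variables plus the
    # variables of the smaller-file row it matched.
    return big_n * (big_width + small_width) / MB


# Figures from this thread: 1.3 million observations in 40 MB,
# and 100,000 observations in 53 MB.
estimate_mb = estimate_merged_mb(40, 1_300_000, 53, 100_000)
print(f"about {estimate_mb:.0f} MB, or {estimate_mb / 1024:.2f} GB")
```

Under those assumptions the estimate is 40 + 13 x 53 = 729 megabytes, which
is where "in the neighborhood of three-quarters of a gigabyte" comes from.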

Joseph Coveney
