The Stata listserver

Re: st: selecting obs while reading in huge data set


From   "Sascha O. Becker" <sascha.becker@gmx.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: selecting obs while reading in huge data set
Date   Thu, 19 Aug 2004 08:59:28 +0200

Dear Daniel,

Thanks for your reply!

You suggested:

****
Perhaps you can read the employee and firm ID only?

. insheet empid firmid using mydata

This is only 1/5th the variables, so it might fit in your computer memory.
Then merge the result with the firm dataset, keeping only matched records, then merge again with employee dataset, keeping only matched records.
****

This last step is actually identical to the original problem. "The employee dataset" is the full dataset with all variables. To merge it with anything, it has to be in memory at least once, and that is exactly the problem.

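There seems to be no way around some kind of looping: either over blocks of observations, or over subsets of variables. In each pass I would merge the piece against the firm dataset, keep the matched records, and then append (for blocks of observations) or merge (for subsets of variables) the resulting sub-datasets.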
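To make the "loop over observations" idea concrete, here is a minimal sketch of the logic. It is written in Python rather than Stata, purely as an illustration, and it rests on several assumptions: the raw files are comma-separated with a header row, the column names empid and firmid are the ones from the quoted command, the other column names and the function name are made up, and the list of firm IDs is small enough to hold in memory. The employee file is read one record at a time, so it never has to fit in memory as a whole.

```python
import csv
import io


def stream_matched(employee_file, firm_file, out_file, firm_key="firmid"):
    """Copy to out_file only the employee records whose firm ID appears
    in firm_file. Returns the number of records kept.

    Assumption: the set of firm IDs fits in memory; the employee file
    is streamed row by row and is never loaded in full.
    """
    # Load only the firm IDs (hypothetical column name "firmid").
    firm_ids = {row[firm_key] for row in csv.DictReader(firm_file)}

    reader = csv.DictReader(employee_file)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
    writer.writeheader()

    kept = 0
    for row in reader:  # one observation at a time
        if row[firm_key] in firm_ids:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny made-up example; real use would pass open file handles instead.
firms = io.StringIO("firmid,sector\nF1,retail\nF3,mining\n")
employees = io.StringIO(
    "empid,firmid,wage,age,tenure\n"
    "1,F1,10,30,2\n"
    "2,F2,12,41,5\n"
    "3,F3,9,25,1\n"
    "4,F2,15,50,9\n"
)
matched = io.StringIO()
kept = stream_matched(employees, firms, matched)
print(kept, "records kept")
```

The reduced output file can then be read into Stata in the usual way, provided the selected subset is small enough to fit in memory.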

Cheers, Sascha