Statalist



st: Working with large data set like a database


From   David Souther <[email protected]>
To   [email protected]
Subject   st: Working with large data set like a database
Date   Fri, 1 Jan 2010 10:37:55 -0600

I have a question about working with very large data sets (combined
size ~40 GB) when only 6 GB of memory is available for running the
analysis.
A second complicating factor is that I need to join some of these data
sets on a date range or a similar join rule.  In Oracle, I could query
out only the columns I need and then join them to other files using a
rule, such as the dates being within "x" days of each other.  I cannot
get "merge" in Stata to accept this kind of date range.  Here is an
example of two data sets to join:

***subdataset***
date1 var1 extravar extravar1
10/22/2008 3 44 44
02/01/2001 5 44 44
05/24/2005 9 44 44
12/12/2012 99 44 44
12/29/2012 100 44 44

***big dataset***
date1 var2 extravar extravar1
10/20/2008 500 44 44
02/07/2001 500 44 44
05/20/2005 900 44 44
12/12/2015 990 44 44
01/01/1999 1000 44 44
01/01/1970 2000 44 44
01/01/1970 2222 44 44
12/01/2012 7777 44 55


I need to join by "date1" and load a data set for analysis that
contains ONLY "date1", "var1", and "extravar1".  Thanks for helping.
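
To make the join rule concrete, here is a rough brute-force sketch of
what I mean, using only built-in commands.  It is an illustration, not
a tested solution: the file names, the 7-day window, the helper names
"big_date" and "sub_date", and the assumption that "date1" is stored as
a string in month/day/year order are all mine.

```stata
* Sketch only: file names, the 7-day window, and the helper variable
* names are assumptions made for illustration.
local x = 7

* Load only the needed columns from the big file and convert the date
* string to a Stata daily date.
use date1 var2 using "bigdataset.dta", clear
generate big_date = date(date1, "MDY")
format big_date %td
drop date1
save "big_slim.dta", replace

* Do the same for the small file, keeping its own date1.
use date1 var1 extravar1 using "subdataset.dta", clear
generate sub_date = date(date1, "MDY")
format sub_date %td

* Form every pairwise combination, then keep pairs within x days.
cross using "big_slim.dta"
keep if abs(sub_date - big_date) <= `x'

* Keep only the variables wanted for analysis (add var2 if needed).
keep date1 var1 extravar1
```

With the example rows above and a 7-day window, the first three rows of
the subdataset would each find a partner in the big dataset, and the
two December 2012 rows would not.  Because "cross" forms every pairwise
combination before filtering, this will not fit in 6 GB for the real
files as written; the big file would presumably have to be read in
pieces (for instance with an "in" range on "use") and the results
appended.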

DFS