
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: large data sets (was st: A faster way to gsort)


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: large data sets (was st: A faster way to gsort)
Date   Fri, 14 Mar 2014 10:04:05 -0400


On 3/13/2014 10:44 PM, Joseph Coveney wrote:

It sounds like you're pulling modest-to-large result sets out of the database,
saving them as SAS dataset files and then going back and sort-merging them via
PROC SQL with multigigabyte-sized result sets likewise pulled out of the
database en passant--a situation that even SAS aficionados recommend avoiding in
favor of pass-through queries.


I have not been doing that, but it is what the SAS analysts in this environment do, and it's one reason they prefer not to use Stata. I do as much as I can in native SQL, and then roll the results up in Stata. But this requires iterating queries over, e.g., calendar year to ensure that the results I pull down are manageably small.
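
To illustrate the year-by-year pattern described above, here is a minimal sketch. It is not the actual queries used in this environment. It is written in Python against an in-memory SQLite table so that it runs on its own. The table `claims`, its columns `svc_year` and `cost`, and the helper `yearly_rollup` are all hypothetical stand-ins for whatever the real database holds.

```python
import sqlite3


def yearly_rollup(conn, years):
    """Run one aggregate query per calendar year and collect the small results.

    Each query asks the database to do the heavy lifting (COUNT/SUM), so only
    one summary row per year is pulled down to the client.
    """
    rows = []
    for year in years:
        cur = conn.execute(
            "SELECT ?, COUNT(*), SUM(cost) FROM claims WHERE svc_year = ?",
            (year, year),
        )
        rows.append(cur.fetchone())
    return rows


# Hypothetical stand-in for the large database: a tiny in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (patient_id INTEGER, svc_year INTEGER, cost REAL)")
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        (1, 2011, 100.0),
        (2, 2011, 50.0),
        (3, 2012, 75.0),
        (1, 2013, 20.0),
        (2, 2013, 30.0),
    ],
)

# One small result row per year; these could then be saved and combined
# in the analysis package of choice.
summary = yearly_rollup(conn, range(2011, 2014))
for row in summary:
    print(row)
```

The point of the loop is only that each pass returns an aggregate small enough to handle locally. The same idea would apply with any client that can send SQL to the server, with the per-year results appended afterwards.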

But the first point is an important one: my primary role here is not data analyst; mostly, other analysts use SAS to create datasets that I can then analyze in Stata. And it is likely to stay that way as long as SAS has the edge on data management using large databases.

cheers,
Jeph
