From: Jeph Herrin <info@flyingbuttress.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: large data sets (was st: A faster way to gsort)
Date: Fri, 14 Mar 2014 10:04:05 -0400
On 3/13/2014 10:44 PM, Joseph Coveney wrote:

> It sounds like you're pulling modest-to-large result sets out of the database, saving them as SAS dataset files and then going back and sort-merging them via PROC SQL with multigigabyte-sized result sets likewise pulled out of the database en passant--a situation that even SAS aficionados recommend avoiding in favor of pass-through queries.
I have not been doing that, but it is what the SAS analysts in this environment do - and it's one reason they prefer not to use Stata. I do as much as I can in native SQL and then roll the results up in Stata. But this requires iterating queries over, e.g., calendar year to ensure that the result sets I pull down are manageably small.
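For anyone curious what that year-by-year iteration looks like in practice, here is a minimal sketch in Stata using -odbc load- and -append-. The DSN name, table, and column names are all hypothetical placeholders, not the actual setup described above:

```stata
* Hypothetical sketch: pull one calendar year at a time so each
* result set stays small, then accumulate with -append-.
* "mydsn", "claims", and the column names are made up for illustration.
tempfile results
local first 1
forvalues y = 2005/2012 {
    odbc load, ///
        exec("select id, dx, svcdate from claims where year(svcdate) = `y'") ///
        dsn("mydsn") clear
    if `first' {
        save `results', replace
        local first 0
    }
    else {
        append using `results'
        save `results', replace
    }
}
use `results', clear
```

The filtering happens server-side in the WHERE clause, so Stata never holds more than one year (plus the accumulated file) in memory at once.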
But the first point is an important one: my primary role here is not data analyst; mostly, other analysts use SAS to create datasets that I can then analyze in Stata. And it is likely to stay that way as long as SAS has the edge on data management with large databases.
cheers,
Jeph

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/