Re: st: avoiding StatTransfer: huge / large / big dataset from SAS/ csv


From   Daniel Feenberg <feenberg@nber.org>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: avoiding StatTransfer: huge / large / big dataset from SAS/ csv
Date   Tue, 26 Oct 2004 13:13:19 -0400 (EDT)

On Tue, 26 Oct 2004, Daniel Egan wrote:

> Hello, 
> 
> I am trying to get a ~3 GB .csv dataset into Stata. I don't think it
> will be anywhere near 3 GB once in Stata, but there it is, on my

... 
> 
> I believe I have X options to get the data into Stata, all of which
> are missing a vital step that I am not sure how to do, or have
> available:
> 
> 1) Export the data to csv files from SAS in segments, i.e. the 1st
> million obs, the 2nd million obs, etc. Then import each of these into
> Stata and append. I am not sure how to tell SAS to sort and then export
> based on a criterion, however.

SAS code to go into a data step to keep only observations 1 through
1 million:

if _n_ ge 1 and _n_ le 1000000;

(_n_ is the current record number; an if statement with no action is a
subsetting if, i.e. an implied "keep".)
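
For context, a minimal sketch of how that statement might sit inside a
complete step that writes one segment to its own csv file. The dataset
name "mylib.big" and the output file name "chunk1.csv" are placeholders,
not names from the original question:

```sas
/* Sketch only: mylib.big and chunk1.csv are placeholder names. */
data chunk1;
    set mylib.big;
    /* subsetting if: keeps observations 1 through 1,000,000 */
    if _n_ ge 1 and _n_ le 1000000;
run;

/* Write the segment out as a comma-delimited file. */
proc export data=chunk1
    outfile="chunk1.csv"
    dbms=csv
    replace;
run;
```

For the next segment the bounds would become 1000001 and 2000000, and so
on. The firstobs= and obs= data set options on the set statement select
the same range without reading every record through the if.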

> 
> 2) Do the analogous method in Stata, but using -infile-. The problem
> is that -infile- with [in] requires the data to be in a fixed format.

"help infile1" suggests that "in range" is allowed. Am I confused? You
could also do a selection by variable.

> As far as I know, SAS can only export delimited. If I could export the
> data from SAS in a fixed format, that would work.

SAS code to write in fixed format:

put (a b c ) ( 12.0 9.3 14.2 );

where a, b and c are variable names and 9.3 (for example) specifies a
field 9 columns wide in total, with 3 decimal places. You will probably
want to specify an "lrecl=nnnn" on the "file" statement to have a record
length longer than 132 bytes:

file "bigfile.raw" lrecl=13000;
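
Put together, the two statements might be used in a step like the
following. This is a sketch under assumptions: "mylib.big" is a
placeholder dataset name, and a, b and c stand in for the real variables:

```sas
/* Sketch only: mylib.big is a placeholder dataset name. */
data _null_;                          /* no output dataset is created */
    set mylib.big;
    file "bigfile.raw" lrecl=13000;   /* allow long output records */
    /* a in columns 1-12, b in columns 13-21, c in columns 22-35 */
    put (a b c) (12.0 9.3 14.2);
run;
```

With these widths each record is 35 columns long, so the column
positions can be given directly to a fixed-format reader in Stata, for
example -infix- with the ranges 1-12, 13-21 and 22-35.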

> 
> 3) I have seen various work-arounds in Statalist/FAQs with large
> datasets using ODBC. I do not know anything about ODBC, but if it's the
> only way to go, I will learn.

I can't help with this.

> 
> 4) I know about StatTransfer, but I am not the one making decisions
> about buying new software/licenses, and don't particularly want to go
> through that if I don't have to.

At least get the purchasing process started. Someday you will need it.

> 
> 
> Any guidance, suggestions, or clever responses are very much appreciated. 
> 
> Regards, 
> Dan
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 




