
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



Re: st: Want to load only part of ASCII file


From   Jon Gettman <jgettman@hughes.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Want to load only part of ASCII file
Date   Wed, 21 Jul 2010 10:45:03 -0400

When you open a delimited file in Excel, you get the Text Import Wizard. The third step of the wizard lets you skip selected columns (variables) when loading the file. So one solution is to load the file in Excel, select only the variables you need, and save the result as a tab-delimited text file. Then use the insheet command to load this into Stata. (Make sure your variables in Excel are formatted appropriately as text or numbers, and the format will carry over. If a numeric variable shows up in Stata as a string variable, it's because at least one record contains a non-numeric character.)

If you record this process as a macro in Excel and examine the code, you will discover that the programming is very much like a Stata dictionary. This suggests an alternative course of action: write a dictionary file that references only the columns you wish to load, and then use the infile command to load the ASCII file. The dictionary specifies the column location, name, format, and length of each variable sought. Infile is the real way to go here.
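
To make the dictionary idea concrete, here is a minimal sketch in Python rather than Stata. It is not Stata dictionary syntax; it only illustrates the same principle of describing each wanted field by its column location, name, format, and length, and reading nothing else. The layout, the field names (id, income), and the sample records are hypothetical.

```python
import io

# Hypothetical layout, playing the role of a dictionary file:
# (name, start column counted from 1, width, converter for the format)
LAYOUT = [
    ("id", 1, 4, int),
    ("income", 15, 6, float),
]

def read_fixed_width(lines, layout):
    """Read only the fields named in `layout` from fixed-width records."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        row = {}
        for name, start, width, convert in layout:
            # Slice out the field by position, then apply its format.
            field = line[start - 1:start - 1 + width].strip()
            row[name] = convert(field)
        rows.append(row)
    return rows

# Made-up sample data; the name in columns 6-14 is never read.
sample = io.StringIO(
    "0001 SMITH    012500\n"
    "0002 JONES    009875\n"
)
records = read_fixed_width(sample, LAYOUT)
```

Only the two described fields end up in memory, which is the point of listing just the columns you need.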

Jon Gettman






At 09:27 AM 7/21/2010, you wrote:
Sometimes the best solution isn't a Stata solution; for this kind of
problem I usually invoke some other software. For instance,
Stat/Transfer will convert a .csv file to a Stata file while retaining
only selected variables, and does not require vast RAM. Or, you can
open in Excel, delete a few columns, and save as a new .csv file;
Excel also buffers to disk and hence needs less RAM. (If you go the
Excel route, though, be exceedingly careful, because Excel makes it
too easy to corrupt your data, for instance by sorting a single
column.)
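
As another non-Stata route, here is a minimal sketch in Python that streams a .csv file one row at a time, keeps only selected columns and matching observations, and writes a smaller .csv. The function name filter_csv, the column names, and the sample data are hypothetical; only variable1 and the value xyz come from the original question.

```python
import csv
import io

def filter_csv(src, dst, keep_columns, predicate=None):
    """Copy selected columns (and optionally selected rows) from src to dst.

    Rows are processed one at a time, so memory use does not grow
    with the size of the input file.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst,
        fieldnames=keep_columns,
        extrasaction="ignore",  # silently drop columns not listed
        lineterminator="\n",
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if predicate is None or predicate(row):
            writer.writerow(row)
            kept += 1
    return kept

# Small in-memory demonstration with made-up data.
source = io.StringIO(
    "variable1,age,income,notes\n"
    "xyz,34,52000,a\n"
    "abc,51,61000,b\n"
    "xyz,29,48000,c\n"
)
output = io.StringIO()
kept = filter_csv(
    source,
    output,
    keep_columns=["variable1", "income"],
    predicate=lambda row: row["variable1"] == "xyz",
)
```

With real files, open both the input and the output with newline="" and pass the file objects in place of the StringIO buffers; the reduced file can then be read into Stata with insheet.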

hth,
Jeph


On 7/20/2010 7:38 PM, enewton@notes.cc.sunysb.edu wrote:
Hi,

I'm new to Stata, and new to this listserv. I'd like to load a very
large ASCII .csv file into a .dta Stata file but it keeps bumping up
against the 1.1g memory limit and Stata suggests I "drop some
observations or variables." My question is how one can load just a
portion of an ASCII file--for example, the specific variables of
interest, or only those observations where variable1=xyz--so as to
end up with a more manageable Stata file.

A second question is why I can't set memory higher than about 1.3g,
when I have about 100g of space left on the computer, but that's
probably something my computer and I need to work out between
ourselves.

Thanks.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP