Re: st: import excel-file too big


From   Hua Peng <hpeng@stata.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: import excel-file too big
Date   Wed, 24 Aug 2011 12:05:13 -0500

Ricardo Ovaldia asked how to deal with large Excel files:

>
>I wrote an -do- file that first imports a series of excel sheets into
>Stata  using the -import- command.
>
>However, some of the sheets are too big and I get the corresponding
>error message "File too big".
>

There is an undocumented setting

        set excelxlsxlargefile on

which allows -import excel- to bypass the file-size check.
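
As a minimal sketch of how this might look in a do-file (the file
name, sheet name, and output name are placeholders, not taken from
Ricardo's post):

        * lift the size check before importing the large workbook
        set excelxlsxlargefile on
        import excel using "bigfile.xlsx", sheet("Sheet1") firstrow clear
        * save in Stata format so the slow Excel load is needed only once
        save "bigfile", replace

Saving the result as a .dta file means later runs can -use- it
directly instead of parsing the workbook again.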

Ricardo should be warned, however, that the library we use to
import Excel files has a large memory footprint when it handles
large files in the newer XML-based xlsx format.  The library also
currently offers no way for the user to interrupt it while it is
loading an Excel file.  Hence, if Ricardo's do-file attempts to load
a large xlsx file, his Stata session will become unresponsive until
the load finishes, and he will not be able to break out with the
Break button during that time.

Eric Booth observed:
>
>Importantly, when I try my second suggestion about importing part of
>the excel file using the cellrange() option to -import excel- and
>then piecing that together with append, I can see now that it will not
>work.  The reason is that if the Excel file is too big to import,
>Stata cannot import any subset (e.g., even cellrange(A1:B2) fails
>with a 'file too big' error).  I suppose this probably makes sense
>if Stata has to load the entire Excel file in order to find some
>subset -- when Stata initially inspects the file it is either too
>big or not.
>

Eric is right.  The library requires that the entire Excel file be
loaded and parsed, even when only a subset of cells is requested.
Thus, the most time-consuming and resource-intensive part is in
fact the initial loading process.


Subrata Bhattacharyya suggested using -odbc-, which should work.
If Ricardo needs assistance with -odbc-, he can contact Stata
Technical Services via email (tech-support@stata.com).
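
A rough sketch of the -odbc- route follows.  It assumes a Windows
machine with the Microsoft Excel ODBC driver installed and its
default "Excel Files" data source available; the path and sheet
name are placeholders, and the details may differ on other setups:

        * list the data sources Stata can see
        odbc list
        * show the tables (worksheets) in the workbook
        odbc query "Excel Files;DBQ=C:\data\bigfile.xlsx"
        * load one worksheet; the driver names sheets with a trailing $
        odbc load, table("Sheet1$") dsn("Excel Files;DBQ=C:\data\bigfile.xlsx") clear

Because the ODBC driver reads the workbook, this route does not go
through the -import excel- size check.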


        Hua
        --hpeng@stata.com

