Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: import excel-file too big


From   Eric Booth <[email protected]>
To   "<[email protected]>" <[email protected]>
Subject   Re: st: import excel-file too big
Date   Mon, 22 Aug 2011 16:02:32 +0000

<>

Update:  Initially, I couldn't create an Excel sheet large enough to replicate Ricardo's 'file too big' error before because I had too little memory available for excel to fully load/open the large file I created.  After freeing up some memory, I created a larger .xlsx file and now when I try to use -import excel- I get the same 'file too big' error Ricardo reported. 

Importantly, when I try my second suggestion about importing part of the excel file using the 'cellrange()' option to -import excel- and then piecing that together with append, I can see now that it will not work.  The reason is that if the Excel file is too big to import, Stata cannot import any subset (e.g., even cellrange(A1:B2) fails with a 'file too big' error).  I suppose this probably makes sense if Stata has to load the entire Excel file in order to find some subset -- when Stata initially inspects the file it is either too big or not.

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754





On Aug 22, 2011, at 10:44 AM, Eric Booth wrote:

> <>
> 
> I've never gotten -odbc- to work nicely with the Mac version (I suspect this is due to using 3rd party/free drivers), so I usually use Roger Newson's -stcmd- (from SSC) to automate using the program StatTransfer to convert the excel files to comma/tab delimited files or directly to Stata .dta files in a loop.  
> 
> You might be able to use -import excel- with the 'cellrange()' option to get around the 'file too big' error (?).  You could import the top half of each file and the bottom half and then concatenate them together.  You might have to experiment to find the range where the file becomes to big to import into Stata and then select a cell range that is smaller than this limit. So, in pseudo-code if you had 2 variables:
> 
> 
> global files: "mydir" files "*.xlsx", respectcase
> foreach x of global files {
> 	import excel using "`x'", clear firstrow cellra(A1:B????)
> 	sa "top", replace
> 	import excel using "`x'", clear firstrow cellra(A????+1:B???????) // B??... should be the max rows in all your files (for me it's B1048000)
> 	sa "bottom", replace	
> 	**append top/bottom together**
> 	u "top", clear
> 	append using "bottom"
> 	sa "`x'.dta", replace
> 	}
> 	
> Of course finding the max number of rows and the number of variables in your large excel files becomes another challenge.  I created some test data with 1048000 obs and 50 random vars.  I -outsheet-ed this data to a tab delimited file.  I then opened and saved them in Excel (Office 2011 for Mac OSX) in .xlsx format.  Next, I saved this as xlsx and tried to import to Stata using the command:  
> 
> import excel using "test.xlsx", clear firstrow
> 
> and it imported all the rows /vars without the "file too big" error.  When I tried created a file with more than 1048000 rows, Excel only loaded part of the file (Office 2011 for Mac OSX -- maybe there are different capacities for different versions?), so I apparently cannot create a larger test file with Excel.  
> 
> 
> 
> - Eric
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> [email protected]
> Office: +979.845.6754
> 
> 
> On Aug 22, 2011, at 8:30 AM, Ricardo Ovaldia wrote:
> 
>> 
>> Hello,
>> 
>> I wrote an -do- file that first imports a series of excel sheets into Stata  using the -import- command.
>> However, some of the sheets are too big and I get the corresponding error message "File too big".
>> I can convert the sheet to a tab delimited txt file and use -insheet- to import. That works, but this is tedious because I have lots of files and they will change from time to time.
>> Is there a way (code) to create the  tab delimited txt file from inside the -do- program? Or does anyone have an alternative that can be automated?
>> 
>> Thank you,
>> Ricardo
>> 
>> Ricardo Ovaldia, MS
>> Statistician 
>> Oklahoma City, OK
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index