Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: import excel - requiring restart occasionally


From   Billy Schwartz <wkschwartz@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: import excel - requiring restart occasionally
Date   Wed, 2 Nov 2011 20:31:35 -0400

describe" eventually lead to increasing numbers of error 603's when
running the import command. Restarting Stata seems to fix it.

Facts: I have 947 Excel workbooks, both .xls and .xlsx, averaging just
under 3 sheets per book, and I have to organize them. As a first step
I've written a Stata script to create a table of the sheet names in
each workbook. It loops through the *.xls* file names in my folder and
runs "import excel, describe" on each, putting the contents returned
in r() into a data set that accumulates. I open up Stata and run the
script, and it works as intended with 14 error 603s, which I catch and
handle. (So you don't have to look it up, error 603 means the file
could not be opened even though it was found. I don't know why these
error 603's occur since Excel opens the file just fine, but I'm
comfortable with a 1% error rate.) During the debugging process, I've
had to run the script repeatedly in a given instance of Stata. After
three to five runs, the number of error 603's I get goes from 14 to a
couple hundred. One or two more runs gives me an error 603 on each
iteration of the loop, leaving my dataset empty. After restarting
Stata, the problem goes away and I'm back to 14 error 603's. I'm doing
all this on a server, so I've tried reading the data from a different
server over the network and locally on this server. Same pattern both
times.

Question: Has anyone had a similar problem? Can anyone replicate this?
The code below the fold is a skeleton of the algorithm I'm using in
case anyone wants to try. If everyone's out of ideas, I guess I'll
submit this to Stata as a bug report.

Technical details in case it matters: My copy of Stata 12/MP (4-core
license, born on 13oct2011) runs on Windows Server Enterprise 2007 SP1
(64bit, 32GB RAM, four 4-core Xeons @ 1.87Ghz).

*! stata
version 12
clear
local directory "." //replace as appropriate
local files: dir "`directory'" files "*.xls*", respectcase
generate book_name = ""
generate int error = .
generate sheet_name = ""
generate sheet_range = ""
foreach file of local files {
	capture import excel using "`directory'/`file'", describe
	if c(rc) == 603 {
		set obs `=c(N)+1'
		replace book_name = "`file'" in `c(N)'
		replace error = c(rc) in `c(N)'
		continue
	}
	else error c(rc)
	forvalues i = 1/`r(N_worksheet)' {
		set obs `=c(N)+1'
		replace book_name = "`file'" in `c(N)'
		replace error = c(rc) in `c(N)'
		replace sheet_name = r(worksheet_`i') in `c(N)'
		replace sheet_range = "`r(range_`i')'" in `c(N)' //range may be missing
	}
}
count if error //number of errors
*end
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index