Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: load EXCEL to STATA


From   Phil Schumm <pschumm@uchicago.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: load EXCEL to STATA
Date   Thu, 1 Apr 2010 07:29:00 -0500

On Mar 31, 2010, at 10:16 PM, G. Dai wrote:
I have a very big data set which is consist of hundreds of smaller EXCEL table sheet. Each EXCEL table sheet actually includes many county observations in one state.

To read the EXCEL in my STATA, I need first OPEN the EXCEL file and then SAVE it as CSV form. However, this easy task seems very time consuming since hundreds of such sheet need to be open and save!

So, is there any easy to read the EXCEL sheet without saving it into CSV form? Last, the EXCEL sheet actually includes a title and short description notes at the beginning of the sheet, which is very annoying.


As Martin suggested, Stat/Transfer would make short work of this, including giving you the ability to exclude the title and description at the beginning of each sheet. Stat/Transfer is a great program, and should be part of any data analyst's toolkit.

That said, another possibility you might consider is the xlrd module available for Python (http://pypi.python.org/pypi/xlrd). This will permit you to work with the data (and even the formatting) in Excel files without having to deal with Microsoft's Component Object Model (i.e., you can use it on any platform). A good tutorial is available here:

    http://www.python-excel.org/

For example, to read all of the sheets in a workbook and write them to tab-delimited files (which you could then -insheet- in Stata), you could do something like this:


import xlrd, csv

workbook = xlrd.open_workbook('workbook.xls')
for name in workbook.sheet_names():
    sheet = workbook.sheet_by_name(name)
    data = []
    for row in range(sheet.nrows):
        data.append(sheet.row_values(row))
    outfile = open('%s.txt' % name, mode='w')
    writer = csv.writer(outfile, dialect='excel-tab')
    writer.writerows(data)
    outfile.close()


Personally, I use Stat/Transfer whenever possible, but occasionally use xlrd when (1) the format of the Excel workbook is funky enough that Stat/Transfer can't handle it, or (2) I have to provide a solution to someone who doesn't own Stat/Transfer.


-- Phil

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index