[D] import excel -- Import and export Excel files
Syntax
Load an Excel file
import excel [using] filename [, import_excel_options]
Load subset of variables from an Excel file
import excel extvarlist using filename [, import_excel_options]
Describe contents of an Excel file
import excel [using] filename, describe
Save data in memory to an Excel file
export excel [using] filename [if] [in] [, export_excel_options]
Save subset of variables in memory to an Excel file
export excel [varlist] using filename [if] [in] [,
export_excel_options]
import_excel_options Description
-------------------------------------------------------------------------
sheet("sheetname") Excel worksheet to load
cellrange([start][:end]) Excel cell range to load
firstrow treat first row of Excel data as
variable names
case(preserve|lower|upper) preserve the case (the default) or
read variable names as lowercase
or uppercase when using firstrow
allstring[("format")] import all Excel data as strings;
optionally, specify the numeric
display format
clear replace data in memory
locale("locale") specify the locale used by the
workbook; has no effect on
Microsoft Windows
-------------------------------------------------------------------------
allstring("format") and locale() do not appear in the dialog box.
export_excel_options Description
-------------------------------------------------------------------------
Main
sheet("sheetname") save to Excel worksheet
cell(start) start (upper-left) cell in Excel to
begin saving to
sheetmodify modify Excel worksheet
sheetreplace replace Excel worksheet
firstrow(variables|varlabels) save variable names or variable
labels to first row
nolabel export values instead of value
labels
keepcellfmt when writing data, preserve the cell
style and format of existing
worksheet
replace overwrite Excel file
Advanced
datestring("datetime_format") save dates as strings with a
datetime_format
missing("repval") save missing values as repval
locale("locale") specify the locale used by the
workbook; has no effect on
Microsoft Windows
-------------------------------------------------------------------------
locale() does not appear in the dialog box.
extvarlist specifies variable names of imported columns. An extvarlist
is one or more of any of the following:
varname
varname=columnname
Example: import excel make mpg weight price using auto.xlsx, clear
imports columns A, B, C, and D from the Excel file auto.xlsx.
Example: import excel make=A mpg=B price=D using auto.xlsx, clear
imports columns A, B, and D from the Excel file auto.xlsx.
Column C and any columns after D are skipped.
Menu
import excel
File > Import > Excel spreadsheet (*.xls;*.xlsx)
export excel
File > Export > Data to Excel spreadsheet (*.xls;*.xlsx)
Description
import excel loads an Excel file, also known as a workbook, into Stata.
import excel filename, describe lists available sheets and ranges of an
Excel file. export excel saves data in memory to an Excel file. Excel
1997/2003 (.xls) files and Excel 2007/2010 (.xlsx) files can be imported,
exported, and described using import excel, export excel, and import
excel, describe.
import excel and export excel are supported on Windows, Mac, and Linux.
import excel and export excel look at the file extension, .xls or .xlsx,
to determine which Excel format to read or write.
For performance, import excel imposes a size limit of 40 MB for Excel
2007/2010 (.xlsx) files. Be warned that importing large .xlsx files can
severely affect your machine's performance.
import excel auto first looks for auto.xls and then looks for auto.xlsx
if auto.xls is not found in the current directory.
The default file extension for export excel is .xls if a file extension
is not specified.
Options for import excel
sheet("sheetname") imports the worksheet named sheetname in the workbook.
The default is to import the first worksheet.
cellrange([start][:end]) specifies a range of cells within the worksheet
to load. start and end are specified using standard Excel cell
notation, for example, A1, BC2000, and C23.
firstrow specifies that the first row of data in the Excel worksheet
consists of variable names. This option cannot be used with
extvarlist. firstrow uses the first row of the cell range for
variable names if cellrange() is specified. import excel translates
the names in the first row to valid Stata variable names. The
original names in the first row are stored unmodified as variable
labels.
case(preserve|lower|upper) specifies the case of the variable names read
when using the firstrow option. The default is case(preserve),
meaning to preserve the variable name case. Only the ASCII letters
in names are changed to lowercase or uppercase. Unicode characters
beyond ASCII range are not changed.
allstring[("format")] forces import excel to import all Excel data as
string data. You can specify the numeric display format used to
convert the numeric data to string using the optional argument
format. See [D] format.
clear clears data in memory before loading data from the Excel workbook.
The following option is available with import excel but is not shown in
the dialog box:
locale("locale") specifies the locale used by the workbook. You might
need this option when working with extended ASCII character sets.
This option has no effect on Microsoft Windows. The default locale
is UTF-8.
Options for export excel
+------+
----+ Main +-------------------------------------------------------------
sheet("sheetname") saves to the worksheet named sheetname. If there is
no worksheet named sheetname in the workbook, a new sheet named
sheetname is created. If this option is not specified, the first
worksheet of the workbook is used.
cell(start) specifies the start (upper-left) cell in the Excel worksheet
to begin saving to. By default, export excel saves starting in the
first row and first column of the worksheet.
sheetmodify exports data to the worksheet without changing the cells
outside the exported range. sheetmodify cannot be combined with
sheetreplace or replace.
sheetreplace clears the worksheet before the data are exported to it.
sheetreplace cannot be combined with sheetmodify or replace.
firstrow(variables|varlabels) specifies that the variable names or the
variable labels be saved in the first row in the Excel worksheet.
The variable name is used if there is no variable label for a given
variable.
nolabel exports the underlying numeric values instead of the value
labels.
keepcellfmt specifies that, when writing data, export excel should
preserve the existing worksheet's cell style and format. By default,
export excel does not preserve a cell's style or format.
replace overwrites an existing Excel workbook. replace cannot be
combined with sheetmodify or sheetreplace.
+----------+
----+ Advanced +---------------------------------------------------------
datestring("datetime_format") exports all datetime variables as strings
formatted by datetime_format. See [D] datetime display formats.
missing("repval") exports missing values as repval. repval can be either
string or numeric. Without specifying this option, export excel
exports the missing values as empty cells.
The following option is available with export excel but is not shown in
the dialog box:
locale("locale") specifies the locale used by the workbook. You might
need this option when working with extended ASCII character sets.
The default locale is UTF-8.
Remarks/Examples
To demonstrate the use of import excel and export excel, we will first
load auto.dta and export it as an Excel file named auto.xls:
. webuse auto
(1978 Automobile Data)
. export excel auto, firstrow(variables)
file auto.xls saved
Now we can import from the auto.xls file we just created, telling Stata
to clear the current data from memory and to treat the first row of the
worksheet in the Excel file as variable names:
. import excel auto.xls, firstrow clear
. describe
Contains data
obs: 74
vars: 12
size: 3,922
-----------------------------------------------------------------------
storage display value
variable name type format label variable label
-----------------------------------------------------------------------
make str17 %17s make
price int %10.0g price
mpg byte %10.0g mpg
rep78 byte %10.0g rep78
headroom double %10.0g headroom
trunk byte %10.0g trunk
weight int %10.0g weight
length int %10.0g length
turn byte %10.0g turn
displacement int %10.0g displacemen
gear_ratio double %10.0g gear_ratio
foreign str8 %9s foreign
-----------------------------------------------------------------------
Sorted by:
Note: Dataset has changed since last saved.
We can also import a subrange of the cells in the Excel file:
. import excel auto.xls, cellrange(:D70) firstrow clear
. describe
Contains data
obs: 69
vars: 4
size: 1,449
-----------------------------------------------------------------------
storage display value
variable name type format label variable label
-----------------------------------------------------------------------
make str17 %17s make
price int %10.0g price
mpg byte %10.0g mpg
rep78 byte %10.0g rep78
-----------------------------------------------------------------------
Sorted by:
Note: Dataset has changed since last saved.
Both .xls and .xlsx files are supported by import excel and export excel.
If a file extension is not specified with export excel, .xls is assumed,
because this format is more common and is compatible with more
applications that also can read from Excel files. To save the data in
memory as a .xlsx file, specify the extension:
. webuse auto, clear
(1978 Automobile Data)
. export excel auto.xlsx
file auto.xlsx saved
To export a subset of variables and overwrite the existing auto.xls Excel
file, specify a variable list and the replace option:
. export excel make mpg weight using auto, replace
file auto.xls saved
Video example
Import Excel data into Stata
Technical note: Excel data size limits
For an Excel .xls-type workbook, the worksheet size limits are 65,536
rows by 256 columns. The string size limit is 255 characters.
For an Excel .xlsx-type workbook, the worksheet size limits are 1,048,576
rows by 16,384 columns. The string size limit is 32,767 characters.
Technical note: Dates and times
Excel has two different date systems, the "1900 Date System" and the
"1904 Date System". Excel stores a date and time as an integer
representing the number of days since a start date plus a fraction of a
24-hour day.
In the 1900 Date System, the start date is 00Jan1900; in the 1904 Date
System, the start date is 01Jan1904. In the 1900 Date System, there is
another artificial date, 29feb1900, besides 00Jan1900. import excel
translates 29feb1900 to 28feb1900 and 00Jan1900 to 31dec1899.
See Using dates and times from other software in [D] datetime for a
discussion of the relationship between Stata datetimes and Excel
datetimes.
Technical note: Mixed data types
Because Excel's data type is cell based, import excel may encounter a
column of cells with mixed data types. In such a case, the following
rules are used to determine the variable type in Stata of the imported
column.
If the column contains at least one cell with nonnumerical text, the
entire column is imported as a string variable.
If an all-numerical column contains at least one cell formatted as a
date or time, the entire column is imported as a Stata date or
datetime variable. import excel imports the column as a Stata date
if all date cells in Excel are dates only; otherwise, a datetime is
used.
Stored results
import excel filename, describe stores the following in r():
Macros
r(N_worksheet) number of worksheets in the Excel workbook
r(worksheet_#) name of worksheet # in the Excel workbook
r(range_#) available cell range for worksheet # in the Excel
workbook