Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: RE: --insheet--reading an excel file where row 1,2 are variable names and labels


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: --insheet--reading an excel file where row 1,2 are variable names and labels
Date   Thu, 23 Sep 2010 16:09:30 +0100

In essence, Stata, meaning -insheet-, doesn't know about your private convention of using your second row for extra information. That is, -insheet- works on the basis that at most the first line includes metadata, as is stated, or at least clearly implied, in the help. A corollary is that the second line is always treated as data, and -insheet- does not include options to override that. 

That could be regarded as a weakness of -insheet-, but the whole business is a slippery slope. For example, many people include extra blank lines in their worksheets, or add text annotations to certain cells, and so on, and so forth. The simplest and ultimately most practical line to take is that Stata is not MS Excel and does not purport to understand your spreadsheet data in the way that you do. Stata puts the responsibility on users to work out what management they need to undertake. 

See 

http://www.stata.com/support/faqs/data/newexcel.html

for more context. On a personal note, my notional authorship of this FAQ does not signal any kind of expertise in MS Excel, a program I try to avoid to the maximum extent possible. Rather, people who use both MS Excel and Stata came to me with all sorts of little problems and in self-defence I started to work out and/or collate some solutions. 

In terms of your specific query, your alternatives include 

1. Writing a program optimised for your specific spreadsheet conventions. That demands some programming skill and will only be a good idea if you have lots of such files. 

2. Surgery in Stata, including use of -destring-, after import and looping over the values in observation 2 to define variable labels.

3. Use of some other import command. 

Nick 
n.j.cox@durham.ac.uk 

Amanda Fu

I got a question when I tried to read an excel file (.csv) into Stata.
In the excel file, the first row and the second row are for the
variable name and labels respectively. But when using --insheet
using---, Stata renamed the variables and the labels are not shown.
Then my question is: is there any easy way to let Stata name the
variables using first row and define labels using the second row when
reading a data set from excel file?

------------------------------------------------------------EXCEL file
name	state	age	a11                     a12
name	state	age	interview type       household NO.
AJ	MA	21	a                           001
.........
------------------------------------------------------------
After using --insheet---, the stata data file looks like this:
-------------------------------------------------------------census_variables_copy
var1        var2            var3          var4                     var5
name	state	age	a11                     a12
name	state	age	interview type       household NO.
AJ	MA	21	a                           001
.........
------------------------------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index