Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: RE: st: Appending several excel data sets into one


From   "Sergiy Radyakin" <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: RE: st: Appending several excel data sets into one
Date   Mon, 17 Sep 2007 14:52:37 +0200

Excel does not provide a facility to access sheet names?????????
Download utilities????
What are we talking about?
------------------------------------------------------------------------------------------
Public Sub Liste()
    ActiveWorkbook.Sheets.Add
    i = 1
    For Each Sheet In ActiveWorkbook.Sheets
        ActiveSheet.Cells(i, 1).Select
        If ActiveSheet.Name = ActiveWorkbook.Sheets(i).Name Then
          Selection.Value = "Index"
          Selection.Font.Bold = 1
        Else
          Selection.Value = ActiveWorkbook.Sheets(i).Name
        End If
        i = i + 1
    Next
End Sub
------------------------------------------------------------------------------------------
This will create an index sheet in an arbitrary Excel file.
(No external utilities or black magic required).
Will probably work on every Excel version from the late 1980s or so...
Couple of lines can obviously be removed, but are left for clarity.


As for the types of variables, you can simply declare cells of having
cirtain type, e.g. numeric, or strings, or dates in exactly the same
way you will do it in any other system. Since Stata supports only two
types of variables, you do not see it very often explicitly, unless
you want to create a new variable. (oh, yes, yes, there are date
variables in Stata 10 too.)  However istead of directly specifiying
the type, you rely on the "The DaVinci Code of the Missings":

gen a = .
gen b = ""

so having seen this, any new-comer will be puzzled by the following line:

gen c = "."

What is the variable type? Is it assigned missing values ? How can I be sure?



Excel will do a good job as long as it is programmed to do it. Just
switch off all automatic "bells and whistles"  and tell it what you
want to do in a program (same as you do in Stata).

At least when it comes to programming, Excel is much superrior to
Stata with its Yeti-like macros found in no other interpreters aside
assembler.

I have originally suggested to append the sheets in Excel, and after
following the responces with StatTransfer etc. still think it is not a
bad idea.

Best regards,
   Sergiy Radyakin





On 9/18/07, Joseph Coveney <jcoveney@bigplanet.com> wrote:
> Allan Reese wrote:
>
> Couple of comments on original question "How can one append several Excel
> sheets in one stata data set in a simple program ..."
>
> 1. With great care!  Excel is so easy to use because it doesn't impose
> typing, and using multiple sheets just increases the problem.  Unless the
> sheets were very well supported by VBA and macros (about as common as Yeti
> or Loch Ness monster), you'll find that apparently similar variables are
> numeric in one file, strings (of varying lengths) in another, or a mixture
> of numbers with some strings that read into Stata as missing.  Dates are a
> nightmare.  My experience has been that Excel sheets have very little chance
> of doing a clean join.
>
> 2. If workbooks contain multiple sheets, you would like to access the sheet
> names.  Excel doesn't provide this facility, but an add-in called
> ASAP-utilities http://www.asap-utilities.com/ includes one to create an
> index sheet listing all sheets in the workbook.  We used this to write
> macros to copy data from multiple sheets within a book into one sheet for
> export.
>
> --------------------------------------------------------------------------------
>
> No disagreement about the challenges involved in receiving data in the form
> of Excel workbooks, but there are a couple of things that make it a little
> easier.
>
> First, you can force Excel to provide you with everything in a mixed-type
> column as a string.  This will avoid mixed types being read into Stata
> inappropriately as missing.  See
> http://dailydoseofexcel.com/archives/2004/06/03/external-data-mixed-data-types/
> or
> http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/
> for how; scroll down the Web page until you see the discussion about ODBC,
> which Stata uses.  (The technique is shown in the do-file below.)
>
> Second, Stata's -odbc query- will provide a list of worksheets in an Excel
> workbook, and you can retrieve the information in a round-about manner for
> use in looping through multiple sheets in a workbook when the names of the
> sheets are unknown beforehand.  This is also shown below in the do-file.
> (The same general workaround can also be used in retrieving variable name
> and storage type information from -describe using-, one of the earlier
> questions on the list this past week.)
>
> The probe workbook ("Test.xls") contains two worksheets with two columns
> (column_1 and column_2), each of mixed data types (string, numeric, date).
> I cannot attach the workbook, but you can see its contents in the -list-
> output below the do-file.  The do-file appends all of the data in both
> worksheets into one Stata dataset, all as strings.  Once the data are safely
> ensconced in Stata, you can -destring- or whatever, as needed.
>
> Again, no contention about the data-management abuses facilitated by Excel,
> but like the original poster, I'm often called upon to receive data in that
> format despite my druthers.
>
> Joseph Coveney
>
> clear *
> set more off
> tempfile tmpfil0
> *
> local dsn Excel Files;DBQ=Test.xls;DefaultDir=F:/;
> *
> * Retrieving the names of workbooks into a macro
> *
> quietly log using `tmpfil0', text
> odbc query "`dsn'"
> quietly log close
> *
> infix str244 a 1-244 using `tmpfil0', clear
> generate byte keep = sum(strpos(a, "---") == 1)
> keep if keep == 1
> drop in 1
> compress
> levelsof a, local(spreadsheets)
> *
> * Looping through the list, appending
> *
> * Putting some of the long SQL statement
> * into a macro in order to avoid wrapping
> * in the e-mail to the list
> local prefix [Excel 8.0;HDR=YES;IMEX=1;Database=F:\Test.xls;].
> *
> local sheet 0
> foreach spreadsheet of local spreadsheets {
>    odbc load, ///
>  exec("SELECT * FROM `prefix'[`spreadsheet'];") dsn("`dsn'") clear
>    generate byte sheet = `++sheet'
>    if (`sheet' == 1) {
>        save `tmpfil0', replace
>    }
>    else {
>        append using `tmpfil0'
>        save `tmpfil0', replace
>    }
> }
> erase `tmpfil0'
> list, noobs string(20) abbreviate(15) sepby(sheet)
> exit
>
> Output from the do-file using the probe Excel workbook, Test.xls
>
> . list, noobs string(20) abbreviate(15) sepby(sheet)
>
>  +-------------------------------------------+
>  |     column_1             column_2   sheet |
>  |-------------------------------------------|
>  |           .a        Hello, world!       2 |
>  |           .b                    4       2 |
>  |           .c          01-May-1969       2 |
>  |         7890                   .r       2 |
>  |-------------------------------------------|
>  |        12345   no stinkin' badges       1 |
>  | my fair lady           2007-09-17       1 |
>  +-------------------------------------------+
>
> . exit
>
> end of do-file
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index