Re: st: import excel
Eric Booth <email@example.com>
Thu, 12 Apr 2012 08:16:57 -0500
On Apr 12, 2012, at 5:22 AM, Jenniffer Solorzano Mosquera wrote:
> Is there any way to import several files from excel with a loop in such a
> way I preserve the original format of the data?
Yes. I would use the techniques discussed in these threads for importing a large number of files in a loop and then converting them to .dta format (and possibly combining/appending the data at the same time):
http://www.stata.com/statalist/archive/2012-04/msg00317.html "st: Re: Concatenate files"
http://www.stata.com/statalist/archive/2010-01/msg00239.html "Re: global macro for system directory?"
http://www.stata.com/statalist/archive/2010-11/msg00111.html "Re: st: repeat same commands over hundreds of files"
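As a rough sketch of the loop approach in those threads (not a tested solution for your files): the folder name "reports", the .xlsx extension, and the variable name source_file are assumptions of mine, and I use the allstring option so that every cell comes in as text, exactly as stored.

```stata
* Sketch only: import every .xlsx file in a folder and stack the results.
* "reports" and "source_file" are hypothetical names; adjust to your setup.
local files : dir "reports" files "*.xlsx"
tempfile combined
local first = 1
foreach f of local files {
    * allstring keeps every column as a string, so nothing is reformatted
    import excel using "reports/`f'", allstring clear
    generate source_file = "`f'"      // remember which report each row came from
    if `first' {
        save `combined'
        local first = 0
    }
    else {
        append using `combined'
        save `combined', replace
    }
}
use `combined', clear
```

Without the firstrow option the variables are named after the Excel columns (A, B, C, ...); add firstrow if your first row holds column headers.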
> I have 100 excel reports of 100 firms each one. The problem is that every
> single row is a firm but the firm has many observations by date in the
> same cell. In excel you can see that with format the information is
> possible to read because they are separated in terms of format but they
> are in the same cell.
> For example:
> This list is information of firm 1, and it is only in one cell:
> This list is for firm 2, in the next row and so on:
> When I import everything appears in only one cell in the dta file. The
> problem is that there is no regular pattern of the variables. Some columns
> are value of investments (not only dates) so you can not separate by
> programming a regular loop to identify the information imported. I can
> identify the dates because they are made from 10 digits always, but the
> same doesn't happen with other type of information.
> Any recommendation?
You first need to check whether all of the data in those long Excel cells make it into Stata when you -import excel- the file. Stata has a string length limit of 244 characters, so if any of those combined cells are longer than that, you'll have truncated data and will need to consider pre-processing the file. You can do that in Excel or possibly with -filefilter- (-help filefilter-).
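One quick way to look for truncation might be something like the following, where A is a hypothetical name for the variable that holds the long cell. A length sitting exactly at the 244-character cap does not prove truncation, but it is a strong hint that you should compare that row against the original workbook.

```stata
* Sketch only: "A" is an assumed variable name for the long cell.
generate cell_len = length(A)
summarize cell_len
* Rows at the cap are candidates for truncated data
count if cell_len == 244
list A if cell_len == 244
```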
If all the data in these long cells are making it into Stata, then your goal is to break each cell into separate variables using either -split- or some combination of string functions (-help string_functions-).
If all your cells looked like the data example you gave, then you could run:
and it would create separate vars from this space-delimited string. However, where you have other delimiters in a cell/string, you'd need to adjust -split-'s parse() option (or use some other string function) accordingly. Finally, if you have different numbers of elements within a string, you'll get different numbers of variables created by -split-, so you'll have to do some work to organize your new variables, but without seeing that data I couldn't guess about the strategy.
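To illustrate the -split- approach in general terms (this is a sketch, with A as a hypothetical variable name and part as a made-up stub for the new variables):

```stata
* Sketch only: "A" and the stub "part" are assumed names.
* How many space-delimited elements does each cell hold?
generate n_parts = wordcount(A)
tabulate n_parts

* Default parsing is on spaces; creates part1, part2, ...
split A, generate(part)
* For another delimiter, e.g. a semicolon, you would instead use:
*   split A, parse(";") generate(part)

* Rough flag for pieces that are 10 characters long, since you say
* your dates always have 10 digits (this checks length only)
foreach v of varlist part* {
    generate byte date_`v' = (length(`v') == 10)
}
```

The tabulation of n_parts shows how uneven the cells are before you decide how to organize the resulting variables.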
Eric A. Booth
Public Policy Research Institute
Texas A&M University