Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Import csv file


From   "Radu Ban" <raduban@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Import csv file
Date   Mon, 28 Aug 2006 15:20:02 -0400

Nuno,

Here's a try. For this you would need to install the -sxpose- command
(do a -findit sxpose- and then follow the link to install). This is
needed because the built-in -xpose- command doesn't work with strings.

******************************

*reading in the data
insheet using "C:\somedir\myfile.csv", clear

*transposing
sxpose, clear

*renaming the first three variables
foreach v of varlist _var1 _var2 _var3 {
local i = `v'[1]
rename `v' `i'
}

*getting rid of "/" in the dates
*note that after renaming, only the date variables start with "_var"
destring _var*, replace ignore("/") force

*now renaming the remaining variables
foreach v of varlist _var* {
local i = `v'[1]
rename `v' data`i'
}

*dropping the first observation (because this is just the name of the variable)
drop in 1

*reshaping
reshape long data, i(Name) j(date) string


*now you're left with the date in a string format, which you can
convert to a proper *date, in a couple of steps

gen str date_year = substr(date, -4,.)
gen str date_month = substr(date, 3, 2)
gen str date_day = substr(date, 1, 2)

gen truedate = mdy(date_month, date_day, date_year)

*now you can format the date using the preferred date format, for example
format truedate %d
********************

I hope this helps. Also if you want to run this for several .csv
files, check out the archives for last week, there's a thread on how
to loop over all files in a directory.

-Radu Ban

2006/8/28, Nuno Soares <ns.mlists@gmail.com>:
Hi everyone,

I need to import some csv files that have an awkward format. The format is always the same:

Name Company1 Company2 Company3 ...
Code 0000(MV) 0001(MV) 0002(MV) ...
Currency £ £ € ...
31/12/1986 data1 data2 data3 ...
30/01/1987 dataX dataY dataZ ...
...
30/12/2005 dataW dataJ dataK ...

The number of companies is variable (ie, columns), and its width is also variable. Data1, data2, etc. are numeric.

Can you please tell me if it is possible to import to Stata the data described, in the following format:

Company  Code   Currency  Date  Data
Company1 0000(MV) £   31/12/1986 data1
Company1 0000(MV) £   30/01/1987 dataX
...
Company1 0000(MV) £   30/12/2005 dataW
Company2 0001(MV) £   31/12/1986 data2
Company2 0001(MV) £   30/01/1987 dataY
...
Company2 0001(MV) £   30/12/2005 dataJ
Company3 0002(MV) £   31/12/1986 data3
Company3 0002(MV) £   30/01/1987 dataZ
...
Company3 0002(MV) £   30/12/2005 dataK
and so on.

Some treatment on the original data can be done in Excel, but I wanted to avoid it (I have almost one hundred files that I need to import...). If any one has an idea of how to do it, I would be greatly appreciated!

Best,


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index