Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

RE: st: Import csv file


From   "Nuno Soares" <ns.mlists@gmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Import csv file
Date   Mon, 28 Aug 2006 20:56:17 +0100

Thanks a lot Radu! Just had to destring the date_ variables. Here goes the code. Might be useful to other statalisters:

***************** BEGIN CODE ***********************

clear
*reading in the data
insheet using "C:\mv1.csv", clear

*transposing
sxpose, clear

*renaming the first three variables
foreach v of varlist _var1 _var2 _var3 {
local i = `v'[1] 
rename `v' `i'
}

*getting rid of "/" in the dates
*note that after renaming, only the date variables start with "_var"
destring _var*, replace ignore("/") force

*now renaming the remaining variables
foreach v of varlist _var* {
local i = `v'[1]
rename `v' data`i'
}

*dropping the first observation (because this is just the name of the variable) drop in 1

*reshaping
reshape long data, i(Name) j(date) string


*now you're left with the date in a string format, which you can convert to a proper date, in a couple of steps

gen str date_year = substr(date, -4,.)
gen str date_month = substr(date, 3, 2)
gen str date_day = substr(date, 1, 2)
destring  date_year date_month date_day, replace

gen truedate = mdy(date_month, date_day, date_year)
format truedate %d

***************** END CODE ***********************

Cheers,



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Radu Ban
Sent: 28 August 2006 20:20
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Import csv file

Nuno,

Here's a try. For this you would need to install the -sxpose- command (do a -findit sxpose- and then follow the link to install). This is needed because the built-in -xpose- command doesn't work with strings.

******************************

*reading in the data
insheet using "C:\somedir\myfile.csv", clear

*transposing
sxpose, clear

*renaming the first three variables
foreach v of varlist _var1 _var2 _var3 { local i = `v'[1] rename `v' `i'
}

*getting rid of "/" in the dates
*note that after renaming, only the date variables start with "_var"
destring _var*, replace ignore("/") force

*now renaming the remaining variables
foreach v of varlist _var* {
local i = `v'[1]
rename `v' data`i'
}

*dropping the first observation (because this is just the name of the variable) drop in 1

*reshaping
reshape long data, i(Name) j(date) string


*now you're left with the date in a string format, which you can convert to a proper *date, in a couple of steps

gen str date_year = substr(date, -4,.)
gen str date_month = substr(date, 3, 2)
gen str date_day = substr(date, 1, 2)

gen truedate = mdy(date_month, date_day, date_year)

*now you can format the date using the preferred date format, for example format truedate %d
********************

I hope this helps. Also if you want to run this for several .csv files, check out the archives for last week, there's a thread on how to loop over all files in a directory.

-Radu Ban

2006/8/28, Nuno Soares <ns.mlists@gmail.com>:
> Hi everyone,
>
> I need to import some csv files that have an awkward format. The format is always the same:
>
> Name Company1 Company2 Company3 ...
> Code 0000(MV) 0001(MV) 0002(MV) ...
> Currency £ £ € ...
> 31/12/1986 data1 data2 data3 ...
> 30/01/1987 dataX dataY dataZ ...
> ...
> 30/12/2005 dataW dataJ dataK ...
>
> The number of companies is variable (ie, columns), and its width is also variable. Data1, data2, etc. are numeric.
>
> Can you please tell me if it is possible to import to Stata the data described, in the following format:
>
> Company  Code   Currency  Date  Data
> Company1 0000(MV) £   31/12/1986 data1
> Company1 0000(MV) £   30/01/1987 dataX
> ...
> Company1 0000(MV) £   30/12/2005 dataW
> Company2 0001(MV) £   31/12/1986 data2
> Company2 0001(MV) £   30/01/1987 dataY
> ...
> Company2 0001(MV) £   30/12/2005 dataJ
> Company3 0002(MV) £   31/12/1986 data3
> Company3 0002(MV) £   30/01/1987 dataZ
> ...
> Company3 0002(MV) £   30/12/2005 dataK
> and so on.
>
> Some treatment on the original data can be done in Excel, but I wanted to avoid it (I have almost one hundred files that I need to import...). If any one has an idea of how to do it, I would be greatly appreciated!
>
> Best,
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index