
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Re: need help on reading big ASCII file to make a panel data set on temperature across geographic coordinates


From   Christopher Baum <[email protected]>
To   Jose Ramon Albert <[email protected]>
Subject   st: Re: need help on reading big ASCII file to make a panel data set on temperature across geographic coordinates
Date   Wed, 4 Jan 2012 07:47:06 -0500

Jose,

You're right, you need the file command, but not to read the data! Rather, the file command allows me to write an appropriate dictionary file for use with -infix-. That command lets me specify that each observation occupies a block of 37 lines in the file, but I then have to indicate where the variables appear on each record. A little bit of housekeeping arithmetic takes care of that.
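
The housekeeping arithmetic can be checked with a few -display- commands. This is only a sketch; it assumes the layout given in the quoted question (a 2i6 header line, then 36 rows of 72 fields, each e10.3 plus one space) and the 1850-2010 monthly coverage stated there.

```stata
* Lines per observation: 1 header line + 36 data rows (37)
display 1 + 36
* Columns per field: e10.3 plus one separating space (11)
display 10 + 1
* Width of one data row: 72 fields of 11 columns (792)
display 72 * 11
* Number of grid-cell variables, v1 through v2592
display 36 * 72
* Monthly observations from Jan 1850 to Dec 2010 (1932)
display (2010 - 1850 + 1) * 12
```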

For some reason the very last block of records shows as incomplete, perhaps because there is a missing end-of-line on the very last line. I have dropped that observation (Dec 2010). You could probably recover it by adding a blank line to the file and telling -infix- to read 1932 observations.
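
A possible way to try that recovery is sketched here. It is untested and rests on assumptions: that a Unix-style shell is available (as the curl call in this message already presumes), that temp.txt has been downloaded, and that temp.dct has been written by the listing in this message.

```stata
* Append one empty line to the downloaded file (assumes a Unix-style shell)
!echo "" >> temp.txt

* Re-read, this time asking for all 1932 monthly blocks
clear all
infix using temp.dct in 1/1932
mvdecode v*, mv(-1.00e+30)

* If the fix worked, the last observation should be December 2010
list year month if _n == _N
```

If the last block still comes up short, the original 1/1931 range remains the safe choice.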

Kit

-----------------------------
!curl http://dl.dropbox.com/u/308664/hadcrut3v.txt > temp.txt

* (wait for the download to finish before running the rest)

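* Write an -infix- dictionary describing one 37-line block per observation
capt file close tmp
file open tmp using temp.dct, write replace
file write tmp "infix dictionary using temp.txt {" _n "37 lines" _n
* Line 1 of each block holds year and month in two 6-column fields
file write tmp "	1:" _n  "year   1-6" _n "month  7-12" _n
* Lines 2-37 each hold 72 values in 11-column fields, named v1-v2592
loc v 0
forv i=2/37 {
	file write tmp "`i':" _n
	loc f 1
	loc n 11
	forv j=1/72 {
		loc v = `v' + 1
		file write tmp "v`v' `f'-`n'" _n
		loc f = `f' + 11
		loc n = `n' + 11
	}
}
file write tmp "}" _n
file close tmp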

clear all
infix using temp.dct in 1/1931
mvdecode v*, mv(-1.00e+30)
su
--------------------------------
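
Once the data are in memory, a monthly date variable and the grid position of any v-variable follow from the same arithmetic. The sketch below assumes the row and column ordering given in the quoted question (row 1 is 85-90N, column 1 is 180W-175W, cells 5 degrees wide). The names mdate, k, row, col, lat and lon are illustrative only, and the cell-centre coordinates with west longitudes as negative are an assumed convention.

```stata
* Assumes the data set created by the listing above is in memory
gen mdate = ym(year, month)
format mdate %tm

* Variable v`k' holds grid row ceil(k/72) and column k - 72*(row - 1)
local k = 100
local row = ceil(`k'/72)
local col = `k' - 72*(`row' - 1)

* Cell centres: latitude runs north to south, longitude west to east
local lat = 87.5 - 5*(`row' - 1)
local lon = -177.5 + 5*(`col' - 1)
display "v`k': row `row', column `col', centre lat `lat', lon `lon'"
```

Changing the value of k locates any other variable between v1 and v2592.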


		

Kit Baum   |   Boston College Economics & DIW Berlin   |   http://ideas.repec.org/e/pba1.html
                             An Introduction to Stata Programming  |   http://www.stata-press.com/books/isp.html
  An Introduction to Modern Econometrics Using Stata  |   http://www.stata-press.com/books/imeus.html

On Jan 3, 2012, at 10:33 PM, Jose Ramon Albert wrote:

> ---------- Forwarded message ----------
> From: Jose Ramon Albert <[email protected]>
> Date: Wed, Jan 4, 2012 at 9:01 AM
> Subject: need help on reading big ASCII file to make a panel data set
> on temperature across geographic coordinates
> To: [email protected]
> 
> 
> i have a big TEMPERATURE data set
> 
> http://dl.dropbox.com/u/308664/hadcrut3v.txt
> 
> that pertains to temperature data for 2592 = 36 x 72 geographic
> coordinates across
> months from the years 1850 to 2010
> 
> for year = 1850 to 2010
>  for month = 1 to 12
>   format(2i6) year, month
>   for row = 1 to 36 (85-90N,80-85N,75-80N,...75-80S,80-85S,85-90S)
>    format(72(e10.3,1x)) 180W-175W,175W-170W,...,175-180E
> 
> the first row in the txt file prior to the actual data,
> 
>  1850     1     1    36 rows     72 columns. Missing=-1.000e+30
> 
> announces that the data following it are for month 1 (jan) 1850, which i
> would like to ignore, and then the next line
> again has a description
> 
>  1850     2     1    36 rows     72 columns. Missing=-1.000e+30
> 
> which is then succeeded by the feb 1850 data... and this goes on and on
> to Dec 2010.
> 
> i need to construct a panel database that will look like this:
> 
> Year Month  Var1 Var 2 ... Var2592
> 1850   1       DATA read from file
> 1850   2 ..... (Data read from file)
> 
> etc.
> 
> 
> can anyone help me out ? i understand this may need the use of
> the file, read command, but i have not used this before...
> grateful for help.
> 
> please respond directly to [email protected]
> 
> thanks
> 
> Jose Ramon Albert
> Manila, Philippines


