
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: SOLUTION to reading big ASCII file to make a panel data set on temperature across geographic coordinates


From   Jose Ramon Albert <jrgalbert@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: SOLUTION to reading big ASCII file to make a panel data set on temperature across geographic coordinates
Date   Wed, 4 Jan 2012 21:48:13 +0800

my thanks to everyone who gave suggestions. each was useful, but
i found kit's suggestion below the most helpful.

after applying his suggestion, i ran the following set of commands to
rename the temperature variables and compute an average annual
temperature reading for each geographic coordinate.

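* v1-v2592 are stored row by row: i = row (latitude band), j = column (longitude band)
forval i=1/36 {
    forval j=1/72 {
        loc k = (`i'-1)*72 + `j'
        rename v`k' temp`i'_`j'
    }
}
* average the monthly readings within each year
collapse (mean) temp1_1-temp36_72, by(year)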
save globaltemp, replace
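
The index arithmetic in the rename loop can be checked on its own. The following is a minimal sketch, not part of the original post, that inverts k = (i-1)*72 + j for a few values of k and prints the variable name each one maps to. The chosen values of k are only illustrative.

```stata
* Sketch: recover row i and column j from the running index k
foreach k of numlist 1 72 73 2592 {
    local i = ceil(`k'/72)
    local j = `k' - 72*(`i' - 1)
    display "v`k' -> temp`i'_`j'"
}
* v1 -> temp1_1
* v72 -> temp1_72
* v73 -> temp2_1
* v2592 -> temp36_72
```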

thanks again.

jose ramon albert
manila, philippines
----------------------


---------- Forwarded message ----------
From: Christopher Baum <kit.baum@bc.edu>
Date: Wed, Jan 4, 2012 at 8:47 PM
Subject: Re: need help on reading big ASCII file to make a panel data
set on temperature across geographic coordinates
To: Jose Ramon Albert <jrgalbert@gmail.com>
Cc: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>


Jose,

You're right, you need the file command, but not to read the data!
Rather, the file command allows me to write an appropriate dictionary
file for use with -infix-. That command will allow me to specify that
an observation is contained in a block of 37 lines in the file, but I
then have to indicate where the variables appear on each record. A
little bit of housekeeping arithmetic takes care of that.
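
The housekeeping arithmetic can be illustrated separately. This is a sketch that assumes the record layout given in the original question: 72 fields per data line, each 10 characters wide plus a 1-character separator, so 11 columns per field. Under that assumption, field j occupies columns 11*(j-1)+1 through 11*j.

```stata
* Sketch: column range of the j-th field on a data line (11 characters per field)
foreach j of numlist 1 2 72 {
    local f = 11*(`j' - 1) + 1
    local n = 11*`j'
    display "field `j': columns `f'-`n'"
}
* field 1: columns 1-11
* field 2: columns 12-22
* field 72: columns 782-792
```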

For some reason the very last block of records shows as incomplete,
perhaps because there is a missing end-of-line on the very last line.
I have dropped that (Dec 2010) obs. You could probably recover it by
adding a blank line to the file and telling it to read 1932 obs.

Kit

-----------------------------
!curl http://dl.dropbox.com/u/308664/hadcrut3v.txt > temp.txt

(wait for the download)

capt file close tmp
file open tmp using temp.dct, write replace
file write tmp "infix dictionary using temp.txt {" _n "37 lines" _n
file write tmp "        1:" _n  "year   1-6" _n "month  7-12" _n
loc v 0
forv i=2/37 {
       file write tmp "`i':" _n
       loc f 1
       loc n 11
       forv j=1/72 {
               loc v = `v' + 1
               file write tmp "v`v' `f'-`n'" _n
               loc f = `f' + 11
               loc n = `n' + 11
       }
}
file write tmp "}" _n
file close tmp

clear all
infix using temp.dct in 1/1931
mvdecode v*, mv(-1.00e+30)
su
--------------------------------
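
The recovery of the Dec 2010 observation that Kit describes as probably possible could be sketched as follows. This is untested against the actual file. It assumes temp.txt and temp.dct already exist as created above, and the handle name fix is arbitrary.

```stata
* Sketch only: append a newline to temp.txt, then try to read all 1932 blocks
capture file close fix
file open fix using temp.txt, write append text
file write fix _n
file close fix

clear all
infix using temp.dct in 1/1932
mvdecode v*, mv(-1.00e+30)
* 161 years (1850-2010) x 12 months = 1932 observations expected
count
```

If the last block still reads as incomplete, the original -in 1/1931- restriction remains the safe fallback.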




Kit Baum   |   Boston College Economics & DIW Berlin   |   http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming   |   http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata   |   http://www.stata-press.com/books/imeus.html

On Jan 3, 2012, at 10:33 PM, Jose Ramon Albert wrote:

> ---------- Forwarded message ----------
> From: Jose Ramon Albert <jrgalbert@gmail.com>
> Date: Wed, Jan 4, 2012 at 9:01 AM
> Subject: need help on reading big ASCII file to make a panel data set
> on temperature across geographic coordinates
> To: statalist@hsphsun2.harvard.edu
>
>
> i have a big TEMPERATURE data set
>
> http://dl.dropbox.com/u/308664/hadcrut3v.txt
>
> that contains monthly temperature data for 2592 = 36 x 72 geographic
> coordinates, from 1850 to 2010
>
> for year = 1850 to 2010
>  for month = 1 to 12
>   format(2i6) year, month
>   for row = 1 to 36 (85-90N,80-85N,75-80N,...75-80S,80-85S,85-90S)
>    format(72(e10.3,1x)) 180W-175W,175W-170W,...,175-180E
>
> the first row in the txt file prior to the actual data,
>
>  1850     1     1    36 rows     72 columns. Missing=-1.000e+30
>
> announces that the data following it are for month 1 (jan) 1850; i
> would like to ignore this header line. the next header line
> again has a description
>
>  1850     2     1    36 rows     72 columns. Missing=-1.000e+30
>
> which is then followed by the feb 1850 data... and this goes on and on
> to Dec 2010.
>
> i need to construct a panel database that will look like this:
>
> Year Month  Var1 Var 2 ... Var2592
> 1850   1       DATA read from file
> 1850   2 ..... (Data read from file)
>
> etc.
>
>
> can anyone help me out? i understand this may need the use of
> the file read command, but i have not used this before...
> grateful for help.
>
> please respond directly to jrgalbert@gmail.com
>
> thanks
>
> Jose Ramon Albert
> Manila, Philippines

