Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Gordon Hughes <G.A.Hughes@ed.ac.uk> |
To | statalist@hsphsun2.harvard.edu |
Subject | st: Re: need help on reading big ASCII file to make a panel data set on temperature across geographic coordinates |
Date | Wed, 04 Jan 2012 11:00:22 +0000 |
Once you have done this, the code below will do almost everything you want. [I wrote it to handle a version of the HadCRUT3 dataset running up to mid-2010.] The edited version of the original data is stored in the file "hadcrut3.txt". The panel_id and time_id variables are defined at the end of the code. Replace -save- with -saveold- if you want to save the data as a Stata 9 dataset.
Gordon Hughes
g.a.hughes@ed.ac.uk

========
* The commands below end in semicolons, so switch the delimiter first;
#delimit ;
capture cd "g:\CRU_Data";
capture log close;
log using "hadcrut3_grid_data.log", replace;

* Each record: 6 header fields, then a 36 x 72 grid of values named rowV_H;
infile year month unit nrows ncolumns missval
    row1_1-row1_72 row2_1-row2_72 row3_1-row3_72 row4_1-row4_72
    row5_1-row5_72 row6_1-row6_72 row7_1-row7_72 row8_1-row8_72
    row9_1-row9_72 row10_1-row10_72 row11_1-row11_72 row12_1-row12_72
    row13_1-row13_72 row14_1-row14_72 row15_1-row15_72 row16_1-row16_72
    row17_1-row17_72 row18_1-row18_72 row19_1-row19_72 row20_1-row20_72
    row21_1-row21_72 row22_1-row22_72 row23_1-row23_72 row24_1-row24_72
    row25_1-row25_72 row26_1-row26_72 row27_1-row27_72 row28_1-row28_72
    row29_1-row29_72 row30_1-row30_72 row31_1-row31_72 row32_1-row32_72
    row33_1-row33_72 row34_1-row34_72 row35_1-row35_72 row36_1-row36_72
    using hadcrut3.txt;
drop unit nrows ncolumns missval;
compress;

* First reshape: the 72 columns of each grid row become observations (hgrid);
reshape long
    row1_ row2_ row3_ row4_ row5_ row6_ row7_ row8_ row9_
    row10_ row11_ row12_ row13_ row14_ row15_ row16_ row17_ row18_
    row19_ row20_ row21_ row22_ row23_ row24_ row25_ row26_ row27_
    row28_ row29_ row30_ row31_ row32_ row33_ row34_ row35_ row36_,
    i(year month) j(hgrid);

* Second reshape: the 36 grid rows become observations (vgrid);
forvalues n=1/36 {;
    rename row`n'_ dtemp`n';
};
reshape long dtemp, i(year month hgrid) j(vgrid);

* Recode the missing-value flag and drop cells with no data at all;
replace dtemp=. if dtemp <= -1000000;
sort vgrid hgrid year month;
by vgrid hgrid: egen cell_nobs=count(dtemp);
drop if cell_nobs <= 0;

* panel_id: 5 deg grid cells starting from 90-85N, 180-175W = 1;
gen panel_id=(vgrid-1)*72+hgrid;
* time_id: months starting with Jan 1850 = 1;
gen time_id=(year-1850)*12+month;
sort panel_id year time_id;
compress;
describe;
save "hadcrut3_grid_data.dta", replace;
==================
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
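
Since the subject line asks for a panel across geographic coordinates, the index arithmetic at the end of the code can be checked on a few hand-made rows. The sketch below is standalone and uses Stata's default line delimiter. The three input rows are made up for illustration, and the variables lat, lon and mdate are not part of the original code. The latitude and longitude formulas assume that rows run north to south and columns west to east in 5-degree steps from the 90-85N, 180-175W cell, so they give cell centres only if that assumption holds for your file.

```stata
* Standalone check of the panel and time indices on three made-up cells
clear
input vgrid hgrid year month
 1  1 1850  1
18 37 1950  6
36 72 2010 12
end

* Same definitions as in the code above
gen panel_id = (vgrid-1)*72 + hgrid
gen time_id  = (year-1850)*12 + month

* Assumed cell centres: row 1 is 90-85N, column 1 is 180-175W
gen lat = 92.5 - 5*vgrid
gen lon = -182.5 + 5*hgrid

* Optional: a Stata monthly date, then declare the panel structure
gen mdate = ym(year, month)
format mdate %tm
xtset panel_id mdate

list panel_id time_id lat lon mdate
```

The first row should give panel_id 1, time_id 1 and a centre of 87.5N, 177.5W (lat 87.5, lon -177.5); the last row gives panel_id 2592, the final cell of the 36 x 72 grid. Using mdate instead of time_id in -xtset- is a matter of taste: both are consecutive month counters, but mdate displays as a calendar month.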