

From: Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Mata Data Structure or "variable" variable names for timeseries computations
Date: Wed, 1 Aug 2012 02:23:32 -0700

Matt,

You presumably have panel data located in multiple files. My solution would be to append the datasets and use Mata's excellent understanding of panel structure via the -panelsetup()- function. Instead of submatrices based on panel identifiers, you can use submatrices based on the time variable.

I have written you a pretty general ado file which, with modifications, can be made to do virtually anything in the near neighbourhood of your stated requirements. It asks you for the start and end years and which variables from the read-in files you would like to take differences of, and then it saves the matrices of differences to binary-format files, which can be read in for later use.

1) First, the ado file

*------------------------------------------------- matdiff.ado
cap program drop matdiff
program matdiff, rclass
version 12
syntax varlist(min=1 numeric), ///
  [STartyr(integer 1995) ENdyr(integer 2010)] ///
  ID(varname) ///
  Time(varname)

// put the variable names together
local allvars `id' `time' `varlist'

// read the files in and append
use "file_`startyr'", clear
forvalues year=`startyr'/`endyr' {
  if("file_`year'"~="file_`startyr'") {
    append using file_`year'
  }
}

// call the Mata function
mata: a=fnCalcDiffMat(st_local("allvars"), ///
  strtoreal(st_local("startyr")), ///
  strtoreal(st_local("endyr")))
end

version 12
set matastrict off
mata
// define a structure to store the returned matrices
struct stMattData {
  real matrix mDiff
  real scalar yearCurrent
  real scalar yearPrevious
}

// Mata function
struct stMattData colvector function ///
  fnCalcDiffMat(string scalar varlist, ///
  real scalar startyr, real scalar endyr) {

  // declare the vector of structures
  struct stMattData colvector stDiff
  real scalar fh        // the file handle
  string scalar sFileName   // filename string

  // create some views
  st_view(mV=., ., varlist, .)
  mV=sort(mV, (2, 1))   // sort by id within year

  // setup as panel
  st_subview(mV1=., mV, ., (3..length(tokens(varlist))))  // pull variables into subview
  st_subview(mV2=., mV, ., 2)   // year variable is "panel"
  stInfo = panelsetup(mV2, 1)   // setup as panel

  // instantiate the structures
  stDiff= J(rows(stInfo)-1, 1, stMattData())

  // fill 'er up
  for(i=1; i<rows(stInfo); i++) {
    mCurrent = panelsubmatrix(mV1, i+1, stInfo)
    mPrevious = panelsubmatrix(mV1, i, stInfo)

    // do the computation
    stDiff[i].mDiff = mCurrent - mPrevious
    stDiff[i].yearCurrent = mV2[stInfo[i+1,1],1]
    stDiff[i].yearPrevious = mV2[stInfo[i,1],1]
    sFileName = invtokens(("file", strofreal(stDiff[i].yearCurrent) , strofreal(stDiff[i].yearPrevious)),"_")

    // write the file to disk
    fh = fopen(sFileName, "rw")
    fputmatrix(fh, stDiff[i].mDiff)
    fclose(fh)
  }
  return(stDiff)   // no use currently, but handy
}
end
*------------------------------------------------- matdiff.ado

2) Next, a script that implements the ado file

*---------------------------------- matdiff_script.do
clear*
// start year and endyear
local startyr 1995
local endyr 2010

// generate dummy data
forvalues year=`startyr'/`endyr' {
  drop _all
  set obs 10
  drawnorm myvar1-myvar5
  g year = `year'
  g id=_n
  save file_`year', replace
}

run matdiff.ado
// call the program
matdiff myvar1-myvar5, startyr(1996) ///
  endyr(2005) id(id) time(year)
*---------------------------------- matdiff_script.do

Much simpler solutions are possible without invoking Mata, but this program should be extensible.

T

On Tue, Jul 31, 2012 at 9:49 PM, Matthew McKay <matthew.mckay@anu.edu.au> wrote:
> Dear StataList,
>
> I am trying to compute differences in sets of matrices using MATA
> for a time series 1995 to 2010.
> I am producing a mata function to compute the differences for each year
> after I load in ALL my yearly data matrices.
> My issue in Step #2 is that I want to use a methodology similar to local
> macros in do files (which can't be done).
>
> How can I pass part of a variable name (like Year) into the function and
> then perform a set of computations over the data that is loaded in MATA?
> In MATA how can you cycle through different variables (substituting in
> different year append) and compute a new set of matrices?
> Should I be constructing a Struct that contains a Data Matrix and Time
> Series Indicator?
>
> I understand the MATA function is compiled and therefore local macro
> substitution can't be done ... but was wondering if anyone else has an
> elegant solution to reference different variables in MATA by changing a
> component of the variablename (knowing the variable is defined in memory).
>
> Many Thanks,
> Matthew
>
> Step #1:
> **
> ** Import ALL MCP Matrix
> **
> local Years = "1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010"
> clear all
> local END = "end"
> foreach year of local Years {
>   use [file_`year'.dta], clear   // Load MCP Matrices into Mata Memory //
>   drop ReporterISO3C
>   mata
>   Mcp_`year' = st_data(., .)
>   `END'
> }
>
> Step #2:
> ** MATA FUNCTION **
>
> clear all
> mata
> void calcdiffvectors(real scalar StartYear, real scalar EndYear) {
>   for(year = StartYear; year < EndYear; year++) {
>     !!!!!!!!! local NextYear = `year' + 1 !!!!!!!!!!!!!
>     !!!!!!!!! Mcp_`year'_`NextYear' = Mcp_`year' :- Mcp_`NextYear' !!!!!!!!!!
>   }
> }
> end
>
> Step #3:
> mata: calcdiffvectors(1995, 2010)
> // I can then retrieve the difference vectors using get mata etc. //
>

--
Tirthankar Chakravarty
tchakravarty@ucsd.edu
tirthankar.chakravarty@gmail.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
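
On the original question of "variable" variable names inside compiled Mata code: one possible alternative, not the approach taken in matdiff.ado above, is to key the yearly matrices by year in a Mata associative array, so that the year is an ordinary runtime value rather than part of a name. The following is a minimal sketch under stated assumptions: the files file_1995.dta through file_2010.dta exist in the working directory (as created by the dummy-data script above), contain only numeric variables, and have identical dimensions and row order. The names Mcp and Diff are illustrative only.

```stata
* Sketch only: assumes file_1995.dta ... file_2010.dta exist, are all numeric,
* and share the same row and column layout.
mata:

// associative array with a single real-valued key (the year)
Mcp = asarray_create("real", 1)

// load each year's data and store it under its year key
for (year = 1995; year <= 2010; year++) {
    stata("use file_" + strofreal(year) + ", clear")
    asarray(Mcp, year, st_data(., .))
}

// differences between consecutive years, keyed by the later year
Diff = asarray_create("real", 1)
for (year = 1996; year <= 2010; year++) {
    asarray(Diff, year, asarray(Mcp, year) - asarray(Mcp, year - 1))
}

// display, for example, the 1996 minus 1995 difference matrix
asarray(Diff, 1996)

end
```

Because st_data(., .) pulls every variable, identifier and year columns are differenced too; drop them first or pass an explicit variable list if only the data columns are wanted. The same loops can be moved into a Mata function that takes the start and end years as real scalars.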
