



RE: st: dynamic line execution in mata


From   Andrew Maurer <Andrew.Maurer@qrm.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: dynamic line execution in mata
Date   Mon, 10 Feb 2014 19:22:53 +0000

Thanks for the response, Nick. I looked into pointers and have been able to make use of them. I'll give the background of the problem. I would be very interested to hear if anyone has thoughts on the efficiency of the code I have so far (see bottom of post).

I am writing a Stata program, saveif, that will save a subset of observations of a dataset to a file. One method to accomplish this would be to do something like:
preserve
keep if...
save...
restore

However, for large datasets (e.g., 20 GB) with few observations to be saved (e.g., a few MB of outliers), I expect that the preserve/restore method is grossly inefficient, since it involves writing the entire dataset from memory to disk and then reading it back from disk into memory.

An alternative method would be to write the selected observations to a file more or less by hand, without having to clear the dataset from memory and load it back. I have a nearly complete example here; one part is still hard-coded to the specific example of gnp96.dta. The code is still somewhat rough.

Thank you,
Andrew Maurer 



* Want to write a program that will save a set of observations into a dataset
mata: mata clear
clear all

cap program drop saveif
program define saveif
	syntax varlist [if] [in] using/, [replace]
	putmata `varlist' `if' `in', view
	
	* build Mata row vectors varnames, vartypes, and varpointers (one entry per variable)
	forval i = 1/`: word count `varlist'' {
		if `i' == 1 {
			mata: varnames = "`: word `i' of `varlist''"
			mata: vartypes = "`: type `: word `i' of `varlist'''"
			mata: varpointers = &`: word `i' of `varlist'' // pointers
		}
		else {
			mata: varnames = varnames,"`: word `i' of `varlist''"
			mata: vartypes = vartypes,"`: type `: word `i' of `varlist'''"
			mata: varpointers = varpointers,&`: word `i' of `varlist'' // pointers
		}
	}
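	* remove any existing file first, since fopen(..., "w") requires a new file
	cap confirm new file "`using'"
	if _rc != 0 {
		di as error "`using' exists. replacing"
		erase "`using'"
	}
	* save vectors of varnames, vartypes, and varpointers to file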
	mata: fh = fopen("`using'", "w")
	mata: fputmatrix(fh, varnames)
	mata: fputmatrix(fh, vartypes)
	mata: fputmatrix(fh, varpointers)
		
	* write observations of each variable to file
	forval i = 1/`: word count `varlist'' {
		mata: fputmatrix(fh, `: word `i' of `varlist'')
	}
	
	mata: fclose(fh)
end	


capture mata mata drop recover_from_saveif()
mata:
void recover_from_saveif(string fileloc)
{

	fh = fopen(fileloc, "r")
	varnames = fgetmatrix(fh) 
	vartypes = fgetmatrix(fh) 
	varpointers = fgetmatrix(fh) 
	// ----- hard coded part!! try to get this into loop
	date = fgetmatrix(fh) 
	gnp96 = fgetmatrix(fh) 
	// -------------------------------------------------
	fclose(fh)
	varcount = cols(varnames)

	// ------- this loop not working yet. need to figure out syntax
	// foreach var of varnames, read var from file to mata
	for (i=1; i<=varcount;i++) {
	  // varnames[1,i] = fgetmatrix(fh) 
	}
	// -------------------------------------------------
	
	// foreach var of varnames, load var into stata with correct variable type
	for (i=1; i<=varcount;i++) {
		thisvarname = varnames[1,i] // eg contains "date"
		thisvartype = vartypes[1,i] // eg contains "int"
		thisvar = varpointers[1,i] // eg pointer to date vector
		if (i == 1) st_addobs(rows(*thisvar))
		st_store(., st_addvar(thisvartype,thisvarname),*thisvar) 
	}

}
end

cap program drop recover_from_saveif
program define recover_from_saveif
	syntax using/, [replace]
	
	mata: recover_from_saveif("`using'")
	
end


sysuse gnp96.dta, clear

saveif * in 1/5 using test5.txt

clear
recover_from_saveif using test5.txt
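
One possible way to replace the hard-coded fgetmatrix() lines is to keep the matrices in a pointer vector and fill it in a loop. The sketch below is a self-contained toy round trip, not a finished fix. It assumes the file holds only the vector of names followed by one matrix per variable (no varpointers matrix). The function name roundtrip_sketch(), the file name, and the values are invented for illustration.

```stata
mata:
// Hypothetical helper, not part of saveif: writes a small example file,
// then reads the data matrices back in a loop instead of one line per variable.
void roundtrip_sketch(string scalar fileloc)
{
	real scalar        fh, i, k
	string rowvector   names
	pointer rowvector  data

	// write: the vector of names first, then one matrix per name (toy values)
	unlink(fileloc)			// fopen(..., "w") requires a new file
	fh = fopen(fileloc, "w")
	fputmatrix(fh, ("date", "gnp96"))
	fputmatrix(fh, (1 \ 2 \ 3))
	fputmatrix(fh, (10 \ 20 \ 30))
	fclose(fh)

	// read: the number of names says how many matrices follow
	fh = fopen(fileloc, "r")
	names = fgetmatrix(fh)
	k = cols(names)
	data = J(1, k, NULL)
	for (i = 1; i <= k; i++) {
		data[i] = &(fgetmatrix(fh))	// pointer to a fresh copy of matrix i
	}
	fclose(fh)

	// report what was read for each name
	for (i = 1; i <= k; i++) {
		printf("%s: %g rows\n", names[i], rows(*(data[i])))
	}
}

roundtrip_sketch("roundtrip_sketch.tmp")
end
```

In recover_from_saveif(), the same pattern might let st_store() use *(data[i]) rather than the pointers read from the file, though I have not tested that on a large dataset.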














-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Monday, February 10, 2014 11:59 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: dynamic line execution in mata

You are presuming that such a thing exists.

In essence, Mata has no direct equivalent of macro substitution.

Sometimes, the way to solve (similar) problems is by direct manipulation of strings. That is the theme of

SJ-11-2 pr0052: W. Gould and N. J. Cox. 2011. Stata tip 100: Mata and the case of the missing macros. Stata Journal 11(2): 323--324. (A tip showing how to do the equivalent of Stata's macro substitution in Mata.)
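
A minimal sketch of the string-manipulation idea, not taken from the tip itself: Mata functions such as st_data() accept a variable name as a string, so a name held in a string vector can be used to fetch the corresponding Stata variable. The auto dataset and the two variable names are assumptions chosen only for illustration.

```stata
sysuse auto, clear
mata:
varnames = ("price", "mpg")
for (i = 1; i <= cols(varnames); i++) {
	x = st_data(., varnames[i])	// fetch the Stata variable named by the string
	printf("%s: %g rows\n", varnames[i], rows(x))
}
end
```

This works when the data live in Stata variables; it does not look up a Mata matrix by name.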

Sometimes, using pointers is the answer.
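
For the example quoted below, a pointer-based lookup might look like the following sketch. The vector name varptrs and the price values are made up for illustration; the idea is to keep one pointer per name, in the same order as varnames.

```stata
mata:
date  = 1 \ 2 \ 3 \ 4 \ 5
price = 10 \ 20 \ 30
varnames = ("date", "price")
varptrs  = (&date, &price)	// hypothetical name; same order as varnames
i = 1
rows(*(varptrs[i]))		// rows of the vector that varptrs[i] points to
end
```

Here the index i selects the pointer, and dereferencing it with * gives the vector itself rather than the string holding its name.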

In this case, I'd guess that you want the Mata equivalent of some Stata operation and that there's a Mata way of doing it, but I would rather hear whether that is so than try to guess what the underlying problem is.

Nick
njcoxstata@gmail.com


On 10 February 2014 17:38, Andrew Maurer <Andrew.Maurer@qrm.com> wrote:
> Hi Statalist,
>
> I am trying to find Mata's equivalent of Stata's macro expansion functionality. In the example below, I first define an object thisvar as the string "date" and the object date as the column vector 1 \ 2 \ 3 \ 4 \ 5. How can I return the contents of the "date" object by referencing only "thisvar"?
>
> In the line rows( thisvar ), thisvar is simply the 1x1 matrix containing the string "date", so rows( thisvar ) returns 1. What I am looking for is something like rows( `=thisvar' ), which would return 5 rather than 1.
>
> ********* begin example *********
>
> mata
>
> i = 1
> date = 1 \ 2 \ 3 \ 4 \ 5
> varnames = "date", "price"
> thisvar = varnames[1,i]
> rows( thisvar ) // output: 1
> rows( date ) // output: 5
>
> end
>
> ********* end example ***********
>
> Thank you,
> Andrew Maurer
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/


