Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: dynamic line execution in mata


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: dynamic line execution in mata
Date   Mon, 10 Feb 2014 19:39:49 +0000

Phil Schumm pointed to associative arrays in a later answer.
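
For the example in the original question, an associative array can map
each variable name to its vector, so that a string such as thisvar
retrieves the data. A minimal sketch, assuming Mata's asarray()
functions; the name A and the price values are made up for illustration:

```stata
mata:
// A is a hypothetical name; keys are string scalars by default
A = asarray_create()

// store each vector under its variable name
asarray(A, "date",  (1 \ 2 \ 3 \ 4 \ 5))
asarray(A, "price", (10 \ 20 \ 30))      // illustrative values only

varnames = ("date", "price")
thisvar  = varnames[1, 1]                // contains "date"

rows(thisvar)                            // displays 1: the string itself
rows(asarray(A, thisvar))                // displays 5: the vector stored under "date"
end
```

The lookup key is an ordinary string, so it can just as well come from
varnames[1,i] inside a loop.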

I wrote -savesome- (SSC) to do what you want to do. It's not fast. Its
original version from 2001 long predates Mata. It's a (lousy)
benchmark for you.

I wouldn't call Mata line by line but that in itself is probably trivial.
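
One way to avoid line-by-line calls is to keep the whole read-back loop
inside one Mata function. The following is only a sketch, not the
original poster's code: the function name read_vars_in_loop() is
hypothetical, it assumes the file layout written by saveif below
(names, types, pointers, then one column vector per variable), and it
assumes every variable is numeric, since st_store() does not handle
strings.

```stata
mata:
// Hypothetical helper: rebuild a dataset from a file written as
//   1 x k names, 1 x k types, 1 x k pointers, then k column vectors.
void read_vars_in_loop(string scalar fileloc)
{
        real scalar      fh, i, k
        string rowvector varnames, vartypes
        real colvector   X

        fh       = fopen(fileloc, "r")
        varnames = fgetmatrix(fh)
        vartypes = fgetmatrix(fh)
        // skip the saved pointer matrix: saved pointer values cannot be
        // relied on to point at anything once read back
        (void) fgetmatrix(fh)

        k = cols(varnames)
        for (i = 1; i <= k; i++) {
                X = fgetmatrix(fh)               // next variable, in saved order
                if (i == 1) st_addobs(rows(X))   // assumes an empty dataset in memory
                st_store(., st_addvar(vartypes[i], varnames[i]), X)
        }
        fclose(fh)
}
end
```

Reading each vector into a local X and storing it at once removes the
need to name the Mata variables after the Stata variables, which is
what the hard-coded part of the posted code was working around.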

Nick
[email protected]


On 10 February 2014 19:22, Andrew Maurer <[email protected]> wrote:
> Thanks for the response, Nick. I looked into pointers and have been able to make use of them. I'll give the background of the problem. I would be very interested to hear if anyone has thoughts on the efficiency of the code I have so far (see bottom of post).
>
> I am writing a Stata program, saveif, that will save a subset of observations of a dataset to a file. One method to accomplish this would be to do something like:
> preserve
> keep if...
> save...
> restore
>
> However, for large datasets (e.g. 20 GB) with few observations to be saved (e.g. a few MB of outliers), I expect that the preserve/restore method is grossly inefficient, since it involves writing the entire dataset from memory to hard disk, then reading it back from hard disk to memory.
>
> An alternative method to accomplish the task would be to somewhat manually "file write" the individual observations to a file, without having to clear and load back the dataset from memory. I have a nearly complete example here, where there is one part that has been hard-coded to the specific example of gnp96.dta. The code is still somewhat rough.
>
> Thank you,
> Andrew Maurer
>
>
>
> * Want to write a program that will save a set of observations into a dataset
> mata: mata clear
> clear all
>
> cap program drop saveif
> program define saveif
>         syntax varlist [if] [in] using/, [replace]
>         putmata `varlist' `if' `in', view
>
>         * put a row vector called varnames to mata
>         forval i = 1/`: word count `varlist'' {
>                 if `i' == 1 {
>                         mata: varnames = "`: word `i' of `varlist''"
>                         mata: vartypes = "`: type `: word `i' of `varlist'''"
>                         mata: varpointers = &`: word `i' of `varlist'' // pointers
>                 }
>                 else {
>                         mata: varnames = varnames,"`: word `i' of `varlist''"
>                         mata: vartypes = vartypes,"`: type `: word `i' of `varlist'''"
>                         mata: varpointers = varpointers,&`: word `i' of `varlist'' // pointers
>                 }
>         }
>         * save vector of varnames to file
>         cap confirm new file "`using'"
>         if _rc != 0 {
>                 di as error "`using' exists. replacing"
>                 rm "`using'"
>         }
>         mata: fh = fopen("`using'", "w")
>         mata: fputmatrix(fh, varnames)
>         mata: fputmatrix(fh, vartypes)
>         mata: fputmatrix(fh, varpointers)
>
>         * write observations of each variable to file
>         forval i = 1/`: word count `varlist'' {
>                 mata: fputmatrix(fh, `: word `i' of `varlist'')
>         }
>
>         mata: fclose(fh)
> end
>
>
> capture mata mata drop recover_from_saveif()
> mata:
> void recover_from_saveif(string fileloc)
> {
>
>         fh = fopen(fileloc, "r")
>         varnames = fgetmatrix(fh)
>         vartypes = fgetmatrix(fh)
>         varpointers = fgetmatrix(fh)
>         // ----- hard coded part!! try to get this into loop
>         date = fgetmatrix(fh)
>         gnp96 = fgetmatrix(fh)
>         // -------------------------------------------------
>         fclose(fh)
>         varcount = cols(varnames)
>
>         // ------- this loop not working yet. need to figure out syntax
>         // foreach var of varnames, read var from file to mata
>         for (i=1; i<=varcount;i++) {
>           // varnames[1,i] = fgetmatrix(fh)
>         }
>         // -------------------------------------------------
>
>         // foreach var of varnames, load var into stata with correct variable type
>         for (i=1; i<=varcount;i++) {
>                 thisvarname = varnames[1,i] // eg contains "date"
>                 thisvartype = vartypes[1,i] // eg contains "int"
>                 thisvar = varpointers[1,i] // eg pointer to date vector
>                 if (i == 1) st_addobs(rows(*thisvar))
>                 st_store(., st_addvar(thisvartype,thisvarname),*thisvar)
>         }
>
> }
> end
>
> cap program drop recover_from_saveif
> program define recover_from_saveif
>         syntax using/, [replace]
>
>         mata: recover_from_saveif("`using'")
>
> end
>
>
> sysuse gnp96.dta, clear
>
> saveif * in 1/5 using test5.txt
>
> clear
> recover_from_saveif using test5.txt
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Monday, February 10, 2014 11:59 AM
> To: [email protected]
> Subject: Re: st: dynamic line execution in mata
>
> You are presuming that such a thing exists.
>
> In essence, Mata has no direct equivalent of macro substitution.
>
> Sometimes, the way to solve (similar) problems is by direct manipulation of strings. That is the theme of
>
> SJ-11-2 pr0052  . . . . Stata tip 100: Mata and the case of the missing macros
>         . . . . . . . . . . . . . . . . . . . . . . . . W. Gould and N. J. Cox
>         Q2/11   SJ 11(2):323--324                                (no commands)
>         tip showing how to do the equivalent of Stata's macro
>         substitution in Mata
>
> Sometimes, using pointers is the answer.
>
> In this case, I'd guess that you want the Mata equivalent of some Stata operation and that there's a Mata way of doing it, but I would rather hear whether that is so than try to guess what the underlying problem is.
>
> Nick
> [email protected]
>
>
> On 10 February 2014 17:38, Andrew Maurer <[email protected]> wrote:
>> Hi Statalist,
>>
>> I am trying to find Mata's equivalent of Stata's macro expansion functionality. In the below example, I first define an object thisvar as the string "date" and I define the object date as the column vector 1 \ 2 \ 3 \ 4 \ 5. How can I return the contents of the "date" object by only referencing "thisvar"?
>>
>> In the line, rows( thisvar ), thisvar is simply the 1x1 matrix containing the string "date", so rows( thisvar ) returns: 1. What I am looking for is something like rows( `=thisvar' ), so as to return 5 rather than 1.
>>
>> ********* begin example *********
>>
>> mata
>>
>> i = 1
>> date = 1 \ 2 \ 3 \ 4 \ 5
>> varnames = "date", "price"
>> thisvar = varnames[1,i]
>> rows( thisvar ) // output: 1
>> rows( date ) // output: 5
>>
>> end
>>
>> ********* end example ***********
>>
>> Thank you,
>> Andrew Maurer
>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

