Re: st: Is there a way and a seed reason to "double" preserve data?


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Is there a way and a seed reason to "double" preserve data?
Date   Mon, 4 Apr 2011 17:52:30 +0100

I don't work with that kind of set-up, but I would not repeatedly
-preserve- and -restore- unless I had to.
Others may have Unix-pertinent advice.

Nick

On Mon, Apr 4, 2011 at 5:45 PM, Oliver Jones
<ojones@wiwi.uni-bielefeld.de> wrote:
> Hi Nick,
> thanks for the quick answer.
>
> The data sets contain up to 60 thousand time series, each between 25 and 17
> periods long, which amounts to a file size of approx. 20 MB. The hardware
> is an 8-core Unix server and I use Stata/MP version 11.
>
> Since I would have to save and use the data more than 60,000 * 5 times
> (inner loop * outer loop), I thought it might be good to look a little bit
> at speed.
>
> Best
> Oliver
>
>
>
> Am 04.04.2011 18:27, schrieb Nick Cox:
>>
>> Much depends on the size of dataset, your hardware set-up, which
>> Stata. You can waste more time trying to optimise code like this than
>> you will gain!
>>
>> As you are concerned with speed, don't use -egen- to produce a new
>> variable which is a constant. Use -su, meanonly- and save r(mean). Or
>> get the mean in Mata.
>>
>> Alternatively, do use -egen- with -by:-.
>>
>> Nick
>>
>> On Mon, Apr 4, 2011 at 5:21 PM, Oliver Jones
>> <ojones@wiwi.uni-bielefeld.de>  wrote:
>>>
>>> Hi all,
>>>
>>> maybe the more important question is the one regarding the speed of
>>> -preserve-/-restore- vs. -save-/-use-.
>>> If it makes no difference then I can save my data and use it later.
>>>
>>> But if there is an improvement in execution time by using preserve and
>>> restore, then I would like to know if there is a way to "double preserve"
>>> my data?
>>>
>>> I want to double preserve my data, to be able to make forecasts for
>>> different individuals and then save the forecasts in a Mata matrix.
>>>
>>> In a two step forval construct I've got something like
>>>
>>> *********** Begin example **************
>>> sysuse xtline1, clear
>>> bysort person: gen int t = _n
>>> forval person = 1/3 {
>>>        dis _n "Starting the analysis of Person `person'" _n
>>>
>>>        preserve
>>>
>>>        keep if person == `person'
>>>
>>>        forval last_T = 360/365 { // here would follow the forecast code
>>>                preserve
>>>                egen mean_calories = mean(calories)
>>>                keep in L/L
>>>                local mean_person_`person'_days_`last_T' = mean_calories[1]
>>>                restore
>>>        }
>>>        restore
>>> }
>>> *********** end example **************
>>>
>>
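
A minimal sketch of the -su, meanonly- suggestion quoted above, applied to the
example data. It assumes the step inside the loop only needs one number per
person and window; the condition t <= `last_T' and the sort on day are
illustrative assumptions, not part of the original code. Because -if- selects
the observations, nothing is dropped and neither -preserve- nor -keep- is
needed.

```stata
sysuse xtline1, clear
* t counts days within person (sorting on day is an assumption)
bysort person (day): gen int t = _n

forval person = 1/3 {
    forval last_T = 360/365 {
        * mean over the first `last_T' days of this person; data stay intact
        summarize calories if person == `person' & t <= `last_T', meanonly
        local mean_person_`person'_days_`last_T' = r(mean)
    }
}

* show one of the stored results
display "`mean_person_1_days_360'"
```

If the real forecast code does need to drop observations, a single -preserve-
around the inner step, with -if person == `person'- in place of the outer
-keep-, would avoid nesting.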
