st: RE: why -recode-takes longer than -replace-


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: why -recode-takes longer than -replace-
Date   Sun, 27 Jun 2010 23:09:24 +0200

<>

So the speed advantage reported by Amanda works out to 89.5% for -replace-.
It can be increased to 91.5% by using -inlist()- for the qualifier:

***********
//imitate 10.1 SE
vers 10.1
set processors 1


//get data
clear*
set mem 1G
loc nobs 100000

set obs `nobs'

qui foreach var of newlist var1-var130{
	gen `var'=irecode(runiform(),0,.1,.4,1)+6
}

timer clear
timer on 1
qui recode _all (8=.) (9=.)
timer off 1

//get data anew
drop _all
set obs `nobs'

qui foreach var of newlist var1-var130{
	gen `var'=irecode(runiform(),0,.1,.4,1)+6
}

//timer clear
timer on 2
qui foreach v of varlist *{
	replace `v'=. if `v'==8 | `v'==9
}
timer off 2


//get data anew
drop _all
set obs `nobs'

qui foreach var of newlist var1-var130{
	gen `var'=irecode(runiform(),0,.1,.4,1)+6
}

//timer clear
timer on 3
qui foreach v of varlist *{
	replace `v'=. if inlist(`v',8,9)
}
timer off 3

qui timer list
di in r "recode took: " r(t1) _n /* 
*/ "replace took: " r(t2) ", " %3.1fc r(t2)*100/r(t1) "% of recode timing" _n /* 
*/ "replace with inlist() took: " r(t3) ", " %3.1fc r(t3)*100/r(t1) "% of recode timing"
***********
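
The figures printed by the code above are each loop's time as a share of the
-recode- time. If "speed advantage" is read as the time saved relative to
-recode- (an assumption; the post does not define the term), it can be derived
from the same timers. A minimal sketch, meant to be run directly after the
code above while timers 1 to 3 still hold their values:

***********
qui timer list
// time saved relative to recode, in percent (assumed definition)
di "replace saves: " %3.1f 100*(1-r(t2)/r(t1)) "% of recode timing"
di "replace with inlist() saves: " %3.1f 100*(1-r(t3)/r(t1)) "% of recode timing"
***********

The actual numbers depend on the machine and on the random draws, so they
need not match the percentages quoted in the post.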


HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Amanda Fu
Sent: Sunday, June 27, 2010 22:11
To: statalist@hsphsun2.harvard.edu
Subject: st:why -recode-takes longer than -replace-

Dear Statalisters,

I am curious about something I noticed when recoding all the variables
in a data set (size: 52,996,616) using Stata 10.1 SE: -recode- (1
below) takes much longer than -replace- (2 below). I was wondering
what causes the speed difference, and whether it implies that
-replace- is the better choice for large data sets.

1. slower
recode _all (8=.) (9=.)

2. faster
foreach v of varlist * {
	replace `v'=. if `v'==8 | `v'==9
}
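
Both forms are meant to leave the data in the same state. A minimal sketch
that checks this on one small simulated variable (the names x and y and the
1,000 observations are illustrative assumptions, not taken from the post):

***********
clear
set obs 1000
gen x=irecode(runiform(),0,.1,.4,1)+6
gen y=x
qui recode x (8=.) (9=.)
qui replace y=. if y==8 | y==9
// two system missings compare as equal in Stata, so this passes only
// if both commands changed exactly the same observations
assert x==y
***********

If the assertion passes silently, the two approaches agree on this sample, and
any difference between them is one of speed only.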

Thanks for your time!

Sincerely,
Amanda
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP