
Re: st: Replicating a sas loop in stata results in very slow computation time...

From	"Joseph Coveney" <[email protected]>
To	"Statalist" <[email protected]>
Subject	Re: st: Replicating a sas loop in stata results in very slow computation time...
Date	Fri, 11 Apr 2008 11:15:13 +0900

PatrickT wrote:

I am trying to replicate a loop that runs well and fast in SAS, but is very
slow in STATA on my machine.

Perhaps someone will spot some awkward piece of code in the following?

many thanks for your attention,

[email protected]


*** STATA PROGRAM
set memory 700m  ** couldn't do better than 700M
version 10
use r2f3_glm.dta, clear

local i 94
while `i' ~= 103 {
xi: logit yr00 i.age i.educ i.educ*age i.educ*age2 i.educ*age3
i.educ*age4
[pweight=ihwt] if year~=`i'
predict pyr00 if e(sample), p
gen rw00=ihwt*pyr00/(1-pyr00)
sum yr00 pyr00 rw00 if year~=100
drop pyr00 rw00
local i = `i'+ 1
}

exit


*** SAS PROGRAM

proc logistic data=one(where=(year ne 100)) descending;
class yr00 educ agedum;
model yr00=agedum educ educ*age educ*age2 educ*age3 educ*age4;
weight ihwt;
output out=temp1 predicted=pyr00;
run;

data temp1;
set temp1;
rw00=ihwt*pyr00/(1-pyr00);

--------------------------------------------------------------------------------

Your Stata code doesn't do what your SAS code does.  The SAS code has no
by-processing (looping), so it doesn't do what you say you want to do,
either.  Perhaps the SAS code is incomplete . . .

The first snippet of Stata below more closely mimics what the SAS code you
show does, I believe.  It should run as fast as the SAS.

The second snippet below does, I believe, what you seem to want.  For
efficiency (in both memory and postestimation operations) it subsets the
data first, so it should loop much faster.  Check for typos etc.; I don't
have your dataset, so I can't debug it.  Also, I don't quite follow why
you're multiplying the predicted odds by the frequency weight and then
averaging.  (Note that the SAS code doesn't compute any summary statistics
at all.)  Below, I have Stata average the odds, weighted by the frequency
weight.  If that's not what you want, you can change it.

Joseph Coveney

*
*  Replicates what the SAS code does
*
use if year != 100 using r2f3_glm /* r2f3_glm same as one.sas7bdat? */
xi i.educ*age i.educ*age2 i.educ*age3 i.educ*age4 ///
 i.agedum // Note agedum is in SAS code, but not in your Stata code
* Standalone -xi- creates only the _I* terms, so list age2-age4 explicitly
logit yr00 _I* age age2 age3 age4 [fw = ihwt] // SAS doesn't have any subsetting here
predict pyr00, pr // Likewise
generate float rw00 = ihwt * pyr00 / (1 - pyr00) // No summarization
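*
* Aside: a sketch comparing the two summaries discussed above, run on the
* data left in memory by the first snippet.  odds00 is a name made up for
* this illustration, and [fweight = ihwt] assumes that ihwt is
* integer-valued, as elsewhere in this post.  Not run against the actual
* dataset.
*
summarize rw00, meanonly // Unweighted mean of ihwt * odds: sum(w*odds)/N
display r(mean)
generate double odds00 = pyr00 / (1 - pyr00)
summarize odds00 [fweight = ihwt], meanonly // sum(w*odds)/sum(w)
display r(mean)
* With no missing values, the two means differ by a factor of mean(ihwt).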
*
* Does what you want
*
tempfile one
use if year != 100 using r2f3_glm, clear
xi i.educ*age i.educ*age2 i.educ*age3 i.educ*age4 i.agedum
* Following two lines only necessary if yr00 is not already 0/1
drop if missing(yr00)
replace yr00 = yr00 != 0
* End of possibly unnecessary transformation
* These next few lines should make for memory efficiency, too
keep ihwt yr00 year _I* age age2 age3 age4
compress
foreach var of varlist _all {
   drop if missing(`var')
}
save `one' // This is your working dataset
* Loop begins here
forvalues i = 94/102 { // Limit is < 103, right?
   use if year != `i' using `one', clear
   logit yr00 _I* age* [fweight = ihwt]
   predict pyr00, xb
   replace pyr00 = exp(pyr00)
   summarize pyr00 [fweight = ihwt], meanonly
   display r(mean) // You don't seem to want to save anything . . .
}
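*
* Variation on the loop above (a sketch, not run against the actual
* dataset): keep the per-year means in a dataset instead of only displaying
* them.  The names results, handle, omitted, mean_odds, lodds and odds are
* placeholders made up for this illustration.  Use in place of the loop
* above, within the same do-file so that `one' is still defined.
*
tempfile results
tempname handle
postfile `handle' int omitted double mean_odds using `results'
forvalues i = 94/102 {
   use if year != `i' using `one', clear
   logit yr00 _I* age* [fweight = ihwt]
   predict double lodds, xb // Linear prediction (log odds)
   generate double odds = exp(lodds)
   summarize odds [fweight = ihwt], meanonly
   post `handle' (`i') (r(mean)) // One row per omitted year
}
postclose `handle'
use `results', clear
list, noobs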
exit


