
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Why Is My Conditional Probability Program Taking So Long to Run?


From   "Brad Wright" <[email protected]>
To   <[email protected]>
Subject   st: Why Is My Conditional Probability Program Taking So Long to Run?
Date   Sun, 16 Jan 2011 19:15:19 -0500

I was graciously provided some code to calculate conditional probabilities following a fixed-effects logistic regression in which more than one positive outcome is expected. I know that the code works in theory, but it has been running for more than 55 hours straight without finishing. It is not frozen; it just seems incredibly slow and is bogged down at the "Calculating sequence" stage. Does anyone have insight into why it is taking so long, or how to make the code more efficient? Please note that the actual maximum number of board members is not 7 but 29; I provide the abbreviated code here.

/* I assume that each board and each member is numbered sequentially (e.g., boards 1-24, members 1-n within each board) */
qui predict xb, xb
egen numbersuccesses = sum(officer), by(board)
egen Ningroup =max(person), by(board)
egen simpledenom = sum(exp(xb)), by(board)
qui sum numbersuccesses
local kmax = r(max)
qui sum person
local Tmax = r(max)
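qui gen f0_0 = 1
qui gen f0_1 = 0
qui gen denom = .
qui sort board person
/* f<t>_<k> holds the sum, over all ways of choosing k of the first t members,
   of the product of exp(xb); the underscore keeps variable names unique once
   t or k reaches 10 (e.g. t=1,k=11 versus t=11,k=1) */
forvalues t = 1 / `Tmax' {
    qui gen f`t'_0 = 1
    forvalues k = 1 / `kmax' {
        * di "computing T = `t' and k = `k' "
        local t1 = `t'-1
        local k1 = `k'-1
        if `t'<`k' {
            qui gen f`t'_`k' = 0
        }
        else {
            qui by board: gen f`t'_`k' = f`t1'_`k' + f`t1'_`k1' * exp(xb[`t'])
        }
        /* find the match for this group */
        qui replace denom = f`t'_`k' if numbersuccesses == `k' & Ningroup==`t'
    }
}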
gen expand = comb(Ningroup , number)
keep person board xb expand numbersuccess
/* one row per board, with xb1, xb2, ... for its members */
reshape wide xb, i(board) j(person)
/* assuming seven is the maximum number of members per board */
forvalues p = 1 / 7 {
    gen y`p'=.
}
gen numerator = 0
gen run = 0
save _temp,replace
capture erase _append.dta
forvalues i1 = 0 / 1 {
    forvalues i2 = 0 / 1 {
        forvalues i3 = 0 / 1 {
            forvalues i4 = 0 / 1 {
                forvalues i5 = 0 / 1 {
                    forvalues i6 = 0 / 1 {
                        forvalues i7 = 0 / 1 {
                            /* if have more than 7 need to add more like this -- also be sure to close loop below */
                            use _temp,clear
                            di "Calculating sequence: `i1' `i2' `i3' `i4' `i5' `i6' `i7'"
                            /* again can change 7*/
                            forvalues k = 1 / 7 {
                                qui replace numerator = xb`k'*`i`k'' + numerator if xb`k'<.
                                qui replace y`k' = `i`k'' if xb`k'<.
                                qui replace run = run + `i`k'' if xb`k'<.
                            }
                            qui drop if run~=numbersuccess
                            qui keep board y* numerator
                            qui capture append using _append
                            qui save _append,replace
                        }
                    }
                }
            }
        }
    }
}
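/* end of the nested loops over i1-i7 */
/* boards with fewer than 7 members produce repeated rows; drop them
   (official -duplicates drop- stands in here for the user-written -dups, drop-) */
duplicates drop
egen denom = sum(exp(numerator)), by(board)
gen prob = exp(numerator) / denom
egen ckden = sum(exp(numerator)), by(board)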
sum ckden denom
save _append,replace
/* we now have the probability of each sequence;
now, aggregate for each variable */
capture erase _probs.dta
/* assume seven is the max number of members on a board */
forvalues m = 1 / 7 {
    use _append, clear
    collapse (sum) prob, by(y`m' board)
    keep if y`m'==1
    gen member = `m'
    keep prob* board member
    capture append using _probs
    sort board member
    save _probs, replace
}
use _probs
table board, c(sum prob)
/* sum of prob = number of officers */
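
For a sense of scale, here is a rough sketch that is not part of the program above. The "Calculating sequence" stage uses one 0/1 loop per board member, so extending it from 7 to 29 members means 2^29 passes through the loop body, each of which reloads _temp and rewrites _append. The 1-millisecond cost per pass below is purely an assumption for illustration, not a measurement.

```stata
* Editorial sketch: count the 0/1 sequences the nested loops enumerate.
foreach T of numlist 7 29 {
    local nseq = 2^`T'
    display "T = `T': " %15.0fc `nseq' " sequences"
}
* Hypothetical cost of 1 millisecond per pass (an assumption, not a measurement)
local hours = 2^29 * 0.001 / 3600
display "about " %6.1f `hours' " hours at 1 ms per pass"
```

With 7 members there are 128 sequences; with 29 there are 536,870,912. At the assumed 1 ms per pass that alone comes to roughly 149 hours, before allowing for _append growing on disk.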