# st: Problem in modifying built-in ktau.ado code (for Kendall's tau)

From: Mike Lacy <[email protected]>
To: [email protected]
Subject: st: Problem in modifying built-in ktau.ado code (for Kendall's tau)
Date: Tue, 03 Feb 2004 17:44:19 -0700

Greetings,

I need to calculate Kendall's tau as efficiently as possible on two continuous variables for each repetition in a large set of bootstrap experiments. Calculating Kendall's tau is slow because it requires on the order of (N^2)/2 pairwise comparisons. My estimate is that with the built-in ktau.ado command, which is the fastest command to obtain Kendall's tau in Stata, the bootstrap experiments will take at least 10-15 days of dedicated time on a fast Wintel machine.

Consequently, I am trying to adapt the code of ktau.ado into my own version. In the middle of the loop in ktau.ado there is a call to the sign() function, which accounts for about 70% of the execution time of ktau.ado. My idea was to avoid the overhead of a function call by replacing the sign() function with an equivalent set of if statements. So I calculate my own "mysign" value using ifs and use it in place of sign() in the original code. However, I am not getting the right results, as I explain below after presenting the two relevant code fragments:

    * Built-in ktau.ado code goes like this:
    * View with about a 10 pt nonproportional font
    * `x' and `y' reference the two variables
    * passed to the ktau command
    gen double `work' = 0
    scalar `k' = 2
    while (`k' <= `N') {
        local kk = `k' - 1
        #delimit ;
        replace `work' = `work'
            + sign((`x' - `x'[`k'])*(`y' - `y'[`k'])) /* slow */
            in 1/`kk' ;
        #delimit cr
        scalar `k' = `k' + 1
    }
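
To make the fragment above concrete, here is a minimal standalone sketch of the same pairwise loop. It is not the shipped ktau.ado. It assumes Stata's auto dataset, with price and mpg standing in for the two variables (neither has missing values), uses forvalues in place of the while loop, and sums the work variable to obtain Kendall's score, from which tau-a is the score divided by N(N-1)/2.

```stata
* Sketch only: standalone version of the pairwise loop, not the shipped ktau.ado
* Assumption: auto dataset; price and mpg stand in for the two variables
sysuse auto, clear
local x price
local y mpg
local N = _N
tempvar work
gen double `work' = 0
forvalues k = 2/`N' {
    local kk = `k' - 1
    * add the sign of each pair product (observations 1..k-1 versus observation k)
    quietly replace `work' = `work' + sign((`x' - `x'[`k'])*(`y' - `y'[`k'])) in 1/`kk'
}
* the sum over observations is Kendall's score (concordant minus discordant pairs)
quietly summarize `work', meanonly
scalar S = r(sum)
scalar tau_a = S / (`N'*(`N' - 1)/2)
display "score = " S "   tau-a = " tau_a
* compare with the built-in command
ktau price mpg
```

The score and tau-a displayed by the sketch should agree with the values reported by ktau on the same two variables.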

Below is what I tried to do. And yes, I know that
if/else would be faster but I was trying to simplify
my code as much as possible while trying to track down
my error.

    * My code, with differences indicated
    gen double `work' = 0
    scalar `k' = 2
    while (`k' <= `N') {
        local kk = `k' - 1
        * Calc "mysign" as a substitute for the sign() function
        * The rest of the code is as untouched as possible
        scalar pairprod = (`x' - `x'[`k'])*(`y' - `y'[`k']) in 1/`kk'
        if (pairprod < 0) {scalar mysign = -1}
        if (pairprod > 0) {scalar mysign = 1}
        if (pairprod == 0) {scalar mysign = 0}
        if (pairprod == .) {scalar mysign = .}
        *
        #delimit ;
        replace `work' = `work'
            + mysign
            /* + sign((`x' - `x'[`k'])*(`y' - `y'[`k'])) Removed */
            in 1/`kk' ;
        #delimit cr
        scalar `k' = `k' + 1
    }

Problem: I have checked the values of "mysign" vs. sign() and they are the same for each iteration of the loop. However, at the end of the while loop, the values in the variable `work' are not what they should be, so I presume the problem is with what I have done with the "replace" command. Could someone offer a suggestion?
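
One thing that may be worth checking (a hedged sketch, not a confirmed diagnosis): a Stata scalar holds a single number, and when variable names appear in a scalar expression Stata uses their values in the first observation. If that applies here, mysign would be one value per pass through the loop and replace would add that same value to every observation in the range, whereas sign() inside replace is evaluated observation by observation. The auto dataset, the variables price and mpg, and the choice k = 2 below are illustrative assumptions.

```stata
* Diagnostic sketch; the auto dataset and k = 2 are illustrative assumptions
sysuse auto, clear
local k = 2
* a scalar stores ONE number: variable names here refer to observation 1
scalar pairprod = (price - price[`k'])*(mpg - mpg[`k'])
scalar obs1 = (price[1] - price[`k'])*(mpg[1] - mpg[`k'])
display "scalar pairprod = " pairprod "   observation-1 product = " obs1
* sign() inside generate/replace is evaluated separately for every observation
gen double bysign = sign((price - price[`k'])*(mpg - mpg[`k']))
list price mpg bysign in 1/5
```

If the listed values of bysign differ across observations while the scalar matches only the observation-1 product, that would be consistent with the discrepancy in the work variable.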

Thanks,

=-=-=-=-=-=-=-=-=-=-=-=-=
Mike Lacy
Fort Collins CO USA
(970) 491-6721 office
