
Re: implementation of boschloo's test: very slow execution

From   "Michael Blasnik" <>
To   <>
Subject   Re: implementation of boschloo's test: very slow execution
Date   Fri, 22 Feb 2008 14:18:14 -0500


I have a few ideas for potential speed improvements (besides moving it to Mata):

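1) Do you really need to get the p value to 4 decimals? If you changed

qui gen double p = (_n-1)/10001

to

qui gen double p = (_n-1)/1001

(with the number of observations reduced to match), the grid of p values would be about 90% smaller, and so would the calculations that are done once per grid point.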

2) I don't know much about this test, but wouldn't the quantity you are optimizing be a smooth function of p? If that is so, you may want to use an iterative approach to narrow the range of p. Start with p in steps of perhaps .01. Then keep just the interval on either side of the optimum and reduce the increment to .001, and so on. That may reduce the calculations substantially.
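Here is a rough sketch of that narrowing idea. It is not the Boschloo calculation: the objective is a stand-in (the binomial probability of 3 successes in 10 trials, which peaks at p = .3), and the local names lo, hi, step, bestp, and bestf are made up for illustration. It also assumes the function has a single peak; if it has several, a coarse first pass could settle on the wrong one. With three passes it evaluates the function 101 + 21 + 21 = 143 times instead of 10001.

```stata
* Sketch only: coarse-to-fine search for the p that maximizes a function.
* The objective below is a placeholder, not the Boschloo sum.
local lo   = 0
local hi   = 1
local step = 0.01

forvalues pass = 1/3 {
    local bestf = .
    local bestp = .
    * number of steps in the current interval
    local k = round((`hi' - `lo')/`step')
    forvalues i = 0/`k' {
        local p = `lo' + `i'*`step'
        * placeholder objective: replace with the real calculation
        local f = comb(10,3) * `p'^3 * (1 - `p')^7
        * missing compares as larger than any number, so treat i == 0 separately
        if `i' == 0 | `f' > `bestf' {
            local bestf = `f'
            local bestp = `p'
        }
    }
    * keep one step on either side of the best point, then refine
    local lo   = max(0, `bestp' - `step')
    local hi   = min(1, `bestp' + `step')
    local step = `step'/10
}

display "approximate optimum at p = " `bestp' ", value = " `bestf'
```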

3) Don't use -tabi-. -tabi- requires a preserve and has a lot of ado machinery to set up the desired table. It ends up creating a dataset with RxC observations and a frequency weight variable that contains the counts. You could instead hardwire a 2x2 table with variables for row, col, and fw. A loop would then just change the fw values to cycle through all of the n1 and n2 values. You could then use -post- to post the values of p_exact, xx1 and xx2. You may then be able to do the binomial calculation and summation just once on the resulting dataset.
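Something along these lines is what I have in mind for the hardwired table. Treat it as an untested sketch: the group sizes n1 and n2, the layout of the four observations, and the postfile handle and tempfile names are all my assumptions, not taken from the original program. The one thing to check is the degenerate tables (for example x1 = x2 = 0, where a whole column has zero weight); r(p_exact) may not be what you want there, and those cases may need to be handled separately.

```stata
* Sketch only: build the 2x2 table once, then just change the weights.
clear
set obs 4
gen byte row = cond(_n <= 2, 1, 2)          // group 1 or group 2
gen byte col = cond(mod(_n, 2) == 1, 1, 2)  // 1 = success, 2 = failure
gen long fw  = 0

local n1 = 5    // hypothetical group sizes
local n2 = 7

tempname results
tempfile pvals
postfile `results' xx1 xx2 double p_exact using "`pvals'"

forvalues x1 = 0/`n1' {
    forvalues x2 = 0/`n2' {
        quietly replace fw = `x1'        in 1
        quietly replace fw = `n1' - `x1' in 2
        quietly replace fw = `x2'        in 3
        quietly replace fw = `n2' - `x2' in 4
        * Fisher's exact test on the current table
        quietly tabulate row col [fw = fw], exact
        * r(p_exact) may be missing for degenerate tables; check those cases
        post `results' (`x1') (`x2') (r(p_exact))
    }
}
postclose `results'

* one observation per (xx1, xx2) pair, ready for the binomial step
use "`pvals'", clear
```

The posted dataset has (n1+1)*(n2+1) observations, so the binomial probabilities and the summation can be done on it with ordinary -generate- and -egen- statements rather than inside the loop.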

I'm not sure I understand the algorithm well enough to know exactly how these changes would work or how they could be optimized and combined, but I wouldn't be surprised by a large speed improvement.

Michael Blasnik
