# Re: st: Why do some observations fail to be stored when I use a loopwith

From: n j cox <[email protected]>
To: [email protected]
Subject: Re: st: Why do some observations fail to be stored when I use a loopwith
Date: Sun, 28 Oct 2007 22:22:26 +0000

I can confirm the behaviour you report in Example 2
under Stata 8.2, so I suspect it has nothing to
do with the version of Stata. You have been bitten, I guess,
by a limitation of -genhwi-. Why should it produce
different results with the same data? There is a -sort-
in the code, which suggests to me some dependence somewhere
on exact sort order: with ties, -sort- is not guaranteed to produce
the same sort order w.r.t. the original observation
numbers. I don't know why this should make a difference,
but I can't see anything else that could differ between runs.
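
To illustrate the point about ties, here is a minimal sketch on made-up data. It has nothing to do with the internals of -genhwi-, which I have not traced; the variable names id and key are invented for the illustration. With every observation tied on the sort key, plain -sort- is free to return the observations in any order, while the stable option keeps tied observations in their current relative order.

```stata
* Sketch only: hypothetical variables id and key, not taken from -genhwi-.
clear
set obs 10
generate id = _n
generate key = 1

* All observations tie on key, so the resulting order of id
* is not guaranteed to match the original observation numbers.
sort key
list id, noobs

* Restore the original order, then sort again with the stable option:
* tied observations keep their current relative order.
sort id
sort key, stable
list id, noobs
```

Running the first -sort- several times may or may not show a different order each time; the second listing should always be 1 to 10.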

What is more, the proportion of results I get back
as non-missing is close to 1 - exp(-1), which may
ring bells from a probability course. Using -sort, stable-
seems to reduce the problem, but not to eliminate it.
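
For reference, 1 - exp(-1) is about 0.632. One classic place where it turns up (this is an assumption about which result is being alluded to, not something established about -genhwi-) is drawing n items with replacement from n: the chance that a given item is never drawn is (1 - 1/n)^n, which tends to exp(-1), so the expected share of items drawn at least once tends to 1 - exp(-1).

```latex
\[
\Pr(\text{a given item is never drawn})
  = \left(1 - \frac{1}{n}\right)^{n}
  \;\longrightarrow\; e^{-1} \approx 0.368
  \quad (n \to \infty),
\qquad
1 - e^{-1} \approx 0.632 .
\]
```

With n = 1000, as in Example 2, (1 - 1/1000)^1000 is already about 0.3677, so the limit is a good approximation.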

I'd get in touch with the author, Mario A. Cleves. He
has not been a visible member of Statalist recently.

Nick
[email protected]

[email protected]

I have found a problem and, not surprisingly, I have no idea why it
happens. I am using Stata 9.2 SE (Windows XP emulated in Fedora).

Concisely, I wish to run a loop, compute a statistic, and store the
results in my current dataset. The problem is that some observations
of the statistic of interest are not computed/stored.

For example, suppose I want to compute a P value from an exact test. Let's
consider the exact P value obtained by -genhwi-, a program used in
genetics.

----- Example 1 ---------------
clear
local example 10
set obs `example'
gene a = round(uniform()*25)
gene b = round(uniform()*25)
gene c = round(uniform()*25)

qui gene p_example=.

forvalues i = 1/`example' {
qui genhwi `=a[`i']' `=b[`i']' `=c[`i']'
qui replace p_example = r(p_exact) in `i'
}
---- End Example 1 -----------

Using the example just described above, I successfully get all results
without problems. Nevertheless, if I increase the number of observations,
say, local example 1000, then several observations are not
computed/stored. For example:

 a   b   c   p_example
15   7   4   .5175038
 7  11   3   .
23  20  18   .
 8  10  22   .
 0  20  20   .0000885
17  23   1   .0132223
10  12  15   1
.
.
.
 6   2  20
16  16  12   .0000974
 4  17  16
 7   2   8   .4895405

Interestingly enough, this also happens when all observations are equal,
that is:

----- Example 2 ---------------
clear
local example 1000
set obs `example'
gene a = 25
gene b = 50
gene c = 25

qui gene p_example=.

forvalues i = 1/`example' {
qui genhwi `=a[`i']' `=b[`i']' `=c[`i']'
qui replace p_example = r(p_exact) in `i'
}
---- End Example 2 -----------

which gives:

a b c p_example
25 50 25 .
25 50 25 1
25 50 25 .
25 50 25 1
25 50 25 1
25 50 25 1
25 50 25 .
25 50 25 1
.
.
.
25 50 25 .
25 50 25 1
25 50 25 .

I do not have Stata 10 on my home PC, but I will try to find out whether the
same problem occurs in that version of Stata. If you have any tips or
suggestions, I would, as usual, appreciate your time and help.
