
From: n j cox <n.j.cox@durham.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Why do some observations fail to be stored when I use a loop with
Date: Sun, 28 Oct 2007 22:22:26 +0000

I confirm the kind of behaviour you report in example 2 within
Stata 8.2, so I suspect it has nothing to do with the version of
Stata. You have been bitten, I guess, by a limitation of -genhwi-.
Why should it produce different results with the same data? There is
a -sort- in the code, which suggests to me some dependence somewhere
on exact sort order: with ties, -sort- is not guaranteed to produce
the same sort order w.r.t. the original observation numbers. I don't
know why this should make a difference, but I can't see anything else
that could differ between runs.
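
A minimal sketch of the tie behaviour just described, using a made-up
six-observation dataset (the variable names id and g are illustrative
and are not taken from -genhwi-). With every observation tied on the
sort key, plain -sort- may return the observations in any order,
whereas -sort, stable- keeps tied observations in their existing order.

```stata
clear
set obs 6
generate id = _n        // original observation number
generate g = 1          // every observation is tied on the sort key

sort g                  // ties: the resulting order of id is not guaranteed
list id g

sort id                 // restore the original order
sort g, stable          // ties keep their current relative order
assert id == _n         // holds because of the stable option
```

Any code that relies on "the first observation" after an unstable sort
on a tied key could therefore see different observations from run to run.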

What is more, the proportion of results I get back as non-missing is
close to 1 - exp(-1), which may ring bells from a probability course.
-sort, stable- seems to reduce the problem, but not to eliminate it.
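
For readers who do not hear the bell: 1 - exp(-1), about 0.632, is the
limiting probability that a given item turns up at least once in n
independent draws with replacement from n items. Whether that mechanism
is what operates inside -genhwi- is not established here; the lines
below only check the arithmetic, with n = 1000 chosen as an assumption
to match the number of observations in example 2.

```stata
local n 1000
display 1 - (1 - 1/`n')^`n'    // chance of at least one hit in n draws from n items
display 1 - exp(-1)            // limiting value as n grows, about 0.632
```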

I'd get in touch with the author, Mario A. Cleves. He

has not been a visible member of Statalist recently.

Nick

n.j.cox@durham.ac.uk

tiago.pereira@incor.usp.br

I have found a problem and, not surprisingly, I have no idea why it
happens. I am using Stata 9.2 SE (Windows XP emulated in Fedora).
Concisely, I wish to run a loop, compute a statistic, and store the
results in my current dataset. The problem is that some observations
of the statistic of interest are not computed/stored.

For example, suppose I want to compute a P value from an exact test.
Let's consider the exact P value obtained by -genhwi-, a program used
in genetics.

----- Example 1 ---------------

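clear
local example 10
set obs `example'
* three random integer counts between 0 and 25 per observation
gene a = round(uniform()*25)
gene b = round(uniform()*25)
gene c = round(uniform()*25)
* placeholder for the exact P value
qui gene p_example=.
forvalues i = 1/`example' {
    * run -genhwi- on the counts in the current observation, keep r(p_exact)
    qui genhwi `=a[`i']' `=b[`i']' `=c[`i']'
    qui replace p_example = r(p_exact) in `i'
}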

---- End Example 1 -----------

With Example 1 as written, I get all results without problems.
However, if I increase the number of observations (say, local example
1000), then several observations are not computed/stored. For example:

  a    b    c   p_example
 15    7    4    .5175038
  7   11    3           .
 23   20   18           .
  8   10   22           .
  0   20   20    .0000885
 17   23    1    .0132223
 10   12   15           1
  .
  .
  .
  6    2   20
 16   16   12    .0000974
  4   17   16
  7    2    8    .4895405

Interestingly enough, this also happens when all observations are equal,

that is:

----- Example 2 ---------------

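clear
local example 1000
set obs `example'
* identical counts in every observation
gene a = 25
gene b = 50
gene c = 25
* placeholder for the exact P value
qui gene p_example=.
forvalues i = 1/`example' {
    * run -genhwi- on the counts in the current observation, keep r(p_exact)
    qui genhwi `=a[`i']' `=b[`i']' `=c[`i']'
    qui replace p_example = r(p_exact) in `i'
}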

---- End Example 2 -----------

which gives:

  a    b    c   p_example
 25   50   25           .
 25   50   25           1
 25   50   25           .
 25   50   25           1
 25   50   25           1
 25   50   25           1
 25   50   25           .
 25   50   25           1
  .
  .
  .
 25   50   25           .
 25   50   25           1
 25   50   25           .
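
One way to put a number on how many results go missing is sketched
below. The program name share_nonmissing is hypothetical, introduced
here only for illustration; it reports the share of non-missing values
of a variable next to 1 - exp(-1) and does not depend on -genhwi-.

```stata
capture program drop share_nonmissing
program define share_nonmissing, rclass
    version 8
    syntax varname
    // count the observations in which the variable is not missing
    quietly count if !missing(`varlist')
    return scalar share = r(N) / _N
    display "share non-missing = " %5.3f return(share) ///
        "   1 - exp(-1) = " %5.3f 1 - exp(-1)
end
```

With the dataset from Example 2 still in memory, typing
share_nonmissing p_example would show the share, which is also left
behind in r(share).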

I do not have Stata 10 on my home PC, but I will try to find out
whether the same problem occurs in that version of Stata. If you have
any tips or suggestions, I will, as usual, appreciate your time and
help.

*

* For searches and help try:

* http://www.stata.com/support/faqs/res/findit.html

* http://www.stata.com/support/statalist/faq

* http://www.ats.ucla.edu/stat/stata/
