# Re: st: permutations

 From David Airey To statalist@hsphsun2.harvard.edu Subject Re: st: permutations Date Mon, 17 Mar 2008 11:49:14 -0500

```On Mar 17, 2008, at 10:55 AM, Christopher Intemann wrote:

```
```Hi Dave,
Am 17.03.2008 um 14:53 schrieb David Airey:

```
.

I calculated a test of a difference between any two SNP genotypes. You are saying you want to know which pairwise comparisons are different. If you just want those two probabilities (genotype 2 = genotype 0, genotype 1 = genotype 0), you get these by adding a couple lines to our code, and modifying the permute command as below. To get the other comparison (genotype 2 = genotype 1), you would need to use a different base comparison, or you would need to use the command lincom.

***do_begin***

describe
table age caco snp

program plogistic, rclass
version 10
args caco snp age
xi: logistic `caco' i.`snp' i.`age'
test _Isnp_1 _Isnp_2
return scalar chi2 = r(chi2)
test _Isnp_1
return scalar chi2_1 = r(chi2)
test _Isnp_2
return scalar chi2_2 = r(chi2)
end

set seed 3132008

permute caco chi2=r(chi2) chi2_1=r(chi2_1) chi2_2=r(chi2_2), ///
reps(1000): plogistic caco snp age

***do_end***

Monte Carlo permutation results Number of obs = 2394

command: plogistic caco snp age
chi2: r(chi2)
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: caco

------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
------------- +----------------------------------------------------------------
chi2 | .1560796 920 1000 0.9200 0.0086 . 9014202 .936058
chi2_1 | .1184664 724 1000 0.7240 0.0141 . 6951618 .7515127
chi2_2 | .1448991 719 1000 0.7190 0.0142 . 690024 .7466803
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}

.

Thank very much you for your efforts.
Your solution does exactly what I was looking for; figured it out already:-)
The only thing is that I now have some trouble obtaining the full p- value.
<snip>
Monte Carlo permutation results Number of obs = 4356

command: plogistic affected_stata snp age ethno_x sex_x
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: affected_stata

------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
------------- +----------------------------------------------------------------
chi2_1 | 10.44501 2 1000 0.0020 0.0014 . 0002423 .0072058
chi2_2 | 16.76662 0 1000 0.0000 0.0000 0 .0036821
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}

</snip>

According to stata help, results are stored in several matrices.
To get the p-values, I'm using these command:

The p value is simply p=c/n in the tabled results. It is 0.002, and 0.000. Increase your number of permutations if you want something more exact than 2/1000, or 0/1000. Try 10,000 permutations, instead of 1000. You should try to report an exact probability. This is especially important if you are running many SNPs and later want to apply a multiple test correction procedure. There, 1x10-6 or 1x10-7 matters a lot.

The seed is necessary for replication. If you don't set the seed you will get chance variation in your results, because permute will be using different random numbers; setting the seed makes permute use the same random numbers. It will not affect your statistical conclusions. Most people will seed the random number generator.

-Dave

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/