# Re: st: permutations

 From Christopher Intemann To statalist@hsphsun2.harvard.edu Subject Re: st: permutations Date Mon, 17 Mar 2008 11:23:20 +0100

```Hello David,
thank you for the example, it was very helpful.
However, still the result is not what I'm actually up to.
```
xi: logistic caco i.snp i.age
i.snp _Isnp_0-2 (naturally coded; _Isnp_0 omitted)
i.age _Iage_35-60 (naturally coded; _Iage_35 omitted)

Logistic regression Number of obs = 2116
LR chi2(7) = 21.17
Prob > chi2 = 0.0035
Log likelihood = -1455.9021 Pseudo R2 = 0.0072

------------------------------------------------------------------------------
caco | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
_Isnp_1 | .9580097 .1193996 -0.34 0.731 .7503816 1.223088
_Isnp_2 | .9524632 .1218649 -0.38 0.703 .7412068 1.223931
_Iage_40 | 1.63372 .2251451 3.56 0.000 1.247017 2.14034
_Iage_45 | 1.545441 .216941 3.10 0.002 1.173722 2.034885
_Iage_50 | 1.634428 .2582141 3.11 0.002 1.199194 2.227627
_Iage_55 | 1.09121 .184666 0.52 0.606 .7831755 1.520399
_Iage_60 | 1.202247 .1956237 1.13 0.258 .8739581 1.653854
------------------------------------------------------------------------------

At this point, you already receive p-values representing the association of the genotypes to caco.
At next you are performing a Wald-test (test _Isnp_1 _Isnp_2) to receive a p-value, which is then permutated.
What I am trying to do is to get permutations of the p-values received by logistic, as in your example, the values 0.731 and 0.703.
How could that be done?
Many thanks to you and to Carlo for pointing me on the Good's textbooks, I'm about to arrange myself a copy.
Best regards,
Chris

```. test _Isnp_1 _Isnp_2

( 1)  _Isnp_1 = 0
( 2)  _Isnp_2 = 0

chi2(  2) =    0.16
Prob > chi2 =    0.9249
```
```
Am 14.03.2008 um 15:27 schrieb David Airey:

```
.

Here's an example using Stata 9/10 syntax. With permutations, you have to be careful about how the data are permuted. Sometimes they have to be permuted in a way that respects the structure (strata) of the data. Read texts by Good to familiarize with these issues. Here I'm ignoring any issues like that.

Stata makes the general permutations very easy to do. You first define a program to grab a statistic associated with the test you want, and then you use that program in a permute command.

***do_begin***

describe
table age caco snp

program plogistic, rclass
version 10
args caco snp age
xi: logistic `caco' i.`snp' i.`age'
test _Isnp_1 _Isnp_2
return scalar chi2 = r(chi2)
end

set seed 3132008

permute caco chi2=r(chi2), reps(1000): plogistic caco snp age

***do_end***

results pasted:

. describe

Contains data from js_person_1.dta
obs: 2,394
vars: 3 10 Dec 2007 10:56
size: 40,698 (99.7% of memory free)
--------------------------------------------------------------------------------
storage display value
variable name type format label variable label
--------------------------------------------------------------------------------
caco byte %8.0g caco Group
age float %9.0g RECODE of age_group
snp float %9.0g C
--------------------------------------------------------------------------------
Sorted by:
Note: dataset has changed since last saved

. table age caco snp

--------------------------------------------------------------------
| C and Group
RECODE of | ------- 0 ------ ------- 1 ------ ------- 2 ------
age_group | Control Case Control Case Control Case
----------+---------------------------------------------------------
35 | 39 24 90 75 90 57
40 | 37 48 103 119 92 103
45 | 30 42 109 111 80 88
50 | 21 33 70 61 42 61
55 | 22 12 65 58 43 31
60 | 29 21 52 49 59 50
--------------------------------------------------------------------

.
. program plogistic, rclass
1. version 10
2. args caco snp age
3. xi: logistic `caco' i.`snp' i.`age'
4. test _Isnp_1 _Isnp_2
5. return scalar chi2 = r(chi2)
6. end

.
. set seed 3132008

.
. permute caco chi2=r(chi2), reps(1000): plogistic caco snp age
(running plogistic on estimation sample)

Permutation replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
.................................................. 350
.................................................. 400
.................................................. 450
.................................................. 500
.................................................. 550
.................................................. 600
.................................................. 650
.................................................. 700
.................................................. 750
.................................................. 800
.................................................. 850
.................................................. 900
.................................................. 950
.................................................. 1000

Monte Carlo permutation results Number of obs = 2394

command: plogistic caco snp age
chi2: r(chi2)
permute var: caco

------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
------------- +----------------------------------------------------------------
chi2 | .1560796 920 1000 0.9200 0.0086 . 9014202 .936058
------------------------------------------------------------------------------
Note: confidence interval is with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}

.
.
.
end of do-file

The p values for the genotype test was 0.9249, not much different.

. xi: logistic caco i.snp i.age
i.snp _Isnp_0-2 (naturally coded; _Isnp_0 omitted)
i.age _Iage_35-60 (naturally coded; _Iage_35 omitted)

Logistic regression Number of obs = 2116
LR chi2(7) = 21.17
Prob > chi2 = 0.0035
Log likelihood = -1455.9021 Pseudo R2 = 0.0072

------------------------------------------------------------------------------
caco | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
_Isnp_1 | .9580097 .1193996 -0.34 0.731 .7503816 1.223088
_Isnp_2 | .9524632 .1218649 -0.38 0.703 .7412068 1.223931
_Iage_40 | 1.63372 .2251451 3.56 0.000 1.247017 2.14034
_Iage_45 | 1.545441 .216941 3.10 0.002 1.173722 2.034885
_Iage_50 | 1.634428 .2582141 3.11 0.002 1.199194 2.227627
_Iage_55 | 1.09121 .184666 0.52 0.606 .7831755 1.520399
_Iage_60 | 1.202247 .1956237 1.13 0.258 .8739581 1.653854
------------------------------------------------------------------------------

. test _Isnp_1 _Isnp_2

( 1) _Isnp_1 = 0
( 2) _Isnp_2 = 0

chi2( 2) = 0.16
Prob > chi2 = 0.9249

On Mar 14, 2008, at 5:44 AM, Christopher Intemann wrote:

```Hi Dave
Am 13.03.2008 um 15:24 schrieb David Airey:

```
.

permute permvar exp_list [, options] : command

You specified permvar but not exp_list, so I'm not sure what permute is defaulting to, maybe just the overall model statistic? You can write a small program to return the coefficients for the genetic dummy variables, and an overall test of those, that gets called by permute. The ANOVA example in the printed help shows how to do this.

c is the number of times the statistics is greater than T(obs)
n is the default number of permutations since you didn't specify this either

Below is an example from ATS UCLA (Stata version 9/10):

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear

* run logistic predicting "goodread" from "write"

* run permutation test for above with 10000 repetitions
permute female x2=e(chi2), reps(10000) nodots: logit goodread female

Since you want a statistic for the individual dummy variables and the test for both genetic dummy variables, you don't just want the results from the overall test, and you cannot just collect that using permute as the above example does.

I was using a pre-stata9 syntax of the permute command.

However, a command like

xi:permute affected_stata x2=e(chi2):logistic affected_stata i.gene age ethno_x sex_x

does calculate some permutations, but is pretty sure not what I want:-(

As i could calculate my p-values from the matrices e(b) and e(V), which are generated by logistic, i would rather like to use these in the exp_list. However, this does not work:
"matrix operators that return matrices not allowed in this context"

Is there any other way to permute or calculate the p-values?
Just similar to what other programs for genetic analysis (like unphased, plink or haploview) do?
I just came along cvpermute, maybe this could do the trick?
Chris
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
```--
David C. Airey, Ph.D.
Pharmacology Research Assistant Professor
Center for Human Genetics Research Member

Department of Pharmacology
School of Medicine
Vanderbilt University
Rm 8158A Bldg MR3
465 21st Avenue South
Nashville, TN 37232-8548

TEL   (615) 936-1510
FAX   (615) 936-3747
EMAIL david.airey@vanderbilt.edu
URL   http://people.vanderbilt.edu/~david.c.airey/dca_cv.pdf
URL   http://www.vanderbilt.edu/pharmacology

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
```*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```