Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: permutations


From   Christopher Intemann <intemann@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: permutations
Date   Mon, 17 Mar 2008 16:55:48 +0100

Hi Dave,
Am 17.03.2008 um 14:53 schrieb David Airey:

.

I calculated a test of a difference between any two SNP genotypes. You are saying you want to know which pairwise comparisons are different. If you just want those two probabilities (genotype 2 = genotype 0, genotype 1 = genotype 0), you get these by adding a couple lines to our code, and modifying the permute command as below. To get the other comparison (genotype 2 = genotype 1), you would need to use a different base comparison, or you would need to use the command lincom.

***do_begin***

describe
table age caco snp

program plogistic, rclass
version 10
args caco snp age
xi: logistic `caco' i.`snp' i.`age'
test _Isnp_1 _Isnp_2
return scalar chi2 = r(chi2)
test _Isnp_1
return scalar chi2_1 = r(chi2)
test _Isnp_2
return scalar chi2_2 = r(chi2)
end

set seed 3132008

permute caco chi2=r(chi2) chi2_1=r(chi2_1) chi2_2=r(chi2_2), ///
reps(1000): plogistic caco snp age

***do_end***


Monte Carlo permutation results Number of obs = 2394

command: plogistic caco snp age
chi2: r(chi2)
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: caco

------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
------------- +----------------------------------------------------------------
chi2 | .1560796 920 1000 0.9200 0.0086 . 9014202 .936058
chi2_1 | .1184664 724 1000 0.7240 0.0141 . 6951618 .7515127
chi2_2 | .1448991 719 1000 0.7190 0.0142 . 690024 .7466803
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}

.

Thank very much you for your efforts.
Your solution does exactly what I was looking for; figured it out already:-)
The only thing is that I now have some trouble obtaining the full p- value.
<snip>
Monte Carlo permutation results Number of obs = 4356

command: plogistic affected_stata snp age ethno_x sex_x
chi2_1: r(chi2_1)
chi2_2: r(chi2_2)
permute var: affected_stata

------------------------------------------------------------------------------
T | T(obs) c n p=c/n SE(p) [95% Conf. Interval]
------------- +----------------------------------------------------------------
chi2_1 | 10.44501 2 1000 0.0020 0.0014 . 0002423 .0072058
chi2_2 | 16.76662 0 1000 0.0000 0.0000 0 .0036821
------------------------------------------------------------------------------
Note: confidence intervals are with respect to p=c/n.
Note: c = #{|T| >= |T(obs)|}

</snip>

According to stata help, results are stored in several matrices.
To get the p-values, I'm using these command:

matrix z=r(p)

. sca z1=z[1,1]

. sca z2=z[1,2]

However, this only leads me to the following, unsatisfacting output:

. sca list
z2 = 0
z1 = .002

Is this because the association (well known already, unfortunately:-)) is so strong, that stata cannot calculate the whole p-value?
How whould I cite this value? P<0.0001 ?
I would be happy if I could cite a more exact p-value.
Here is a list of all values permute stores in r()

Scalars
r(N) sample size
r(N_reps) number of requested replications
r(level) confidence level
r(k_exp) number of standard expressions
r(k_eexp) number of _b/_se expressions

Macros
r(cmd) permute
r(command) command following colon
r(permvar) permutation variable
r(title) title in output
r(exp#) #th expression
r(left) left or empty
r(right) right or empty
r(seed) initial random-number seed
r(event) T <= T(obs), T >= T(obs), or |T| <= |T(obs)|

Matrices
r(b) observed statistics
r(c) count when r(event) is true
r(reps) number of nonmissing results
r(p) observed proportions
r(se) standard errors of observed proportions
r(ci) confidence intervals of observed proportions


BTW: Does the seed set at the beginning of the calculations matter in any way?
Thanks again,
Chris



On Mar 17, 2008, at 5:23 AM, Christopher Intemann wrote:

Hello David,
thank you for the example, it was very helpful.
However, still the result is not what I'm actually up to.
In your example, you're calculating:
xi: logistic caco i.snp i.age
i.snp _Isnp_0-2 (naturally coded; _Isnp_0 omitted)
i.age _Iage_35-60 (naturally coded; _Iage_35 omitted)

Logistic regression Number of obs = 2116
LR chi2(7) = 21.17
Prob > chi2 = 0.0035
Log likelihood = -1455.9021 Pseudo R2 = 0.0072

------------------------------------------------------------------------------
caco | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
_Isnp_1 | .9580097 .1193996 -0.34 0.731 .7503816 1.223088
_Isnp_2 | .9524632 .1218649 -0.38 0.703 .7412068 1.223931
_Iage_40 | 1.63372 .2251451 3.56 0.000 1.247017 2.14034
_Iage_45 | 1.545441 .216941 3.10 0.002 1.173722 2.034885
_Iage_50 | 1.634428 .2582141 3.11 0.002 1.199194 2.227627
_Iage_55 | 1.09121 .184666 0.52 0.606 .7831755 1.520399
_Iage_60 | 1.202247 .1956237 1.13 0.258 .8739581 1.653854
------------------------------------------------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index