Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

R: st: permutations


From   "Carlo Lazzaro" <carlo.lazzaro@tin.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   R: st: permutations
Date   Fri, 14 Mar 2008 16:07:33 +0100

Dear Chris, as quoted by Dave, just two (hopefully useful) references for
Good's textbooks:

Phillip Good. Permutation, Parametric, and Bootstrap Tests of Hypotheses,
3rd Edition. Springer, 2005

Phillip I. Good. Resampling Methods: A Practical Guide to Data Analysis, 3rd
Edition. Birkhäuser, 2006

I would also take a look at Chapter 15 of Efron B, Tibshirani RJ. An
Introduction to the Bootstrap. New York: Chapman & Hall, 1993.

Kind Regards,

Carlo

-----Messaggio originale-----
Da: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] Per conto di David Airey
Inviato: venerdì 14 marzo 2008 15.27
A: statalist@hsphsun2.harvard.edu
Oggetto: Re: st: permutations

.

Here's an example using Stata 9/10 syntax. With permutations, you have  
to be careful about how the data are permuted. Sometimes they have to  
be permuted in a way that respects the structure (strata) of the data.  
Read texts by Good to familiarize with these issues. Here I'm ignoring  
any issues like that.

Stata makes the general permutations very easy to do. You first define  
a program to grab a statistic associated with the test you want, and  
then you use that program in a permute command.

***do_begin***

describe
table age caco snp

program plogistic, rclass
	version 10
	args caco snp age
	xi: logistic `caco' i.`snp' i.`age'
	test _Isnp_1 _Isnp_2
	return scalar chi2 = r(chi2)
end

set seed 3132008

permute caco chi2=r(chi2), reps(1000): plogistic caco snp age

***do_end***

results pasted:


. describe

Contains data from js_person_1.dta
   obs:         2,394
  vars:             3                          10 Dec 2007 10:56
  size:        40,698 (99.7% of memory free)
----------------------------------------------------------------------------
----
               storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------------
----
caco            byte   %8.0g       caco       Group
age             float  %9.0g                  RECODE of age_group
snp             float  %9.0g                  C
----------------------------------------------------------------------------
----
Sorted by:
      Note:  dataset has changed since last saved

. table age caco snp

--------------------------------------------------------------------
           |                       C and Group
RECODE of | ------- 0 ------    ------- 1 ------    ------- 2 ------
age_group | Control     Case    Control     Case    Control     Case
----------+---------------------------------------------------------
        35 |      39       24         90       75         90       57
        40 |      37       48        103      119         92      103
        45 |      30       42        109      111         80       88
        50 |      21       33         70       61         42       61
        55 |      22       12         65       58         43       31
        60 |      29       21         52       49         59       50
--------------------------------------------------------------------

.
. program plogistic, rclass
   1.         version 10
   2.         args caco snp age
   3.         xi: logistic `caco' i.`snp' i.`age'
   4.         test _Isnp_1 _Isnp_2
   5.         return scalar chi2 = r(chi2)
   6. end

.
. set seed 3132008

.
. permute caco chi2=r(chi2), reps(1000): plogistic caco snp age
(running plogistic on estimation sample)

Permutation replications (1000)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
..................................................   150
..................................................   200
..................................................   250
..................................................   300
..................................................   350
..................................................   400
..................................................   450
..................................................   500
..................................................   550
..................................................   600
..................................................   650
..................................................   700
..................................................   750
..................................................   800
..................................................   850
..................................................   900
..................................................   950
..................................................  1000

Monte Carlo permutation results                   Number of obs    
=       2394

       command:  plogistic caco snp age
          chi2:  r(chi2)
   permute var:  caco

----------------------------------------------------------------------------
--
T            |     T(obs)       c       n   p=c/n   SE(p) [95% Conf.  
Interval]
------------- 
+----------------------------------------------------------------
         chi2 |   .1560796     920    1000  0.9200  0.0086  . 
9014202    .936058
----------------------------------------------------------------------------
--
Note:  confidence interval is with respect to p=c/n.
Note:  c = #{|T| >= |T(obs)|}

.
.
.
end of do-file


The p values for the genotype test was 0.9249, not much different.

. xi: logistic caco i.snp i.age
i.snp             _Isnp_0-2           (naturally coded; _Isnp_0 omitted)
i.age             _Iage_35-60         (naturally coded; _Iage_35  
omitted)

Logistic regression                               Number of obs    
=       2116
                                                   LR chi2(7)       
=      21.17
                                                   Prob > chi2      
=     0.0035
Log likelihood = -1455.9021                       Pseudo R2        
=     0.0072

----------------------------------------------------------------------------
--
         caco | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf.  
Interval]
------------- 
+----------------------------------------------------------------
      _Isnp_1 |   .9580097   .1193996    -0.34   0.731     .7503816     
1.223088
      _Isnp_2 |   .9524632   .1218649    -0.38   0.703     .7412068     
1.223931
     _Iage_40 |    1.63372   .2251451     3.56   0.000      
1.247017     2.14034
     _Iage_45 |   1.545441    .216941     3.10   0.002     1.173722     
2.034885
     _Iage_50 |   1.634428   .2582141     3.11   0.002     1.199194     
2.227627
     _Iage_55 |    1.09121    .184666     0.52   0.606     .7831755     
1.520399
     _Iage_60 |   1.202247   .1956237     1.13   0.258     .8739581     
1.653854
----------------------------------------------------------------------------
--

. test _Isnp_1 _Isnp_2

  ( 1)  _Isnp_1 = 0
  ( 2)  _Isnp_2 = 0

            chi2(  2) =    0.16
          Prob > chi2 =    0.9249





On Mar 14, 2008, at 5:44 AM, Christopher Intemann wrote:

> Hi Dave
> Am 13.03.2008 um 15:24 schrieb David Airey:
>
>> .
>>
>> Do you have access to the manual help or online help? The syntax is:
>>
>> permute permvar exp_list [, options] : command
>>
>> You specified permvar but not exp_list, so I'm not sure what  
>> permute is defaulting to, maybe just the overall model statistic?  
>> You can write a small program to return the coefficients for the  
>> genetic dummy variables, and an overall test of those, that gets  
>> called by permute. The ANOVA example in the printed help shows how  
>> to do this.
>>
>> T is your statistic
>> T(obs) is your observed statistic
>> c is the number of times the statistics is greater than T(obs)
>> n is the default number of permutations since you didn't specify  
>> this either
>>
>> Below is an example from ATS UCLA (Stata version 9/10):
>>
>> * read your data into Stata
>> use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
>> generate goodread = (read > 60)
>>
>> * run logistic predicting "goodread" from "write"
>> logit goodread female
>>
>> * run permutation test for above with 10000 repetitions
>> permute female x2=e(chi2), reps(10000) nodots: logit goodread female
>>
>> Since you want a statistic for the individual dummy variables and  
>> the test for both genetic dummy variables, you don't just want the  
>> results from the overall test, and you cannot just collect that  
>> using permute as the above example does.
>>
>>
>
> I was using a pre-stata9 syntax of the permute command.
>
> However, a command like
>
> xi:permute affected_stata x2=e(chi2):logistic affected_stata i.gene  
> age ethno_x sex_x
>
> does calculate some permutations, but is pretty sure not what I  
> want:-(
>
> As i could calculate my p-values from the matrices e(b) and e(V),  
> which are generated by logistic, i would  rather like to use these  
> in the exp_list. However, this does not work:
> "matrix operators that return matrices not allowed in this context"
>
> Is there any other way to permute or calculate the p-values?
> Just similar to what other programs for genetic analysis (like  
> unphased, plink or haploview) do?
> I just came along cvpermute, maybe this could do the trick?
> Many thanks in advance,
>   Chris
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

--
David C. Airey, Ph.D.
Pharmacology Research Assistant Professor
Center for Human Genetics Research Member

Department of Pharmacology
School of Medicine
Vanderbilt University
Rm 8158A Bldg MR3
465 21st Avenue South
Nashville, TN 37232-8548

TEL   (615) 936-1510
FAX   (615) 936-3747
EMAIL david.airey@vanderbilt.edu
URL   http://people.vanderbilt.edu/~david.c.airey/dca_cv.pdf
URL   http://www.vanderbilt.edu/pharmacology



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index