.-
help for ^permute^      @net:from http://www.stata.com/users/scarter!http://www.stata.com/users/~scarter@
.-

Monte Carlo permutation tests
-----------------------------

    ^permute^ progname varname1 [varlist] [^,^ ^by(^groupvars^)^ ^r^eps^(^#^)^
            ^di^splay^(^#^)^ { ^le^ft | ^ri^ght } no^p^rob ^eps(^#^)^ ^post(^filename^)^
            ^do^uble ^ev^ery^(^#^)^ ^replace^ ]


Description
-----------

^permute^ estimates p-values for permutation tests based on Monte Carlo
simulations.

progname is the name of a program that computes the test statistic and places
its value in the global macro ^S_1^.  The arguments to progname are
varname1 [varlist].

For each repetition, the values of varname1 are randomly permuted, progname is
called to compute the test statistic, and a count is kept of whether this
value of the test statistic is more extreme than the observed test statistic.

The values of the test statistic for each random permutation can also be
stored in a dataset using the ^post()^ option.


Options
-------

^by(^groupvars^)^ specifies that the permutations be performed within each
    group defined by the values of groupvars; i.e., group membership is fixed
    and the values of varname1 are independently permuted within each group.
    For example, this permutation scheme is used for randomized-block anova
    to permute values within each block.

^reps(^#^)^ specifies the number of random permutations to perform.  Default
    is 100.

^display(^#^)^ displays output every #-th random permutation.  Default is 10.
    ^display(0)^ suppresses all but the final output.

^left^ | ^right^ request that one-sided p-values be computed.  If ^left^ is
    specified, an estimate of Pr(T <= T(obs)) is produced, where T is the
    test statistic and T(obs) is its observed value.  If ^right^ is
    specified, an estimate of Pr(T >= T(obs)) is produced.  By default,
    two-sided p-values are computed; i.e., Pr(|T| >= |T(obs)|) is estimated.

^noprob^ specifies that no p-values are to be computed.

^eps(^#^)^ specifies the numerical tolerance for testing |T| >= |T(obs)|,
    T <= T(obs), or T >= T(obs).  These are considered true if, respectively,
    |T| >= |T(obs)| - #, T <= T(obs) + #, or T >= T(obs) - #.  By default,
    it is 1e-7.  ^eps()^ should not have to be set under normal
    circumstances.

^post(^filename^)^ specifies the name of a ^.dta^ file that will be created
    holding the values of the test statistic computed for each random
    permutation.

^double^ can only be specified when using ^post()^.  It specifies that the
    values of the test statistic be stored as type ^double^; default is type
    ^float^.  See help @datatypes@.

^every(^#^)^ can only be specified when using ^post()^.  It specifies that the
    values of the test statistic be saved to disk every #-th repetition; see
    help @postfile@.

^replace^ indicates that the file specified by ^post()^ may already exist
    and, if it does, it can be erased and replaced by a new one.


Remarks
-------

^permute^ works faster when varname1 is a 0/1 variable (with no missing
values).  So, if using a 0/1 variable, specify it as the one to be permuted.


Guidelines for the program
--------------------------

progname must have the following outline:

        program define progname
                compute test statistic
                global S_1 = test statistic
        end

Arguments to progname are varname1 [varlist]; i.e., the same variables that
are specified with ^permute^ are passed to progname.

Here is an example of a program that estimates the permutation distribution
p-value for the Pearson correlation coefficient:

        program define permpear
                quietly corr `1' `2'
                global S_1 = _result(4)
        end

To use this program, call ^permute^ using

        . ^permute permpear x y^

In addition, the global macro S_1 is set to "first" for the first call to
progname, which computes the observed test statistic T(obs); i.e., T(obs) is
the value of the test statistic for the unpermuted data.

Thus, progname can optionally have the form:

        program define progname /* args = varname1 [varlist] */
                if "$S_1" == "first" {
                        do initial computations
                }
                compute test statistic
                global S_1 = test statistic
        end

Here is an example of a program that estimates the permutation distribution
p-value for the two-sample t test:

        program define permt2
                local grp "`1'"
                local x "`2'"
                tempvar sum
                quietly {
                        if "$S_1"=="first" {
                                /* first call only: _TOTAL = n1 * x_bar */
                                gen double `sum' = sum(`x')
                                scalar _TOTAL = `sum'[_N]
                                drop `sum'
                                summarize `grp'
                                scalar _GROUP1 = _result(5)
                                count if `grp'==_GROUP1
                                scalar _TOTAL = (_result(1)/_N)*_TOTAL
                        }
                        /* every call: sum of x over group 1, minus _TOTAL */
                        gen double `sum' = sum((`grp'==_GROUP1)*`x')
                        global S_1 = `sum'[_N] - _TOTAL
                }
        end

Note that the statistic

        T = (sum over i in group 1) x_i - n1 * x_bar

is used, where x_bar is the mean of the groups combined and n1 is the number
of observations in group 1.  This statistic is equivalent, under the
permutation distribution, to the standard t statistic.

To use this program, call ^permute^ using

        . ^permute permt2 group x^


Examples
--------

 . ^permute permpear x y^
 . ^permute permpear x y, reps(1000)^
 . ^permute permpear x y, reps(10000) display(100)^
 . ^permute permpear x y, reps(1000) di(100) post(pearson)^
 . ^permute permpear x y, reps(10000) di(1000) post(pearson) replace /*^
        ^*/ every(1000) double^

 . ^permute permt2 group x^
 . ^permute permt2 group x, left^

 . ^permute panova treat outcome subject, by(subject) reps(1000)^


Author
------

        William Sribney, StataCorp, 1998.


Also see
--------

 Manual:  [5s] simul, [6a] postfile
On-line:  help for @postfile@, @simul@
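
The counting procedure outlined under Description, together with the tolerance rule given for ^eps()^, can be illustrated with a small stand-alone sketch.  This is not the source code of ^permute^; it only mimics the two-sided case for the Pearson correlation.  It assumes numeric variables x and y with no missing values are in memory, and a Stata release recent enough to provide forvalues, runiform(), and r(rho).  The names id, u, pos, xperm, and Tobs are hypothetical, and reporting the count divided by the number of repetitions as the p-value estimate is an assumption about how the estimate is formed.

```stata
* Illustrative sketch only; this is not how permute itself is implemented.
* Assumes numeric variables x and y (no missing values) are in memory.
set seed 12345                  // arbitrary seed, for reproducibility
local reps 100                  // mirrors the documented default of reps()
local eps  1e-7                 // mirrors the documented default of eps()

quietly correlate x y
scalar Tobs = r(rho)            // observed statistic on the unpermuted data

gen long   id    = _n           // remembers the original observation order
gen double u     = .
gen long   pos   = .
gen double xperm = .

local c 0
forvalues i = 1/`reps' {
    quietly {
        replace u = runiform()
        sort u
        replace pos = _n        // random rank attached to each observation
        sort id                 // restore the original order
        replace xperm = x[pos]  // values of x in randomly permuted order
        correlate xperm y
    }
    * two-sided comparison with tolerance: |T| >= |T(obs)| - eps
    if abs(r(rho)) >= abs(scalar(Tobs)) - `eps' {
        local c = `c' + 1
    }
}
display "two-sided p-value estimate = " `c'/`reps'
drop id u pos xperm
```

For the one-sided cases, the comparison inside the loop would become r(rho) <= scalar(Tobs) + `eps' for ^left^ or r(rho) >= scalar(Tobs) - `eps' for ^right^, following the rules stated for ^eps()^.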