Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: Paired proportions- power calculation


From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   Re: st: Paired proportions- power calculation
Date   Wed, 27 Aug 2003 03:53:25 +0900

Andrew McIntosh posted:

--------------------------------------------------------------------------------

I (will) have a sample of 30 pairs of people and I want to calculate the
power to detect a difference of 5, 10, or 15 events between the two
(paired ) groups, assuming the control group have an event rate of about
50%.

Unforunately, the 'sampsi' command doesn't seem to extend to matched
pairs and I wondered if there was a method of completing this
analysis using another STATA command? Thank you for any help...

--------------------------------------------------------------------------------

Simulation might be worth considering here.  Alan Feiveson wrote an article in 
_Stata Journal_ last year (Volume 2, Issue 2) that might be of interest.

As a first-pass crude illustration of its potential here, I've appended a 
rudimentary Monte Carlo do-file to this posting.  It estimates power for two 
tests, McNemar's test and random-effects probit regression, for a given level 
of Type 1 error rate for a given correlation coefficient between the variables 
for the sample size of 30 pairs of binary outcomes and a given delta of 5, 10 
or 15 events.  (Based upon Andrew's casting of the control group event rate as 
a proportion, I'm assuming that the observation intervals or exposures are the 
same for everyone.)

I've parameterized the underlying generating function in probit terms--see J. 
S. Long, _Regression Models for Categorical and Limited Dependent Variables_ 
(Thousand Oaks, Calif.:  Sage, 1997), pp. 40-50.  This was because specifying 
correlated binomial random variates seems feasible that way.  (There is an 
alternative, apparently easily implemented algorithm for generating correlated 
binomial random variates by A. D. Lunn and S. J. Davies, A note on generated 
correlated binary variables.  _Biometrika_ 85:487-90, 1998.)

The do-file uses -simulate-.  The called program expects options for the delta 
(del: 5, 10 or 15 in Andrew's case), whether the likelihood ratio test for -
xtprobit- is desired (lrt), the tetrachoric correlation coefficient (cor) and 
the level of the Type 1 error rate (sig).  The first of the three returned 
results (st_pos) is the number of statistically significant test results by 
McNemar's test (which is equivalent to the sign test), and the second (lr_pos) 
is the number of statistically significant test results for a likelihood ratio 
test after -xtprobit-.  The averages of each of these, as reported by -
summarize-, gives the power of each test under the conditions specified.  The 
third returned result (rho) is the estimate of the rho parameter as reported by 
-xtprobit-.

Although the phenomenon being simulated doesn't need to be regarded as a 
manifestation of a latent trait or latent variable, a tetrachoric correlation 
coefficient needs to be specified in order to perform this do-file's 
simulation.  In the do-file below, I've spanned correlation coefficients from 
50% through 70% to an improbable 90%. The number of repetitions is set to 400 
in order to get an idea of the tests' performances sooner than with a large 
number.

I've included a redacted output after the do-file.  You can see that the test 
size for the two tests when the delta (del) is set to zero, and their relative 
power when del is 5 or 10.  I've trapped cases from going on to -xtprobit- 
where the observed proportion is zero or one for the test group, since maximum 
likelihood methods don't converge reliably in such cases; this ended up 
trapping all of the cases when the delta was 15 (expected proportions of zero 
or one).

It is often stated that maximum likelihood methods are too far from asymptotic 
conditions for use with small sample sizes, so -xtprobit- would seem a dicey 
proposition with only 30 pairs of observations, but the test size--except in 
the extreme-correlation case--is about 10%, which is what I set the level of 
Type 1 error rate at here.

Joseph Coveney

--------------------------begin crude illustrative do-file----------------------

program define corbinsim, rclass
    syntax , del(integer) [lrt cor(real 0.5) sig(real 0.05)]
    local tes = invnorm( (15 + (2 * (uniform() > 0.5) - 1) ///
      * `del') / 30 )
    drawnorm pi0 pi1, means(0 `tes') sd(1 1) ///
      corr(1 `cor' \ `cor' 1) n(30) clear
    foreach var of varlist pi0 pi1 {
        replace `var' = `var' > 0
    }
    signtest pi0 = pi1
    return scalar st_pos  = r(p_2) < `sig'
    if "`lrt'" == "lrt" {
        summarize pi1, meanonly
        if r(mean) > 0 & r(mean) < 1 {
            generate byte pid = _n
            reshape long pi, i(pid) j(trt)
            xtprobit pi trt, i(pid)
            return scalar rho = e(rho)
            estimates store A
            xtprobit pi, i(pid)
            lrtest A
            return scalar lr_pos = r(p) < `sig'
        }
        else {
            return scalar lr_pos = .
        }
    }
end
*
set more off
set seed 20030826
* power for Type 1 error rate of 10%
*
* 70% tetrachoric correlation coefficient
*
local lrt "lrt"
forvalues dif = 0(5)15 {
    simulate "corbinsim, del(`dif') `lrt' cor(0.7) sig(0.10)" ///
      st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
    display
    display as result `dif'
    summarize st_pos lr_pos rho
}
*
* 50% correlation coefficient
*
forvalues dif = 0(5)15 {
    simulate "corbinsim, del(`dif') `lrt' cor(0.5) sig(0.10)" ///
      st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
    display
    display as result `dif'
    summarize st_pos lr_pos rho
}
*
* 90% correlation coefficient
*
forvalues dif = 0(5)15 {
    simulate "corbinsim, del(`dif') `lrt' cor(0.9) sig(0.10)" ///
      st_pos = r(st_pos) lr_pos = r(lr_pos) rho = r(rho), reps(400)
    display
    display as result `dif'
    summarize st_pos lr_pos rho
}
exit

------------------------end crude illustrative do-file-------------------------

--------------------begin edited output from Results screen---------------------


. * power for Type 1 error rate of 10%
. *
. * 70% tetrachoric correlation coefficient

command:      corbinsim , del(0) lrt cor(0.7) sig(0.10)
0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       398    .0351759     .184456          0          1
      lr_pos |       398     .120603    .3260753          0          1
         rho |       398    .6789901     .196932   8.32e-07    .995425

command:      corbinsim , del(5) lrt cor(0.7) sig(0.10)
5

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       400         .39    .4883608          0          1
      lr_pos |       400       .5975    .4910158          0          1
         rho |       400    .6923904    .2084408   8.32e-07   .9955038

command:      corbinsim , del(10) lrt cor(0.7) sig(0.10)
10

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       398    .9371859    .2429336          0          1
      lr_pos |       397    .9823678    .1317767          0          1
         rho |       397    .8071645     .265319   8.32e-07    .995355

command:      corbinsim , del(15) lrt cor(0.7) sig(0.10)
15

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       400           1           0          1          1
      lr_pos |         0
         rho |         0

. * 50% coefficient

command:      corbinsim , del(0) lrt cor(0.5) sig(0.10)
0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       399    .0325815    .1777614          0          1
      lr_pos |       399    .1027569    .3040223          0          1
         rho |       399    .4854253    .2270576   8.32e-07   .9953463

command:      corbinsim , del(5) lrt cor(0.5) sig(0.10)
5

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       394    .3527919    .4784462          0          1
      lr_pos |       394    .5050761    .5006099          0          1
         rho |       394    .5023774    .2358462   8.32e-07   .9937819

command:      corbinsim , del(10) lrt cor(0.5) sig(0.10)
10

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       391    .9002558    .3000426          0          1
      lr_pos |       391    .9667519    .1795134          0          1
         rho |       391    .6246353    .3290946   8.32e-07   .9937818

command:      corbinsim , del(15) lrt cor(0.5) sig(0.10)
15

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       400           1           0          1          1
      lr_pos |         0
         rho |         0

. * 90% coefficient

command:      corbinsim , del(0) lrt cor(0.9) sig(0.10)
0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       390    .0282051    .1657711          0          1
      lr_pos |       390    .1794872    .3842527          0          1
         rho |       390    .8963201     .097665   .3548582   .9972232

command:      corbinsim , del(5) lrt cor(0.9) sig(0.10)
5

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       396    .5075758     .500575          0          1
      lr_pos |       396    .8257576    .3797977          0          1
         rho |       396    .9252776    .1090395   .4977363   .9956096

command:      corbinsim , del(10) lrt cor(0.9) sig(0.10)
10

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       400        .985    .1217047          0          1
      lr_pos |       399    .9974937    .0500626          0          1
         rho |       399     .971989    .0688614   .3453863   .9954332

command:      corbinsim , del(15) lrt cor(0.9) sig(0.10)
15

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      st_pos |       400           1           0          1          1
      lr_pos |         0
         rho |         0

----------------------------end edited output-----------------------------------



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index