
Re: st: Sample size for equivalence trials in Stata


From   Joseph Coveney <[email protected]>
To   Statalist <[email protected]>
Subject   Re: st: Sample size for equivalence trials in Stata
Date   Sun, 02 Oct 2005 05:14:21 +0900

Francesco wrote:

. . . I have a continuous outcome (mean number of events).

Regardless of whether you're working with proportions or another data type,
simulation would be an option.

This is very interesting, but I don't know how to do it... could you help me
with an example?

--------------------------------------------------------------------------------

My description of the algorithm was a mess.  A follow-up post never cleared
the listserver, so I've re-posted it below.  To answer your question, you
usually wouldn't need to resort to simulation if you have a straightforward
experiment with continuous (normal) data.  But an example where simulation
might come in handy with a linear model is shown below.  The fictional
scenario is an anticipated repeated-measurements (longitudinal)
bioequivalence study with a 2:1 randomization, a baseline measurement and
two posttreatment measurements, the second of which is subject to dropout,
which is assumed to occur completely at random.  Random-effects regression
is chosen for analysis, and we'd like to explore sample-size requirements
under various assumptions as to the dropout rate, one of which (15% dropout)
is shown.  The set-up would also allow exploration of other assumptions, for
example, the magnitude of the correlations.  I haven't shown the necessary
follow-on simulation to estimate the false equivalence-declaration rate.  For
this, you would introduce a "true" difference, e.g., delta (the boundary
between the null and alternative hypotheses), and then determine how often
equivalence is declared.
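
For the straightforward case mentioned above (two parallel groups, normal
data), a closed-form normal approximation is usually enough.  The lines below
are only a minimal sketch of one commonly used approximation.  They assume 1:1
allocation, a true difference of zero, a known common SD and two one-sided
tests each at level alpha.  The power, SD and delta values are illustrative
assumptions, not taken from the question.  invnormal() is the current name of
invnorm().

```stata
* Approximate n per group for an equivalence (two one-sided tests) design:
*   n = 2 * sd^2 * (z[1 - alpha] + z[1 - beta/2])^2 / delta^2
* All four values below are illustrative assumptions.
local alpha = 0.05
local power = 0.80
local sd    = 1
local delta = 0.5
local n = ceil(2 * (`sd' * (invnormal(1 - `alpha') + ///
    invnormal(1 - (1 - `power') / 2)) / `delta')^2)
display as text "Approximate n per group: " as result `n'
```

Once the design adds unequal allocation, repeated measurements or dropout, as
in the scenario above, this kind of formula no longer applies directly, and
that is where simulation earns its keep.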

Sample size estimation by simulation is iterative; when you have no idea
where to begin, you would typically start with relatively few replications
at each of a broad range of candidate sample sizes.  After narrowing the
range of candidate sample sizes, the number of replications can be increased
as desired to improve precision.  It's usually not worthwhile fretting too
much over precision, since accuracy is limited by the crudeness of the
assumptions that typically go into the simulation.

Joseph Coveney

Lost posting:

Let's try that algorithm again:  Start with a candidate sample size,
generate a thousand or so datasets under the alternative hypothesis
(identical proportions for the two groups), and determine the proportion of
datasets in which the 100(1 - 2*alpha)% confidence interval of the risk
difference (or ratio) is contained within -delta to +delta (or 1/delta to
delta).  Iterate with a larger or smaller candidate sample size until
converging upon the desired power.
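
The steps just described could be sketched for the risk difference roughly as
follows.  This is only an illustration under stated assumptions: the program
name propsim and its options are hypothetical, the common event probability
(0.3) and delta (0.1) are made-up values, a simple Wald interval is used for
the risk difference, and runiform() and invnormal() assume a reasonably recent
Stata.

```stata
capture program drop propsim
program define propsim, rclass
    * prob() is the common event probability under the alternative
    syntax , prob(real) delta(real) n1(integer) n2(integer) [alpha(real 0.05)]
    drop _all
    quietly set obs `= `n1' + `n2''
    tempvar group event
    generate byte `group' = _n > `n1'
    generate byte `event' = runiform() < `prob'
    * observed proportion in each group
    quietly summarize `event' if `group' == 0, meanonly
    local p1 = r(mean)
    quietly summarize `event' if `group' == 1, meanonly
    local p2 = r(mean)
    * half-width of the Wald 100(1 - 2*alpha)% CI of the risk difference
    local se = sqrt(`p1' * (1 - `p1') / `n1' + `p2' * (1 - `p2') / `n2')
    local half = invnormal(1 - `alpha') * `se'
    local rd = `p2' - `p1'
    return scalar is_equivalent = ///
        (`rd' - `half' > -`delta') & (`rd' + `half' < `delta')
end

* coarse first pass over a broad range of candidate sample sizes
forvalues n = 100(100)400 {
    quietly simulate is_equivalent = r(is_equivalent), reps(200) nodots ///
        seed(12345): propsim, prob(0.3) delta(0.1) n1(`n') n2(`n')
    summarize is_equivalent, meanonly
    display as text "n per group: " as result `n' ///
        as text "  power: " as result %4.2f r(mean)
}
```

After the coarse pass, the forvalues range can be narrowed around the sample
sizes that bracket the desired power and reps() increased.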

Example:

capture program drop mclinsim
program define mclinsim, rclass
   syntax , [corr(real 0) alpha(real 0.05)] ///
     delta(real) n1(integer) n2(integer) ///
     DROPout(real) clear
   tempvar baseline response group id time
   * one-sided critical value, i.e., a 100(1 - 2*alpha)% two-sided CI
   local difference = invnorm(1 - `alpha')
   drawnorm `baseline' `response'1 `response'2, ///
     corr(1 `corr' `corr' \ `corr' 1 `corr' \ ///
     `corr' `corr' 1) n(`= `n1' + `n2'') clear
   quietly replace `response'2 = . if uniform() < `dropout'
   generate byte `group' = _n > `n1'
   generate int `id' = _n
   quietly reshape long `response', i(`id') j(`time')
   quietly xtreg `response' `group' `baseline' `time', i(`id')
   quietly lincom `group'
   local difference = `difference' * r(se)
   return scalar is_equivalent = ///
     ( (r(estimate) - `difference' > -`delta') & ///
      (r(estimate) + `difference' < `delta') )
end
*
* 2:1 randomization (n2 = 2 * n1); correlation coefficient of 0.5;
* 15% dropout rate; delta of 0.5 (simulated outcomes have unit SD)
forvalues sample_size = 10(10)40 {
   simulate is_equivalent = r(is_equivalent), reps(100)  nodots ///
     seed(`=date("2005-10-03", "ymd")'): mclinsim , corr(0.5) ///
     delta(0.5) n1(`sample_size') n2(`= 2* `sample_size'') drop(0.15) clear
   summarize is_equivalent, meanonly
   display in smcl as text "n1: " as result `sample_size'
   display as text "n2: " as result 2 * `sample_size'
   display as text "Power: " as result %4.2f r(mean)
   display
}
exit
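
The follow-on simulation for the false equivalence-declaration rate mentioned
earlier might look something like the sketch below.  It is not part of the
original example: the program name mclinsim_null and the truediff() option are
hypothetical, the correlation matrix is passed to drawnorm by name, and
runiform() and invnormal() are the current names of uniform() and invnorm().
Setting truediff() equal to delta puts the true difference on the boundary of
the null hypothesis.

```stata
capture program drop mclinsim_null
program define mclinsim_null, rclass
    syntax , delta(real) n1(integer) n2(integer) DROPout(real) ///
        [corr(real 0) alpha(real 0.05) TRUEdiff(real 0)]
    tempvar baseline response group id time
    tempname C
    matrix `C' = (1, `corr', `corr' \ `corr', 1, `corr' \ `corr', `corr', 1)
    drawnorm `baseline' `response'1 `response'2, ///
        corr(`C') n(`= `n1' + `n2'') clear
    generate byte `group' = _n > `n1'
    * shift both posttreatment means of the second group by the true difference
    quietly replace `response'1 = `response'1 + `truediff' if `group'
    quietly replace `response'2 = `response'2 + `truediff' if `group'
    * dropout completely at random at the second posttreatment measurement
    quietly replace `response'2 = . if runiform() < `dropout'
    generate int `id' = _n
    quietly reshape long `response', i(`id') j(`time')
    quietly xtreg `response' `group' `baseline' `time', i(`id')
    quietly lincom `group'
    local halfwidth = invnormal(1 - `alpha') * r(se)
    return scalar is_equivalent = ///
        (r(estimate) - `halfwidth' > -`delta') & ///
        (r(estimate) + `halfwidth' < `delta')
end

* true difference set equal to delta, one candidate sample size
simulate is_equivalent = r(is_equivalent), reps(100) nodots seed(20051003): ///
    mclinsim_null, corr(0.5) delta(0.5) truediff(0.5) ///
    n1(20) n2(40) dropout(0.15)
summarize is_equivalent, meanonly
display as text "False equivalence-declaration rate: " ///
    as result %4.2f r(mean)
```

If the procedure is behaving, the rate at the boundary should come out in the
neighborhood of alpha; with only 100 replications the estimate is rough, so
reps() would need to be increased before reading much into it.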
