# Re: st: Sample size for equivalence trials in Stata

From: Joseph Coveney
To: Statalist
Subject: Re: st: Sample size for equivalence trials in Stata
Date: Sun, 02 Oct 2005 05:14:21 +0900

```Francesco wrote:

. . . I have a continuous outcome (mean number of events).

```
```Regardless of whether you're working with proportions or another data type,
simulation would be an option.
```
```This is very interesting, but I don't know how to do it... could you help me
with an example?

--------------------------------------------------------------------------------

My description of the algorithm was a mess.  A follow-up post never cleared
the listserver, so I've re-posted it below.  To answer your question, you
usually wouldn't need to resort to simulation if you have a straightforward
experiment with continuous (normal) data.  But an example where simulation
might come in handy with a linear model is shown below.  The fictional
scenario is an anticipated repeated-measurements (longitudinal)
bioequivalence study with a 2:1 randomization, a baseline measurement and
two posttreatment measurements, the second of which is subject to dropout,
which is assumed to occur completely at random.  Random-effects regression
is chosen for analysis, and we'd like to explore sample-size requirements
under various assumptions as to the dropout rate, one of which (15% dropout)
is shown.  The set-up would also allow exploration of various other
assumptions, for example, the magnitude of the correlations.  I haven't shown
the necessary follow-on simulation to estimate the
false equivalence-declaration rate.  For this, you would introduce a "true"
difference, e.g., the delta (boundary value of the null/alternative
hypotheses), and then determine how often equivalence is declared.

Sample size estimation by simulation is iterative; when you have no idea
what sample size is needed, start with a modest number of replications
at each of a broad range of candidate sample sizes.  After refining the
range of candidate sample sizes, the number of replications can be increased
as desired to improve precision.  It's usually not worthwhile fretting too
much over precision, given the influence on accuracy of the crudity of
assumptions that is typical.

Joseph Coveney

Lost posting:

Generate a thousand or so datasets under the alternative hypothesis
(identical proportions for the two groups), and determine the proportion in
which the 100(1 - 2*alpha)% confidence interval of the risk difference (or
risk ratio) is contained within -delta to +delta (1/delta to delta for the
ratio).  That proportion is the estimated power.  Iterate with a larger or
smaller candidate sample size until converging upon the desired power.

Example:

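capture program drop mclinsim
program define mclinsim, rclass
    syntax , [corr(real 0) alpha(real 0.05)] ///
        delta(real) n1(integer) n2(integer) ///
        DROPout(real) clear
    tempvar baseline response group id time
    * one-sided critical value: the interval below has level 1 - 2 alpha
    local difference = invnorm(1 - `alpha')
    * baseline and two posttreatment measurements, equally correlated
    drawnorm `baseline' `response'1 `response'2, ///
        corr(1 `corr' `corr' \ `corr' 1 `corr' \ ///
        `corr' `corr' 1) n(`= `n1' + `n2'') clear
    * dropout completely at random at the second posttreatment measurement
    quietly replace `response'2 = . if uniform() < `dropout'
    generate byte `group' = _n > `n1'
    generate int `id' = _n
    quietly reshape long `response', i(`id') j(`time')
    * random-effects regression (the default estimator of xtreg)
    quietly xtreg `response' `group' `baseline' `time', i(`id')
    quietly lincom `group'
    * half-width of the confidence interval for the group difference
    local difference = `difference' * r(se)
    * equivalence is declared when the interval lies within -delta to +delta
    return scalar is_equivalent = ///
        ( (r(estimate) - `difference' > -`delta') & ///
        (r(estimate) + `difference' < `delta') )
end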
*
* 2:1 randomization; 50% correlation coefficient; 15% dropout rate;
* 0.5 SE delta
forvalues sample_size = 10(10)40 {
    simulate is_equivalent = r(is_equivalent), reps(100)  nodots ///
        seed(`=date("2005-10-03", "ymd")'): mclinsim , corr(0.5) ///
        delta(0.5) n1(`sample_size') n2(`= 2* `sample_size'') drop(0.15) clear
    summarize is_equivalent, meanonly
    display in smcl as text "n1: " as result `sample_size'
    display as text "n2: " as result 2 * `sample_size'
    display as text "Power: " as result %4.2f r(mean)
    display
}
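*
* A possible refinement pass, sketched under assumptions: the grid 30(2)40,
* reps(1000) and the integer seed are placeholders, to be chosen after
* looking at the coarse pass above.  The Monte Carlo standard error of an
* estimated power p from R replications is sqrt(p * (1 - p) / R), so more
* replications buy precision slowly, e.g., at p = 0.8:
display as text "MC SE,  100 reps: " as result %5.3f sqrt(0.8 * 0.2 / 100)
display as text "MC SE, 1000 reps: " as result %5.3f sqrt(0.8 * 0.2 / 1000)
forvalues sample_size = 30(2)40 {
    simulate is_equivalent = r(is_equivalent), reps(1000) nodots ///
        seed(20051003): mclinsim , corr(0.5) delta(0.5) ///
        n1(`sample_size') n2(`= 2 * `sample_size'') drop(0.15) clear
    summarize is_equivalent, meanonly
    display as text "n1: " as result `sample_size'
    display as text "n2: " as result 2 * `sample_size'
    display as text "Power: " as result %4.2f r(mean)
    display
}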
exit

```
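The posting mentions, but does not show, the follow-on simulation for the false equivalence-declaration rate. One way it might look is sketched below, under stated assumptions. The program name `mclinsim_shift` and its `shift()` option are hypothetical and not part of the original post: `shift()` adds a "true" difference to both posttreatment responses of the second group, and setting it equal to delta places the simulation on the boundary of the null hypothesis. The sketch uses `invnormal()` and `runiform()`, the current names of `invnorm()` and `uniform()`, builds the correlation matrix with `tempname`, and uses an arbitrary integer seed and `reps(1000)`.

```stata
capture program drop mclinsim_shift
program define mclinsim_shift, rclass
    // shift() is a hypothetical option: the true group difference
    syntax , delta(real) n1(integer) n2(integer) DROPout(real) clear ///
        [corr(real 0) alpha(real 0.05) shift(real 0)]
    tempvar baseline response group id time
    tempname C
    // equal correlation among baseline and the two posttreatment measurements
    matrix `C' = (1, `corr', `corr' \ `corr', 1, `corr' \ `corr', `corr', 1)
    drawnorm `baseline' `response'1 `response'2, ///
        corr(`C') n(`= `n1' + `n2'') clear
    generate byte `group' = _n > `n1'
    // introduce the true difference in the second group
    quietly replace `response'1 = `response'1 + `shift' * `group'
    quietly replace `response'2 = `response'2 + `shift' * `group'
    // dropout completely at random at the second posttreatment measurement
    quietly replace `response'2 = . if runiform() < `dropout'
    generate int `id' = _n
    quietly reshape long `response', i(`id') j(`time')
    quietly xtreg `response' `group' `baseline' `time', i(`id') re
    quietly lincom `group'
    // half-width of the 1 - 2 alpha confidence interval
    local halfwidth = invnormal(1 - `alpha') * r(se)
    return scalar is_equivalent = ///
        (r(estimate) - `halfwidth' > -`delta') & ///
        (r(estimate) + `halfwidth' <  `delta')
end

// true difference set equal to delta, i.e., on the boundary of the null
forvalues sample_size = 10(10)40 {
    simulate is_equivalent = r(is_equivalent), reps(1000) nodots ///
        seed(20051003): mclinsim_shift , corr(0.5) delta(0.5) shift(0.5) ///
        n1(`sample_size') n2(`= 2 * `sample_size'') dropout(0.15) clear
    summarize is_equivalent, meanonly
    display as text "n1: " as result `sample_size'
    display as text "n2: " as result 2 * `sample_size'
    display as text "False-equivalence rate: " as result %5.3f r(mean)
    display
}
```

If the procedure holds its nominal size, the reported rate should sit near or below alpha at each sample size; a rate well above alpha would suggest that the normal-approximation interval is too narrow for that design. Running the same loop with `shift(0)` should reproduce the power calculation, which gives a quick check that the sketch behaves like the original program.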