The Stata listserver

Re: st: Comparing change in rates - frustrating problem, please help


From   Joseph Coveney <jcoveney@bigplanet.com>
To   Statalist <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Comparing change in rates - frustrating problem, please help
Date   Sat, 31 Jan 2004 13:32:21 +0900

Ricardo Ovaldia wrote:

>Thank you Joseph and Kieran. 
>I originally thought to model this problem as Joseph's
>"ANCOVA-like approach" but without the interaction
>term (i.e.):
>
>xi: logistic followup i.baseline i.intervention
>
>If I do these, isn't the test: Beta(intervention)=0
>testing whether the intervention had an effect? I am
>not certain what the interaction term adds in this
>context? Please excuse me if this is a stupid
>question, but I do not get it. What am I missing? 

Well, here's my take on it:  the interaction term tests the analogue of what it
would in a linear model--whether the intervention effect depends upon the level
of baseline.  In one sense, it seems difficult to fathom in a pre-post design:
if a person doesn't wear a seatbelt prior to intervention, then the odds that
that person wears a seatbelt are zero and the intervention odds ratio for a
group of like-behaving people would be infinite, regardless of whether they
wore seatbelts after intervention.  In another sense, however, the interaction
term measures how justified you are in collapsing a 2 X 2 X 2 table (baseline X
intervention X outcome) into a 2 X 2 table (intervention X outcome).  In this
latter sense, it would test whether the ratio of the odds that a person wears a
seatbelt after experimental intervention to the odds that a person wears a
seatbelt after control intervention needs to take into account the odds that
the person wears a seatbelt before intervention.  The analogy with the linear
model is seen better with -logit- and -lincom- as shown below.

clear
set obs 400
set seed 20040131
generate byte baseline = _n > _N / 2
generate byte treatment = mod(_n, 2)
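* Probability of a positive result is 1/4 in every cell except
* baseline == 1 & treatment == 0, where it is 3/4
generate byte result = uniform() < 1 / 4
replace result = uniform() < 3/4 if baseline == 1 & treatment == 0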
table baseline treatment, contents(mean result)
generate byte iac = baseline * treatment
logistic result baseline treatment iac, nolog
* Creating the analogue of the "cell means model" of ANOVA
egen byte group = group(baseline treatment)
xi: logit result i.group, nolog
* The following linearized contrast is 
* the interaction term (iac) in -logistic- above.
lincom _Igroup_4 - _Igroup_3 - _Igroup_2
lincom _Igroup_4 - _Igroup_3 - _Igroup_2, or
exit
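
The "collapsing" interpretation can also be sketched directly from the factorial model; this is an illustrative addition, not part of the do-file above.  With the 0/1 coding used there, the log odds ratio for treatment is _b[treatment] when baseline == 0 and _b[treatment] + _b[iac] when baseline == 1, so exp(_b[iac]) is the ratio of the two stratum-specific odds ratios.  The dataset is regenerated with the same seed and commands so that the snippet runs on its own.

```stata
* Regenerate the illustrative dataset (same commands and seed as above)
clear
set obs 400
set seed 20040131
generate byte baseline = _n > _N / 2
generate byte treatment = mod(_n, 2)
generate byte result = uniform() < 1 / 4
replace result = uniform() < 3/4 if baseline == 1 & treatment == 0
generate byte iac = baseline * treatment
logit result baseline treatment iac, nolog
* Treatment odds ratio among those with baseline == 0
lincom treatment, or
* Treatment odds ratio among those with baseline == 1
lincom treatment + iac, or
* Wald test that the two stratum-specific odds ratios are equal
test iac
```

If the two odds ratios reported by -lincom- differ markedly, a single treatment odds ratio pooled over baseline would not describe either stratum well.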


I'll take a crack at answering my own questions:

>1. Which is better for binary outcomes, Kieran's repeated-measures approach
>or an ANCOVA-like approach using the pretreatment values as a baseline
>covariate in conventional logistic regression?  The do-file below suggests
>that completely different conclusions would be drawn from the same dataset
>depending upon which approach is used to analyze it.

It looks like the ANCOVA-like approach is the one to use, from the results of a 
Monte Carlo simulation under the null hypothesis.  The false-positive rejection 
rate for the repeated-measures approach is orders of magnitude too high.  (See 
do-file below.)

>2. As Kieran mentioned, the repeated-measures approach drops one of the
>"main effects" (treatment) so that the model ends up having an interaction
>term in it when one of the component "main effects" terms contributing to
>the interaction is not in the model.  This would be a no-no from what I've
>heard, at least for the analogous situation in ANOVA.  But, I assume that
>this is *not* a problem for conditional logistic regression due to the
>conditioning.  Is that correct?

Apparently not.  (See answer for 1. above.)

>3. When using the likelihood-ratio test (-lrtest-), which is the proper
>model against which to compare for testing individual "main effects" of
>treatment and baseline--the saturated model (*with* the interaction) or the
>partially reduced model (*no* interaction term, i.e., the model that
>includes only both of the main effects)?  Or should we be testing a
>constant-only model against one with the "main effect" in order to test that
>"main effect"?

Well, they test different hypotheses:  one tests whether there is an effect of 
intervention at both levels of baseline, and the other tests whether there is 
an effect of intervention at *any* level of baseline.  (Mentioned by Frank E. 
Harrell, Jr., in the context of clinical studies in 
hesweb1.med.virginia.edu/biostat/presentations/feh/covadj.pdf .)  I knew this, 
but this answer doesn't really answer my question:  which hypothesis ought we 
to be testing as a default when we believe that a baseline covariate is 
sufficiently important to include in a model a priori (in the protocol or 
statistical analysis plan)?
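
For concreteness, here is a minimal sketch of the candidate likelihood-ratio comparisons on a single simulated dataset.  The names full, main and base for the stored estimates are hypothetical labels chosen for illustration, and the mapping of each comparison to a hypothesis in the comments is one reading of the distinction drawn above, not a settled answer to the question.

```stata
* Illustrative dataset: same generating commands as the earlier example
clear
set obs 400
set seed 20040131
generate byte baseline = _n > _N / 2
generate byte treatment = mod(_n, 2)
generate byte result = uniform() < 1 / 4
replace result = uniform() < 3/4 if baseline == 1 & treatment == 0
generate byte iac = baseline * treatment
* Saturated model: both main effects and their interaction
logit result baseline treatment iac, nolog
estimates store full
* Partially reduced model: main effects only
logit result baseline treatment, nolog
estimates store main
* Baseline-only model: no treatment terms at all
logit result baseline, nolog
estimates store base
* 2 df: any treatment effect, at either level of baseline
lrtest full base
* 1 df: a common treatment effect, assuming no interaction
lrtest main base
* 1 df: the interaction alone
lrtest full main
```

The first two comparisons correspond to the ancova_iac and ancova_me p-values returned by the simulation program below.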

Joseph Coveney

Monte Carlo simulation evaluating null-hypothesis behavior of ANCOVA-like and 
repeated-measures approaches to pre-post design with binary endpoint:

clear
set more off
set seed 20040130
*
program define twolog, rclass
    version 8.2
    tempvar a b iac pid dep per
    tempname A B C
    drop _all
    set obs 200
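    * a = treatment group, b = baseline response, dep = follow-up response.
    * dep is generated independently of a and b, so the null hypothesis holds.
    generate byte `a' = _n > _N / 2
    generate byte `b' = mod(_n, 2)
    generate byte `iac' = `a' * `b'
    generate byte `dep' = uniform() > 0.5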
    logistic `dep' `a' `b' `iac'
    estimates store `A'
    logistic `dep' `a' `b'
    estimates store `B'
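    * Reduced model: baseline response only
    logistic `dep' `b'
    * 2-df LR test of treatment (main effect plus interaction)
    lrtest `A' .
    return scalar ancova_iac = r(p)
    * 1-df LR test of the treatment main effect
    lrtest `B' .
    return scalar ancova_me = r(p)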
    drop `iac'
    estimates drop _all
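    * Reshape to long form: one row per person per period (0 = pre, 1 = post)
    generate int `pid' = _n
    rename `b' `dep'0
    rename `dep' `dep'1
    reshape long `dep', i(`pid') j(`per')
    generate byte `iac' = `a' * `per'
    * Treatment (a) is constant within person, so -clogit- drops it;
    * the treatment effect is carried by the treatment-by-period interaction
    clogit `dep' `a' `per' `iac', group(`pid')
    estimates store `C'
    clogit `dep' `per', group(`pid')
    lrtest `C' .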
    return scalar rpm = r(p)
end
*
simulate "twolog" ancova_me = r(ancova_me) rpm = r(rpm) ///
  ancova_iac = r(ancova_iac), reps(3000)
generate byte pancova_me = ancova_me < 0.05
generate byte prpm = rpm < 0.05
generate byte pancova_iac = ancova_iac < 0.05
summarize p*
exit
  
