Re: st: Comparing change in rates - frustrating problem, please help


From   Ricardo Ovaldia <ovaldia@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Comparing change in rates - frustrating problem, please help
Date   Thu, 5 Feb 2004 16:07:33 -0800 (PST)

Thank you Joseph and Kieran. Obviously this was not
the easy question I thought it was. I have spent
several days contemplating the answers and playing
around with my data. Although I find Kieran's
conditional logistic approach appealing, I understand
and agree with Joseph's concerns and objections. Faced
with the need to analyze these data and the eventual
submission for publication, I fear that reviewers may
disagree with whichever method I select. The issue
becomes more complicated when one considers the effect
of additional covariates such as sex on the
intervention. 

Regardless of all this, I appreciate tremendously
Joseph's and Kieran's comments and the time they spent
thinking about this problem.

Ricardo.


--- Joseph Coveney <jcoveney@bigplanet.com> wrote:
> 
> Kieran McCaul posted results from a randomized
> parallel-group design study to 
> illustrate the use of conditional logistic
> regression.  The study randomized 
> households to an intervention designed to promote
> banning of smoking in the 
> home.  Policy in the home was measured before and
> after intervention.  Kieran 
> invited Ricardo and me to respond with what we think
> of advocating conditional 
> logistic regression to assess the efficacy of the
> intervention for before-and-
> after studies based upon the results posted for that
> study.
> 
> I don't claim to speak for Ricardo, but his original
> question related to 
> imbalances in the baseline rates of the outcome
> between the two parallel 
> intervention groups.  It appears that Kieran's study
> was successful in its 
> randomization (or used stratified randomization and
> didn't lose too many 
> households to dropout), because the proportions of
> households banning smoking 
> at baseline were nearly identical between the
> intervention groups.  With 
> essentially identical rates at baseline, there would
> be little or no cause for 
> concern about confounding due to it and little
> statistical difference in 
> including baseline as a covariate.  And, in fact,
> both the conditional logistic 
> regression approach and the so-called ANCOVA-like
> multiple logistic regression 
> approach give essentially similar results in this
> balanced study.  (I think the 
> same would have obtained for Ricardo's study had the
> baseline rates of seatbelt 
> use been similar between the two intervention
> groups.)
> 
> But, let's look at the issue of which approach is
> more suitable when the 
> concern is, as it was for Ricardo, to analyze an
> intervention effect _in the 
> face of an imbalance in the baseline rates of an
> outcome_.
> 
> If Kieran will indulge me one more time to use a
> fictional dataset to 
> illustrate a point, let's say that Kieran's
> randomization method did not 
> stratify on baseline household smoking policy, and
> suffered an unfortunate 
> imbalance due to chance, for instance a 50 : 50
> ratio of households banning 
> smoking at baseline in the nonintervention group,
> but a 75 : 25 ratio in the 
> intervention group.  Let's say that 2 of the 50
> households that previously 
> banned smoking in the nonintervention group now
> permit it, a worsening of 4% 
> (if your health policy is to ban smoking), and that
> only 1 of the 50 households 
> that didn't ban smoking now do so in the
> nonintervention group, a meager 
> improvement of 2%.  Let's say that 4 of the 75
> households that banned smoking 
> at baseline switched and permitted smoking in the
> home after the intervention, 
> and 2 of the 25 households that didn't ban smoking
> switched as a result of the 
> intervention.  The results of the intervention are a
> slightly greater 5.3% 
> worsening (compare to 4%) in the formerly banning
> household population, but a 
> much greater 8% (compare to 2%) improvement among
> the formerly permissive 
> households.  
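
The four switching rates quoted above can be checked directly from the counts in the fictional example. This is only a sketch of the arithmetic, not part of Joseph's do-file:

```stata
* Nonintervention group: 50 banning, 50 permitting at baseline
* banners who now permit
display 2 / 50
* permitters who now ban
display 1 / 50

* Intervention group: 75 banning, 25 permitting at baseline
* banners who now permit
display 4 / 75
* permitters who now ban
display 2 / 25
```

These work out to 4%, 2%, about 5.3%, and 8%, in that order.
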
> 
> Now, the effects of intervention are no great
> shakes, but I think that it would 
> be safe to say that it's not *nothing*, especially
> if you somehow take into 
> account the possible confounding effect of the
> chance unfortunate imbalance in 
> baseline policy between treatment groups.
> 
> But, by the conditional logistic regression
> approach, it *is* nothing--the odds 
> ratio for both nonintervention and intervention
> groups is 0.5 (McNemar's test 
> uses only the off-diagonal values and ignores the
> diagonal values) so the ratio 
> of the two odds ratios is 1.0, and this is what the
> conditional logistic 
> regression dutifully reports:  the period term is
> 0.5 and the interaction 
> term's odds ratio is 1.0 with a Z-statistic of 0.00
> and a p-value of 1.00.  
> Granted, the confidence interval encompasses a lot,
> but the point estimate and 
> hypothesis test for the interaction term (which is
> ostensibly the effect of 
> intervention) just don't give the same take-home
> message as inspection of the 
> data.  So, my conclusion differs from Kieran's on
> this; I don't think that 
> conditional logistic regression is valid to test for
> differences between 
> treatment effects (differences between treatment
> differences, which are between-
> subject effects) in parallel-group designs with a
> repeated binary outcome 
> measure, especially in the presence of baseline
> differences in the outcome 
> measure, which are ignored in the conditional
> logistic model.
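
The matched-pairs odds ratios described above can be sketched with the immediate command -mcci-, without building the dataset. The cell counts below are derived here from the fictional example (for instance 71 = 75 - 4) and are not taken from Joseph's do-file. The layout assumes the after-intervention policy plays the role of the "case" and the baseline policy the "control", as in his -mcc ban1 ban0- call:

```stata
* mcci #a #b #c #d
*   a = ban after and ban before
*   b = ban after, permit before
*   c = permit after, ban before
*   d = permit after and permit before

* Intervention group: 75 banning, 25 permitting at baseline
mcci 71 2 4 23

* Nonintervention group: 50 banning, 50 permitting at baseline
mcci 48 1 2 49
```

The matched odds ratio uses only the discordant cells, b / c, which is 2/4 in the first table and 1/2 in the second, so both are 0.5 and their ratio is 1.0 however different the baseline split is.
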
> 
> In contrast, the ANCOVA-like, baseline-as-covariate
> multiple regression 
> approach does provide a separate, and I think
> competent, handling of baseline 
> differences and their potential for confounding.  In
> the fictitious example, 
> this approach shows the pronounced effect of
> baseline smoking policy as 
> expected, and it shows that the odds ratio for
> intervention isn't 1.0 given 
> baseline differences between intervention groups. 
> The saturated model (with 
> the interaction term) also helps to put the
> potential for confounding into 
> perspective.  (The do-file for all of this is below
> for anyone interested.)
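
A minimal sketch of what the baseline-as-covariate approach might look like on the fictional counts follows. It is not Joseph's do-file, which is cut off below. The variable names ban0, ban1, and intervention follow his; the frequency variable n and the use of factor-variable notation (which needs Stata 11 or later) are assumptions made here for illustration:

```stata
clear
* One row per cell of the fictional example; n is the household count
input intervention ban0 ban1 n
0 1 1 48
0 1 0  2
0 0 1  1
0 0 0 49
1 1 1 71
1 1 0  4
1 0 1  2
1 0 0 23
end

* ANCOVA-like model: follow-up policy on intervention, adjusting for baseline
logit ban1 i.intervention i.ban0 [fweight = n], or

* Saturated model: adds the intervention-by-baseline interaction
logit ban1 i.intervention##i.ban0 [fweight = n], or
```

The first model reports an odds ratio for intervention conditional on baseline policy, and the second lets that odds ratio differ between households that banned and households that permitted smoking at baseline.
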
> 
> It seems that at least some of the discrepancy
> between the two approaches 
> reflects Simpson's paradox.  This is the same
> underlying phenomenon that 
> results in bias in logistic regression coefficients
> (and in nonlinear 
> regression, in general) when important covariates
> are left out of the model.  
> This is what Frank E. Harrell Jr.'s lecture dealt
> with in the URL given in my 
> last posting.  And it relates to the
> "noncollapsibility of odds ratios" that 
> epidemiologists sometimes refer to.
> 
> In fairness to us all (Kieran, Ricardo and me), it
> seems that the matter of 
> which approach is better isn't completely settled
> even for *linear* models, 
> where this noncollapsibility-of-odds-ratios
> phenomenon and the incidental 
> parameters problem don't apply:  there is a thread
> ("Repeated measures and 
> including time zero response as baseline covariate")
> on sci.stat.consult that 
> was started on May 7 of last year by Frank Harrell. 
> Professor Harrell wrote a 
> well received book on regression modeling and is now
> chairman of a department 
> of biostatistics, yet even he asks, "Has anyone come
> across some practical 
> guidance for when to include the first measured
> response (at time zero) as a 
> baseline covariate as opposed to the first repeated
> measurement in a 
> longitudinal data analysis?"
> 
> Joseph Coveney
> 
> -------------------------------------------------------------------------------
> 
> clear
> tempfile tmp
> set obs 100
> generate byte ban0 = _n > _N / 4
> generate byte ban1 = ban0
> replace ban1 = !ban1 in 50/53
> replace ban1 = !ban1 in 1/2
> *
> * Intervention group
> *
> display 4 / 75  // switching by banners
> display 2 / 25  // switching by permitters
> mcc ban1 ban0
> generate byte intervention = 1
> save `tmp'
> clear
> set obs 100
> generate byte ban0 = _n > _N / 2
> generate byte ban1 = ban0
> replace ban1 = !ban1 in 50/52
> *
> * Nonintervention group
> *
> 
=== message truncated ===


=====
Ricardo Ovaldia, MS
Statistician 
Oklahoma City, OK
