
From: Joseph Coveney <jcoveney@bigplanet.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Comparing change in rates - frustrating problem: questionable results
Date: Wed, 11 Feb 2004 20:50:45 +0900

In the ongoing discussion of multiple logistic regression versus conditional logistic regression for analyzing data from a randomized, prospective study with a parallel-groups design that includes a baseline measurement of the binary outcome variable, Ricardo Ovaldia presented a dataset and regression results from both approaches.

>Continuing on a previous discussion, I applied both
>Joseph's and Kieran's method to a a large set of the
>seat belt intervention data and obtained some
>questionable results. Here is a summary table:
>
> [redacted]
>
>With Joseph's method the p-value for the interaction
>is 0.683, indicating no treatment effect.
>But with Kieran's method the p-value is 0.032
>indicating a significant treatment effect. Looking at
>the actual data I believe the results from the
>conditional logistic more than the "MANOVA" like
>approach, given that the baselines are similar.
>
>What am I missing?
>
>Thank you,
>Ricardo.

With ANCOVA-like multiple logistic regression, which treats the baseline as a predictor (covariate), the main effect of intervention (treatment) in Ricardo's dataset was associated with a Z-statistic of 2.42 (P < 0.05). The corresponding slope coefficient for the baseline covariate was associated with a Z-statistic of 4.96 (P < 0.05). As Ricardo noted, the interaction term was associated with a Z-statistic of -0.41 (P > 0.05).

My interpretation of this is that (i) intervention *does* result in a statistically significant difference (at the 5% level) in seatbelt usage from nonintervention, with an estimated odds ratio of 2 (95% confidence interval: 1-4), (ii) as expected, pretreatment seatbelt use strongly predicts posttreatment usage, with an odds ratio of 7 (3-14), and (iii) pretreatment seatbelt usage and intervention do not interact.
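As a quick check on the Z-statistics quoted above, the two-sided p-values they imply can be computed directly. This is a minimal sketch, not part of the original analysis: the scalar names are made up for illustration, and it relies on the fact that the square of a standard normal Z follows a chi-squared distribution with one degree of freedom.

```stata
* Two-sided p-values implied by the reported Z-statistics.
* chi2tail(1, Z^2) is the upper tail of chi-squared(1) at Z^2,
* which equals the two-sided normal p-value for Z.
scalar p_trt  = chi2tail(1, 2.42^2)
scalar p_base = chi2tail(1, 4.96^2)
scalar p_iac  = chi2tail(1, (-0.41)^2)

display "treatment main effect:  p = " %6.4f scalar(p_trt)
display "baseline covariate:     p = " %9.7f scalar(p_base)
display "interaction:            p = " %6.4f scalar(p_iac)
```

The interaction p-value comes out near 0.68, in line with the 0.683 that Ricardo reported (the small gap is presumably rounding of Z), while the treatment main effect comes out near 0.016.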
Note that the interpretation is straightforward and analogous to that for ANCOVA, in that statistical significance of the baseline-by-treatment interaction term is not needed to infer that treatment or intervention has an effect.

I don't know how to interpret the results from the corresponding conditional logistic regression printout, except to note, as Kieran McCaul showed earlier, that the odds ratios reflect those obtained in separate McNemar tests of the two treatment groups. From the brief numerical study described below, it is apparent that the p-value from the conditional logistic regression approach is affected by a lot of (irrelevant) things and does not reflect the statistical significance of the treatment factor. This approach is therefore unsuitable for this type of study design.

In the simulation, all scenarios have equal treatment group sizes, which is the expectation in a randomized parallel-group design with a 1:1 treatment assignment ratio. The first three settings (simulations) evaluate the "test size" (Type I error rate) of both approaches under a variety of conditions. The first setting has unbalanced baseline rates of response and equal rates of changeover due to treatment. The second setting balances the baseline rates of response and maintains the equal rates of changeover. The third setting maintains the balance in both baseline and changeover rates, and just increases the changeover rates. The fourth and fifth settings illustrate the relative power of the two approaches.

Setting 1 presents the case with a baseline imbalance: 1/3 positive response in the control treatment group and 2/3 in the experimental (intervention) treatment group. There is no treatment effect: 25% conversion in each treatment group at each level of baseline response. The Type I error rates for the ANCOVA-like multiple logistic regression are, as expected, in the 4-5% range.
The conditional logistic approach, by contrast, yields a whopping 64% false-positive error rate, an order of magnitude greater than the nominal level. (Detailed results are shown below; apologies for the length of the post.) This reflects the inability of the latter approach to tease out baseline rates of response from the treatment effects.

Setting 2 has balanced baseline rates of response (50% each) on the outcome variable, but keeps the same 25% conversion to nonbaseline values in each group, i.e., again no treatment effect. Note that this gives identical expected odds ratios for the two McNemar tests; both odds ratios have an expected value of one. The ANCOVA-like multiple regression approach again provides proper control over Type I error, with rates again in the 4-5% range. Here, the conditional logistic regression approach is overly conservative, with a Type I error rate less than half of the nominal rate.

Setting 3 is the same as Setting 2, but with a 50% conversion rate. The ANCOVA-like approach gives Type I error rates in the 5-6% range. In contrast, the conditional logistic approach yields a rate of less than one-half of one percent, an order of magnitude lower than the nominal level. The only difference from Setting 2 is the higher rate of across-the-board conversion, which induces a lower binomial correlation between pre- and posttreatment outcomes; thus, the p-value of the conditional logistic regression approach reflects the within-subject correlation.
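The link between conversion rate and pre/post correlation can be illustrated numerically. The sketch below is an illustration rather than part of the original do-file, and it rests on two stated assumptions: a balanced 50% baseline, and a flip probability q that is the same at both baseline levels. Under those assumptions the Pearson correlation between baseline and posttreatment outcome works out to 1 - 2q, i.e., about 0.5 for the 25% conversion of Setting 2 and about zero for the 50% conversion of Setting 3. It uses uniform() to match the do-file, and note that it clears any data in memory.

```stata
clear
set seed 20040211
set obs 100000

* Balanced baseline: half of the observations start positive
generate byte dep0 = _n > _N / 2

foreach pct in 25 50 {
    * Flip each outcome independently with probability pct/100
    generate byte dep1 = abs(dep0 - (uniform() < `pct' / 100))
    quietly correlate dep0 dep1
    * Save the correlation before any other command resets r()
    scalar rho`pct' = r(rho)
    display "flip rate `pct' percent: observed correlation = " %6.3f scalar(rho`pct')
    display "                    expected 1 - 2q      = " %6.3f (1 - 2 * `pct' / 100)
    drop dep1
}
```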
Setting 4 keeps the balanced baseline rates of the previous cases, but changes the rates of conversion differentially between treatment groups in order to provide a small treatment effect. In the control treatment group, 10% of baseline-negatives convert to positive posttreatment and 10% of baseline-positives switch to negative. In the experimental treatment group, 15% of baseline-negatives convert to positive and 5% of baseline-positives switch to negative. Although I don't know what the true-positive rate should be, the relative power of the two approaches can be assessed, since both should show the treatment (intervention) effect: a 0% net change in the control (nonintervention) treatment group and a net 10% conversion to positives in the experimental (intervention) treatment group. In this setting, both approaches display about the same relative power, about 11-12% rejections of the null hypothesis. The test of the joint hypothesis, which is only doable in the ANCOVA-like approach, shows slightly higher relative power, about a 15% rate of rejection, and probably reflects the differential switch rate between the two levels of baseline that occurs only at one level of the treatment factor. In any event, this enhanced power, together with maintenance of the nominal Type I error rate (Settings 1 through 3), is another argument for using this joint hypothesis as the default primary hypothesis.

Setting 5 is even more dramatic in the differentiation of the treatment groups: the control treatment group remains as before, with a 10% switchover at each level of baseline response for a net 0% change, but the difference is exaggerated for the experimental treatment group to a 25% switch from negative to positive and a 2.5% switch from positive to negative. Again, the relative power of the ANCOVA-like and conditional logistic approaches in this perfectly balanced-baseline case is similar, about 32-34% rejection rates.
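To put the effect sizes of Settings 4 and 5 on a common scale, the expected posttreatment rate of positive response can be worked out from the 50% baseline rate and the two switching rates as described in the text. The sketch below is only arithmetic, not part of the original do-file, and the scalar names are made up for illustration.

```stata
* Expected posttreatment positive rate =
*   p0 * (1 - pos-to-neg rate) + (1 - p0) * neg-to-pos rate
scalar p0   = 0.5
* Control group, Settings 4 and 5: 10% switch in each direction
scalar ctl  = scalar(p0) * (1 - 0.10)  + (1 - scalar(p0)) * 0.10
* Experimental group, Setting 4: 5% pos-to-neg, 15% neg-to-pos
scalar trt4 = scalar(p0) * (1 - 0.05)  + (1 - scalar(p0)) * 0.15
* Experimental group, Setting 5: 2.5% pos-to-neg, 25% neg-to-pos
scalar trt5 = scalar(p0) * (1 - 0.025) + (1 - scalar(p0)) * 0.25

display "control (Settings 4, 5):   " %6.4f scalar(ctl)
display "experimental (Setting 4):  " %6.4f scalar(trt4)
display "experimental (Setting 5):  " %6.4f scalar(trt5)
```

On this reading, the control group stays at 0.50, while the experimental group moves to 0.55 in Setting 4 and to 0.6125 in Setting 5. The "net 10%" in Setting 4 is thus the difference between the two switching rates (15% - 5%), which corresponds to a shift of five percentage points in the overall positive rate.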
Again, the joint-hypothesis test is more powerful, with about twice the rejection rate of the main-effects-only test. Its concordance with switching rates that differ between baseline levels in only one treatment group reinforces the argument for its primacy. Even the interaction-only hypothesis test failed to discern this situation reliably.

Given that the false-positive (Type I error) rate for the conditional logistic regression approach is affected by baseline imbalance and by the gross rate of conversion (binomial correlation between observations), I conclude that results from this approach are uninterpretable for this type of study design. The ANCOVA-like multiple logistic regression approach, however, maintains the nominal level of Type I error, has at least the power of the invalid conditional logistic regression approach, and is perhaps even a smidgen better.

The results of the exercise follow immediately below, and the do-file follows afterward.

Joseph Coveney

-------------------------------------------------------------------------------

Means represent rates of declaring statistical
 significance at a nominal 5% level of Type 1 error rate

pclo: Conditional logistic regression
pant: ANCOVA-like, main effects of treatment
pani: ANCOVA-like, treatment-by-baseline interaction
panb: ANCOVA-like, treatment main effects & interaction

Setting 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        panb |     10000       .0426    .2019637          0          1
        pant |     10000       .0483    .2144101          0          1
        pani |     10000        .052    .2220381          0          1
        pclo |     10000       .6355    .4813137          0          1

Setting 2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        panb |     10000       .0404    .1969054          0          1
        pant |     10000       .0426    .2019637          0          1
        pani |     10000       .0485     .214831          0          1
        pclo |     10000       .0213    .1443897          0          1

Setting 3

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        panb |     10000       .0476     .212929          0          1
        pant |     10000       .0529     .223845          0          1
        pani |     10000        .057    .2318542          0          1
        pclo |     10000       .0048     .069119          0          1

Setting 4

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        panb |     10000       .1473    .3544224          0          1
        pant |     10000       .1233    .3287977          0          1
        pani |     10000       .0263     .160034          0          1
        pclo |     10000       .1109     .314024          0          1

Setting 5

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        panb |     10000       .5955    .4908196          0          1
        pant |     10000       .3372    .4727774          0          1
        pani |     10000       .0147    .1203551          0          1
        pclo |     10000       .3156    .4647776          0          1

-------------------------------------------------------------------------------

* Setting 1: unbalanced baseline (about 1/3 vs. 2/3 positive), 25% switch
program define corlo1
    version 8.2
    replace dep0 = 1 in 67/166
    generate byte dep1 = abs(dep0 - (uniform() > 0.75))
end

* Setting 2: balanced baseline (50% positive in each group), 25% switch
program define corlo2
    version 8.2
    replace dep0 = 1 in 51/150
    generate byte dep1 = abs(dep0 - (uniform() > 0.75))
end

* Setting 3: balanced baseline, 50% switch
program define corlo3
    version 8.2
    replace dep0 = 1 in 51/150
    generate byte dep1 = abs(dep0 - (uniform() > 0.5))
end

* Setting 4: small treatment effect
program define corlo4
    version 8.2
    replace dep0 = 1 in 51/150
    generate byte dep1 = dep0
    * Control group (1/100): 10% switch at each baseline level
    replace dep1 = abs(dep0 - (uniform() > 0.90)) in 1/50
    replace dep1 = abs(dep0 - (uniform() > 0.90)) in 51/100
    * Experimental group (101/l, where l is the last observation).
    * As coded, 101/150 are baseline-positive (15% switch to negative)
    * and 151/l are baseline-negative (5% switch to positive), which
    * mirrors the direction described in the text.
    replace dep1 = abs(dep0 - (uniform() > 0.85)) in 101/150
    replace dep1 = abs(dep0 - (uniform() > 0.95)) in 151/l
end

* Setting 5: larger treatment effect (same row layout as Setting 4)
program define corlo5
    version 8.2
    replace dep0 = 1 in 51/150
    generate byte dep1 = dep0
    replace dep1 = abs(dep0 - (uniform() > 0.90)) in 1/50
    replace dep1 = abs(dep0 - (uniform() > 0.90)) in 51/100
    replace dep1 = abs(dep0 - (uniform() > 0.75)) in 101/150
    replace dep1 = abs(dep0 - (uniform() > 0.975)) in 151/l
end

* One replication: generate data, fit both models, return the p-values
program define corlo, rclass
    version 8.2
    syntax , setting(integer)
    drop _all
    set obs 200
    generate byte dep0 = 0
    corlo`setting'
    generate byte trt = _n > _N / 2
    generate byte iac = trt * dep0
    * ANCOVA-like multiple logistic regression
    logistic dep1 dep0 trt iac, nolog
    test trt iac
    return scalar anb = r(p)
    test trt
    return scalar ant = r(p)
    test iac
    return scalar ani = r(p)
    * Conditional (fixed-effects) logistic regression on long-format data
    generate int pid = _n
    reshape long dep, i(pid) j(per)
    replace iac = trt * per
    xtlogit dep trt per iac, i(pid) fe nolog
    test iac
    return scalar clo = r(p)
end

* Run 10,000 replications of each setting and tabulate rejection rates
program define runem
    version 8.2
    clear
    set more off
    set seed 20040211
    display
    display as text "Means represent rates of declaring statistical"
    display as text " significance at a nominal 5% level of Type 1 error rate"
    display
    display as text "pclo: Conditional logistic regression"
    display as text "pant: ANCOVA-like, main effects of treatment"
    display as text "pani: ANCOVA-like, treatment-by-baseline interaction"
    display as text "panb: ANCOVA-like, treatment main effects & interaction"
    display
    forvalues scenario = 1/5 {
        display as input "Setting `scenario'"
        quietly simulate "corlo, setting(`scenario')" anb = r(anb) ///
            ant = r(ant) ani = r(ani) clo = r(clo), reps(10000)
        foreach var of varlist _all {
            generate byte p`var' = `var' < 0.05
        }
        summarize p*
        display
    }
end

runem
exit

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Follow-Ups:
  Re: st: Comparing change in rates - frustrating problem: questionable results
  From: Ricardo Ovaldia <ovaldia@yahoo.com>
