Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: RE: Testing dependence in a 2x2 table for clustered observations

From	Steve Samuels <[email protected]>
To	[email protected]
Subject	Re: st: RE: Testing dependence in a 2x2 table for clustered observations
Date	Fri, 27 Aug 2010 09:40:41 -0400

--

The representation of the MH OR as a weighted average of the
stratum-specific ORs breaks down for strata with zero cells. In fact,
strata with _two_ empty cells can contribute to the MH estimator and
to fixed effects (conditional) logistic regression, as long as the
occupied cells are in opposite corners. This extreme case is found in
1-1 matched pair studies, with each pair considered a stratum. Only
pairs discordant for the "exposure" contribute to the analysis.
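
As a rough illustration (a sketch on made-up counts, not data from
this thread), the MH estimate sum(a*d/n) / sum(b*c/n) can be computed
by hand and compared with -cc-. Stratum 2 has one empty cell, and
stratum 3 is a discordant matched pair with two empty cells in
opposite corners; both still add to the numerator. The variable names
(stratum, freq, num, den) are assumptions for this example only.

***************************
clear
* One row per non-empty cell; freq is the cell count (hypothetical data)
input stratum recovered treatment freq
1 1 1 4
1 1 0 2
1 0 1 3
1 0 0 5
2 1 1 3
2 0 1 2
2 0 0 4
3 1 1 1
3 0 0 1
end

* Cell counts a, b, c, d within each stratum (absent cells become 0)
bysort stratum: egen a = total(freq * (recovered == 1 & treatment == 1))
bysort stratum: egen b = total(freq * (recovered == 1 & treatment == 0))
bysort stratum: egen c = total(freq * (recovered == 0 & treatment == 1))
bysort stratum: egen d = total(freq * (recovered == 0 & treatment == 0))

* Per-stratum contributions to numerator and denominator
egen byte first = tag(stratum)
generate double num = a*d/(a+b+c+d) if first
generate double den = b*c/(a+b+c+d) if first

quietly summarize num
scalar sum_num = r(sum)
quietly summarize den
scalar sum_den = r(sum)
display "Hand-computed MH OR = " scalar(sum_num)/scalar(sum_den)

* Built-in M-H combined estimate for comparison
cc recovered treatment [fweight = freq], by(stratum)
***************************

A stratum with an empty row or column would have a*d = b*c = 0 and
add nothing to either sum.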

There was no error in Paul's code. I did not notice that each
simulated centre  had two rows of data.

Steve


On Fri, Aug 27, 2010 at 6:25 AM, Seed, Paul <[email protected]> wrote:
> To my surprise & chagrin, Steve Samuels is absolutely right about
> the Mantel-Haenszel method (at least as far as centres
> with a single empty cell go). Even though the Stata output
> indicates that they have a zero weight,
> they are not irrelevant.
> Excluding them has a dramatic effect on both the value
> and the accuracy of the estimated odds ratio.
>
> Using the same example, the combined odds ratios are:
> (all data) .92 (CI .42 to 2.0)
> (non-zero weights only) .42 (CI .16 to 1.10)
>
> There are similar effects on the odds ratios using the
> fixed effect & random effect methods.
>
> However, if the centre is missing an entire row (or column),
> excluding it has no effect on the Mantel-Haenszel or
> fixed effect answers; so (as I suggested), the random effect
> method would make more data available & give greater
> accuracy, provided, as Steve says, its assumptions are valid.
>
>
> ***************************
> cs recovered treatment , by(clinic) or
> cs recovered treatment if !zero_wt, by(clinic) or
> cs recovered treatment if !no_recov, by(clinic) or
> ***************************
>
> However, the code does indeed give approximately
> 8 observations per centre, as I intended
> (168 observations, 20 centres).  Multiplying by -runiform()-
> does divide by 2 (on average), but Steve perhaps
> did not notice that there are two lines of data per centre.
>
> Paul
> ----
>
>
> Date: Thu, 26 Aug 2010 12:00:56 -0400
> From: Steve Samuels <[email protected]>
> Subject: Re: st: RE: Testing dependence in a 2x2 table for clustered observations
> --
>
> "Now, some small centres have an empty cell, and the data from that
> centre is lost if Mantel-Haenszel methods are used. "
>
> This is not correct. If the cell counts are a,b,c,d, centers with
> only one empty cell will contribute either (a x d) or (b x c) ,
> whichever is non-zero.  The MH method is often studied under the
> assumption that the data arise from a fixed-effects logistic
> regression, so it's not surprising that the results are similar.
>
> Random effects logistic regression has more assumptions than  the
> fixed effects model. I'm not expert in this area, but if the -re- and
> -fe- options produce different results, I tend to believe -fe-.
>
> One might try to check the -re- assumptions, e.g.:
> ************************
> xtmelogit recovered treatment || clinic:
> predict pclinic, reffects level(clinic)
> egen ctag= tag(clinic)
> qnorm pclinic if ctag, mlab(n)
> ************************
>
> (By the way: There is a minor glitch in Paul's code. To get an average
> 8 observations per clinic, the sample size generation line would have
> to be:  gen n = int(runiform()*16+.5))
>
> Steve
>
> --
> Steven Samuels
> [email protected]
> 18 Cantine's Island
> Saugerties NY 12477
> USA
> Voice: 845-246-0774
> Fax:    206-202-4783
>
> On Thu, Aug 26, 2010 at 6:43 AM, Seed, Paul <[email protected]> wrote:
>> Dear Statalist,
>>
>> Adriaan Hoogendoorn has outcome and treatment data (both binary) from
>> 20 centres.  As suggested by Joseph and Joseph, -xtlogit, i(clinic) fe- ,
>> -xtlogit, i(clinic) re- and -cs, by(clinic)- will all give usable estimates
>> for the combined odds ratio, given fairly large numbers of subjects in
>> every centre, and a good (50%) recovery rate.
>>
>> I tried a more realistic simulation with fewer subjects per centre,
>> different-sized centres (168 subjects total instead of 2,000), and a
>> lower recovery rate (30% instead of 50%). Now, some small centres have
>> an empty cell, and the data from that centre is lost if Mantel-Haenszel
>> methods are used.  If there is only one outcome (two empty cells),
>> there will be no estimated odds ratio for that centre, and the centre
>> is lost to the fixed effects method as well.
>>
>> In the example below, 63 of 168 observations are lost to M-H and
>> 45 to fe.  None are lost to re.
>> However, more simulations would be needed to get a clearer picture
>> of the effect on the power and size of the tests.
>>
>> ****************************
>> clear *
>> set more off
>> set seed `=date("2010-08-26", "YMD")'
>> set obs 20
>>
>> generate byte clinic = _n
>>
>> expand 2
>> bys clinic: gen treatment = _n-1
>>
>> * Average of 8 observations per centre
>> gen n = int(runiform()*8+.5)
>> expand n
>>
>> * 20% recovery rate
>> gen recovered = runiform() < .2
>>
>> cs recovered treatment , by(clinic) or
>> mhodds recovered treatment , by(clinic)
>>
>> xtlogit recovered treatment, i(clinic) fe or nolog
>> xtlogit recovered treatment, i(clinic) re or nolog
>>
>> * Investigation of data problems
>> bys clinic : tab  treatment recovered
>> recode clinic 1 2 5 12 15 18 19 20 = 0, into(zero_wt)
>> replace  zero_wt =  zero_wt == 0
>> bys clinic (recovered) : gen no_recov =  recovered[_N] == 0
>> tab  zero_wt no_recov
>>
>>
>>
>> exit
>>
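>> The repeated simulations mentioned above might be sketched along the
>> following lines. This is only a rough outline, not something that was
>> run for this post: the program name onesim, the 200 replications, the
>> seed and the 5% cut-off are arbitrary assumptions. Because recovery is
>> generated independently of treatment, the share of rejections
>> estimates the size of each Wald test.
>>
>> ****************************
>> capture program drop onesim
>> program define onesim, rclass
>>     * Same data-generating steps as in the example above
>>     drop _all
>>     set obs 20
>>     generate byte clinic = _n
>>     expand 2
>>     bysort clinic: generate byte treatment = _n - 1
>>     generate n = int(runiform()*8 + .5)
>>     expand n
>>     generate byte recovered = runiform() < .2
>>
>>     * Default to missing in case a model cannot be fitted
>>     return scalar p_fe = .
>>     return scalar p_re = .
>>
>>     capture {
>>         xtlogit recovered treatment, i(clinic) fe
>>         test treatment
>>     }
>>     if _rc == 0 return scalar p_fe = r(p)
>>
>>     capture {
>>         xtlogit recovered treatment, i(clinic) re
>>         test treatment
>>     }
>>     if _rc == 0 return scalar p_re = r(p)
>> end
>>
>> simulate p_fe = r(p_fe) p_re = r(p_re), reps(200) seed(20100826): onesim
>>
>> * Proportion of replications rejecting at the 5% level
>> generate byte rej_fe = p_fe < .05 if !missing(p_fe)
>> generate byte rej_re = p_re < .05 if !missing(p_re)
>> summarize rej_fe rej_re
>> ****************************
>>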
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
