
From: Ricardo Ovaldia <ovaldia@yahoo.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Re: xtlogit and logistic-cluster (REVISED)
Date: Thu, 12 Aug 2004 05:31:39 -0700 (PDT)

Thank you Joseph. I appreciate your assistance very much.
Thank you not only for your valuable comments, but also for your patience.

Ricardo.

--- Joseph Coveney <jcoveney@bigplanet.com> wrote:

> Ricardo Ovaldia wrote:
>
> > I am a bit baffled by the assertion that 50 clusters
> > and 410 observations is a small sample size. I know it is
> > not big, but I would not consider it small either.
>
> Whether 50 clusters and 410 total observations is small or not depends upon
> the task. Advocating exercising caution to assure that the sample size is
> adequate for the intended purpose is not asserting that a particular sample
> size is small. For population-average GEE, which is sensitive to cluster
> numbers, rules of thumb for sample size for ranges of predictors are given
> in M. E. Stokes, C. S. Davis & G. G. Koch, _Categorical Data Analysis Using
> the SAS System_, Second Edition (Cary, N. Carolina: SAS Institute, 2000),
> p. 479. If you have many candidate predictors among those for patients and
> physicians, my guess is that the authors would say that 50 clusters is
> pretty dicey.
>
> I don't recall having recently run across any corresponding guidance for
> random-effects logistic regression, which depends more upon within-cluster
> correlation and total observations. Can -simulate- tell you about the
> adequacy of the sample size for your purposes (e.g., for confidence interval
> coverage) in your particular dataset with the parameters set at their
> estimates? Generating a correlated binary variate to match the observed rho
> is tough, but you might be able to get reasonably close. If you're
> satisfied with the results of the simulation for the model's intended use,
> then the sample size is not too small.
>
> In the simple-minded illustration below, with a sample size of 50 clusters,
> a uniform length (cluster size) of six observations and a moderate-to-high
> within-cluster correlation (rho is about 80% or so), the test size was 11.5%
> at the nominal 5% Type I error rate. That's more than double the nominal,
> and if the purpose is hypothesis testing, then the sample size would be
> considered small, too small given the nature of the data and the objective.
> This improves, of course, when there is no within-cluster correlation; in
> the simple example below it falls to 6.7%, which is still substantially
> larger than nominal. But if this isn't critical for the objective, then the
> sample would not necessarily be considered small.
>
> > The question posed in this phase of analysis is rather
> > simple: Which physician and patient characteristics
> > are important in predicting patient referral?
>
> Have you considered coupling modeling with graphical analysis at this phase?
> Strength and nature of the relationships observed graphically could be
> combined with knowledge of the subject matter to judge importance of
> predictors. Plots could be made of observations or of predictions from
> models after holding one or more covariates at reference values. If your
> audience doesn't feel comfortable judging the strength or importance of the
> relationship based upon what they can see by graphical presentation, then
> numerical description of the predictions can be done either with summary
> statistics (including tabulations) or by a model, perhaps with standardized
> coefficients if that makes it easier for your audience. For the next phase,
> the model can be made parsimonious based upon what's observed in the plots
> or what's judged unimportant in earlier stages of exploration. It might be
> beneficial to use two models to describe your observations: one, a
> conditional logistic regression with physicians as groups, to describe
> patient characteristics that predict referral; the other, a count model, to
> describe physician characteristics that predict referral rates.
>
> Joseph Coveney
>
> ----------------------------------------------------------------------------
>
> clear
> set more off
> set seed 20040809
> * Build a 6 x 6 exchangeable correlation matrix (1 on the diagonal, 0.8 elsewhere)
> set obs 6
> forvalues i = 1/6 {
>     generate float rho`i' = 0.8
>     replace rho`i' = 1 in `i'
> }
> mkmat rho*, matrix(A)
> *
> * Case 1: correlated outcomes, no true trt or tim effect
> program define xtlogitsimc, rclass
>     version 8.2
>     drawnorm dep1 dep2 dep3 dep4 dep5 dep6, corr(A) n(50) clear
>     generate byte pid = _n
>     generate byte trt = _n > _N / 2
>     reshape long dep, i(pid) j(tim)
>     * Dichotomize the latent normal variate at zero
>     replace dep = dep > 0
>     compress
>     xi: xtlogit dep trt i.tim, i(pid) re
>     estimates store A
>     xtlogit dep, i(pid) re
>     estimates store B
>     lrtest A B
>     return scalar p = r(p)
> end
> *
> simulate "xtlogitsimc" p = r(p), reps(1000)
> * The mean of pos is the empirical test size at the nominal 5% level
> generate byte pos = p < 0.05
> replace pos = . if p >= .
> summarize pos
> *
> * Case 2: independent outcomes (no within-cluster correlation)
> *
> program define xtlogitsimi, rclass
>     version 8.2
>     replace dep = uniform() > 0.5
>     xi: xtlogit dep trt i.tim, i(pid) re
>     estimates store A
>     xtlogit dep, i(pid) re
>     estimates store B
>     lrtest A B
>     return scalar p = r(p)
>     estimates drop _all
> end
> *
> clear
> set obs 50
> generate byte pid = _n
> generate byte trt = _n > _N / 2
> forvalues i = 1/6 {
>     generate byte dep`i' = .
> }
> reshape long dep, i(pid) j(tim)
> simulate "xtlogitsimi" p = r(p), reps(1000)
> generate byte pos = p < 0.05
> replace pos = . if p >= .
> summarize pos
> exit

=====
Ricardo Ovaldia, MS
Statistician
Oklahoma City, OK

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
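
The quoted rejection rates of 11.5% and 6.7% are themselves Monte Carlo estimates from 1000 replications, so it may help to gauge their simulation error before comparing them with the nominal 5%. The lines below are a minimal sketch, not part of the original exchange. They assume that all 1000 replications returned a p-value, so that the rates correspond to 115 and 67 rejections; if some replications failed to converge, substitute the counts and the number of nonmissing replications reported by -summarize pos-.

```stata
* Monte Carlo standard error of each empirical rejection rate
* (assumes 1000 usable replications; adjust if some p-values are missing)
display sqrt(0.115 * (1 - 0.115) / 1000)
display sqrt(0.067 * (1 - 0.067) / 1000)

* Exact binomial tests of the assumed rejection counts against the nominal 5% level
bitesti 1000 115 0.05
bitesti 1000 67 0.05
```

If the binomial test for the uncorrelated case is not clearly significant in a given run, more replications in -simulate- would narrow the uncertainty about the true test size.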

**References**: **st: Re: xtlogit and logistic-cluster (REVISED)**, *From:* Joseph Coveney <jcoveney@bigplanet.com>
