st: analysis of multi-site studies

Dear Statalist,

I recently analyzed some data from a multi-site study on
prematurity. Data was available from 5 OB-GYN clinics
across the US. Epidemiologic data was collected on
cases and control mothers via a telephone interview
using the same questionnaire. There are lots of
differences across centers in both the composition of
participants (for example race) and also in the rates
of exposure variables (for example alcohol use during
pregnancy).

In a preliminary study, we were interested in simply
examining the relationship between alcohol use
(yes/no) during pregnancy and preterm delivery. I
used -clogit-, grouping on study site, to model the

When presenting this data to a group of collaborators,
I was asked several questions:
1. Do you really need to adjust for study site?
2. Why conditional logistic regression rather than
simply using dummy variables?
3. How about using random-effects logistic regression?

I replied that (1) we should adjust for study site
because the distributions of participant
characteristics vary by site; (2) conditional logistic
regression and dummy (indicator) variables produce the
same results; and (3) I did not know how to answer, other
than that, because there are only 5 sites, random-effects
logistic regression may not produce consistent OR and SE
estimates.
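
To make the three options concrete, here is a rough sketch of how
I think each would be fit in Stata. The variable names preterm
(1 = case), alcohol (1 = yes), and site (1-5) are placeholders,
not the names in my dataset, and the i.site notation assumes a
Stata version with factor variables (older versions would need
the xi: prefix).

```stata
* Hypothetical variable names: preterm (0/1), alcohol (0/1), site (1-5)

* Check how exposure and case status vary across sites
tabulate site alcohol, row
tabulate site preterm, row

* (1) Unadjusted odds ratio, ignoring site
logistic preterm alcohol

* (2a) Conditional logistic regression, conditioning on site
clogit preterm alcohol, group(site) or

* (2b) Ordinary logistic regression with site indicator variables
logistic preterm alcohol i.site

* (3) Random-intercept logistic regression with site as the panel variable
xtlogit preterm alcohol, i(site) re or
```

Comparing the alcohol odds ratio and its standard error across
these fits would show how much the choice of adjustment matters
in practice.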

I have looked in the literature for information
regarding adjusting for site/center in multicenter
epidemiologic studies and found very little. Most of
the information available concerns randomized clinical
trials but not observational epidemiology. Although
some of the issues overlap, there are also many that do
not.

So I have two questions: 1. Were my replies to the
collaborators correct? 2. Does anyone know of any
literature that deals with adjusting for study site in
epidemiologic studies?

Thank you in advance,

Ricardo Ovaldia, MS
Oklahoma City, OK

