Alex Gamma wrote:

I have longitudinal data from an age cohort of 591 people at 6 time-points over 20 years. I have two psychiatric diagnoses, A and B, and I want to examine whether prior occurrence of A predicts current A or B, and vice versa (i.e. whether prior B predicts current B or A). So I constructed additional variables A_prior and B_prior coding for any prior occurrence of diagnosis A or B, and I ran the models

    xtgee A A_prior B B_prior some_covariates, i(id) fam(bin) link(logit) corr(exch) robust
    xtgee B B_prior A A_prior some_covariates, i(id) fam(bin) link(logit) corr(exch) robust

Two questions:

1) A sociologist colleague doubted that it is valid to include this kind of "any prior occurrence" variable, or indeed any lagged variable, in the GEE model, but I don't see any reason why not. Just to be sure, I wanted to check with the experts before publishing these models: is he right?

2) If the A_prior and B_prior variables are admissible, is it correct to use an exchangeable correlation structure, or should I use an independent structure? (The latter is what J.W.R. Twisk seems to recommend in "Applied Longitudinal Data Analysis for Epidemiology", 2003, Cambridge University Press.)

--------------------------------------------------------------------------------

Take a look at Chapter 12 (Time-dependent covariates) in P. J. Diggle, P. J. Heagerty, K-Y. Liang and S. L. Zeger, _Analysis of Longitudinal Data_, Second Edition (Oxford: Oxford Univ. Press, 2002), pp. 245-81. It describes the use of lagged variables in models fit by GEE, for example to deal with covariate endogeneity. According to the same source, at least for cross-sectional analysis, you should use an independence working correlation unless you can satisfy the "full covariate conditional mean assumption," which the authors describe.
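As a rough sketch only, this is one way the "any prior occurrence" variables and the independence-working-correlation fit might look in Stata. It assumes long-format data with one row per person per time-point, diagnoses coded 0/1, and a time-point variable called wave; that name is hypothetical and does not appear in the original post. Coding the first time-point as 0 (no observed history) is also an assumption made here for illustration, not something the source prescribes.

```stata
* Assumes long format: one row per id per wave, with A and B coded 0/1.
* "wave" is a hypothetical name for the time-point variable.
* sum() is a running sum that treats missing as 0, so the first wave
* (where A[_n-1] is missing) gets A_prior = 0; a missing diagnosis at
* an earlier wave is likewise counted as "no occurrence".
bysort id (wave): gen byte A_prior = sum(A[_n-1]) > 0
bysort id (wave): gen byte B_prior = sum(B[_n-1]) > 0

* Same models as in the question; only corr() is changed from
* exchangeable to independent. some_covariates is the placeholder
* used in the original post.
xtgee A A_prior B B_prior some_covariates, i(id) fam(bin) link(logit) corr(ind) robust
xtgee B B_prior A A_prior some_covariates, i(id) fam(bin) link(logit) corr(ind) robust
```

With corr(ind) the point estimates match those of a pooled logistic regression, and the robust option is what accounts for the within-person clustering in the standard errors.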
Joseph Coveney

