R: st: baseline adjustment in linear mixed models


From   Formoso Giulio <GFormoso@regione.emilia-romagna.it>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   R: st: baseline adjustment in linear mixed models
Date   Thu, 14 Feb 2013 10:19:43 +0000

Thank you very much for your opinion, and for the clarity of the arguments supporting it! I'm impressed by how good the exchange of opinions on Statalist is.

Sincerely, Giulio

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Seed, Paul
Sent: Wednesday, 13 February 2013 19:52
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: baseline adjustment in linear mixed models

Clyde Schechter clearly explains the algebra behind two ways of analysing intervention studies with observations at baseline (time=0) and at one or more later time points.
He compares:
(method 1) -xtmixed- with all time points treated equally, and
(method 2) ANCOVA-type models in which baseline is treated as a covariate.

The models are very similar, but method 2 involves an extra parameter, relating to the correlation between the baseline measurements and the others.

He suggests that method 1 is superior because it makes no distinction between time points, but with method 2 "one is, in effect, distinguishing the baseline observations from all others and saying that it (or something for which it is a proxy) exerts special influence over all the other observations that the other observations do not exert over each other."

I suggest that this is not accurate.  
With method 2, depending on the value of the extra parameter, baseline may be found to have a larger, a smaller or an equal influence.  With method 1, an equal influence is the only option.

If equal influence were correct, the two methods would differ only trivially, through the one degree of freedom lost to the extra parameter.
But otherwise, method 2 would be superior.

In fact, there is evidence that in real data sets baseline measurements do have a smaller influence, showing lower correlations with measurements taken at other times (e.g. Frison & Pocock, 1992), particularly when summary scores are used. The authors suggest that this is due to the greater time-gap involved.
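
To make the contrast with method 1 concrete, here is a minimal sketch in Python (not Stata) of the correlations implied by a purely hypothetical variance-components model. The follow-up-only component `sig_w` and the function name are illustrative assumptions, not taken from Frison & Pocock or from Clyde's post. Setting `sig_w = 0` gives the equal-correlation structure of a random-intercept model; any `sig_w > 0` makes baseline less correlated with the follow-ups than the follow-ups are with each other.

```python
import math

def implied_correlations(sig_u, sig_w, sig_e):
    """Correlations implied by a hypothetical three-component model.

    baseline:   y_i0 = mu + u_i + e_i0
    follow-up:  y_it = mu + u_i + w_i + e_it   (t = 1, 2, ...)

    u_i is shared by all time points; w_i (an assumption made purely for
    illustration) is shared only by the follow-ups; e_it is independent
    noise. All three arguments are standard deviations.
    Returns (corr(baseline, follow-up), corr(follow-up, follow-up)).
    """
    var_base = sig_u**2 + sig_e**2
    var_follow = sig_u**2 + sig_w**2 + sig_e**2
    corr_base_follow = sig_u**2 / math.sqrt(var_base * var_follow)
    corr_follow_follow = (sig_u**2 + sig_w**2) / var_follow
    return corr_base_follow, corr_follow_follow

# Equal influence: no follow-up-only component, so both correlations agree.
equal = implied_correlations(sig_u=1.0, sig_w=0.0, sig_e=1.0)

# Weaker baseline influence: the follow-ups share something baseline lacks.
weaker = implied_correlations(sig_u=1.0, sig_w=1.0, sig_e=1.0)

print("equal influence:", equal)
print("weaker baseline:", weaker)
```

A model that forces one common correlation can only represent the first case; a separately estimated baseline coefficient can adapt to either.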

There is a second reason for treating baseline measurements separately.
In an intervention trial, it is common to find higher variance in the outcome for patients on treatment than at baseline or on placebo, due in large part to non-compliance (not taking the tablets).
No reference to hand for this, but I've found it true in several studies I have worked on.

If baseline measurements are excluded from the outcome (and only then), all the measurements with the higher SD belong to the same subjects; the unequal variance can then be accommodated simply by using robust standard errors, clustered by subject id.
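
As a rough illustration of what clustering by subject id does, the sketch below computes a cluster-robust (sandwich) standard error for the slope of a simple linear regression in plain Python. It is not Stata's implementation: the function name and the toy data are hypothetical, and no small-sample correction is applied, so the numbers would differ somewhat from those reported by -regress- with the vce(cluster id) option.

```python
import math
from collections import defaultdict

def cluster_robust_slope(x, y, cluster):
    """OLS slope of y on x (with intercept) and its cluster-robust SE.

    Score contributions are summed within each cluster before being
    squared, so observations from the same subject may be correlated and
    may have unequal variances. No small-sample correction is applied.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    x_dev = [xi - x_bar for xi in x]            # x centred on its mean
    sxx = sum(d * d for d in x_dev)
    slope = sum(d * yi for d, yi in zip(x_dev, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Sum the score contributions (centred x times residual) per cluster.
    score = defaultdict(float)
    for d, e, g in zip(x_dev, resid, cluster):
        score[g] += d * e

    variance = sum(s * s for s in score.values()) / sxx**2
    return slope, math.sqrt(variance)

# Toy data (hypothetical): x = 0 at the first visit, 1 at the second.
x = [0, 0, 1, 1]
y = [1, 3, 2, 6]
subject = ["a", "b", "a", "b"]    # each subject contributes both visits

slope, se_cluster = cluster_robust_slope(x, y, subject)

# Treating every observation as its own cluster gives the ordinary
# heteroskedasticity-robust (HC0) standard error for comparison.
_, se_robust = cluster_robust_slope(x, y, list(range(len(x))))

print(slope, se_cluster, se_robust)
```

The point estimate is the same in both calls; only the standard error changes with the choice of cluster.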

Both these real complications can perhaps be accommodated within Clyde's preferred -xtmixed- framework, but perhaps not simply.


Reference:
Frison L & Pocock SJ (1992). Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Statistics in Medicine 11: 1685-1704.

Best wishes, 


Paul T Seed, Senior Lecturer in Medical Statistics, 
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners 
(+44) (0) 20 7188 3642.



*****************************************
Clyde Schechter wrote:
*****************************************

Giulio Formoso raises a question that comes up from time to time on Statalist: he plans to do a linear mixed model analysis of repeated-observations on a sample of units of observation, and asks if it is appropriate to include the baseline outcome value as a covariate.

Back to basics.  Let's think about a very simple statistical model that could be analyzed with the command:

-xtmixed y || participant:-

with no independent variables.  And let's assume that there are 2 observations for each participant.  In equation form, this model is:

y_ij = mu + u_i + eps_ij, where i indexes participants and j = 1, 2 indexes observations.  The standard assumptions are that u_i ~ N(0, sig_u^2) and eps_ij ~ N(0, sig_e^2), all mutually independent.  From this, we can deduce that y_i1 and y_i2 have a bivariate normal distribution, each with mean mu and variance V = sig_u^2 + sig_e^2.  Since the only term they share is u_i, their covariance is sig_u^2, so their correlation is r = sig_u^2/(sig_u^2 + sig_e^2).
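
As a quick numerical check of the variance and correlation just derived, here is a minimal simulation sketch in plain Python (not Stata). The parameter values, the seed and the function names are arbitrary illustrative choices, and `sig_u` and `sig_e` are treated as standard deviations.

```python
import random

def theoretical_moments(sig_u, sig_e):
    """Variance V and within-participant correlation r implied by the model."""
    v = sig_u**2 + sig_e**2
    r = sig_u**2 / v
    return v, r

def simulate_pairs(n, mu, sig_u, sig_e, seed=12345):
    """Draw (y_i1, y_i2) for n participants from y_ij = mu + u_i + eps_ij."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.gauss(0.0, sig_u)                # shared by both observations
        y1 = mu + u + rng.gauss(0.0, sig_e)
        y2 = mu + u + rng.gauss(0.0, sig_e)
        pairs.append((y1, y2))
    return pairs

def sample_moments(pairs):
    """Sample variance of y_i1 and sample correlation of (y_i1, y_i2)."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    s11 = sum((p[0] - m1) ** 2 for p in pairs) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in pairs) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / (n - 1)
    return s11, s12 / (s11 * s22) ** 0.5

# Illustrative values: sig_u = 2, sig_e = 1, so V = 5.0 and r = 0.8.
v, r = theoretical_moments(sig_u=2.0, sig_e=1.0)
v_hat, r_hat = sample_moments(
    simulate_pairs(20000, mu=10.0, sig_u=2.0, sig_e=1.0)
)
print("theory:", v, r)
print("simulated:", v_hat, r_hat)
```

With a sample this large the simulated values should land close to the theoretical ones, though they will not match exactly.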


