Re: st: baseline adjustment in linear mixed models


From   "Seed, Paul" <paul.seed@kcl.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: baseline adjustment in linear mixed models
Date   Wed, 13 Feb 2013 18:51:40 +0000

Clyde Schechter clearly explains the algebra behind two ways of analysing 
intervention studies with observations at baseline (time=0), and one or more later time points.
He compares 
(method 1) -xtmixed- with all time points considered equally, 
(method 2) ANCOVA-type models in which baseline is treated as a covariate. 

The models are very similar, but method 2 involves an extra parameter, 
relating to the correlation between the baseline measure and the others. 

He suggests that method 1 is superior because it makes no distinction
between time points, but with method 2 " one is, in effect, distinguishing 
the baseline observations from all others and saying that it (or something 
for which it is a proxy) exerts special influence over all the other 
observations that the other observations do not exert over each other."

I suggest that this is not accurate.  
With method 2, depending on the value of the extra parameter, baseline may be 
found to have a larger, a smaller or an equal influence.  With method 1, 
an equal influence is the only option.

If equal influence were correct, the two methods would differ only trivially, 
through the one degree of freedom lost to the extra parameter.
But otherwise, method 2 would be superior.

In fact, there is evidence that in real data sets baseline measurements 
do have a smaller influence, with lower correlations with measurements 
taken at other times (e.g. Frison & Pocock 1992), particularly when 
summary scores are used.  The authors suggest that this is due to the 
greater time-gap involved.  
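As a numeric sketch of this point (every parameter value below is a hypothetical choice for illustration, not taken from any real trial): if baseline carries a smaller share of the participant-level effect, the ANCOVA-type coefficient on baseline comes out below the value that an equal-influence model would force on it.

```python
import random

# Hypothetical data-generating sketch: baseline y0 carries a smaller share
# (factor a < 1) of the participant effect u than follow-up y1 does.
# All numbers are illustrative, not taken from the post.
random.seed(2013)
mu, sig_u, sig_e, a = 10.0, 2.0, 1.5, 0.3
n = 200_000

y0, y1 = [], []
for _ in range(n):
    u = random.gauss(0, sig_u)
    y0.append(mu + a * u + random.gauss(0, sig_e))  # baseline: weaker influence
    y1.append(mu + u + random.gauss(0, sig_e))      # follow-up

m0, m1 = sum(y0) / n, sum(y1) / n
cov01 = sum((x - y) * 0 for x, y in ()) if False else \
    sum((x - m0) * (y - m1) for x, y in zip(y0, y1)) / n
var0 = sum((x - m0) ** 2 for x in y0) / n

# Method 2 (ANCOVA) estimates this slope freely;
# theory here: a*sig_u^2 / (a^2*sig_u^2 + sig_e^2) = 1.2/2.61, about 0.46.
slope_ancova = cov01 / var0
# Method 1 in effect imposes the equal-influence value:
slope_equal = sig_u**2 / (sig_u**2 + sig_e**2)      # = 4/6.25 = 0.64
print(round(slope_ancova, 3), round(slope_equal, 3))
```

With these illustrative numbers the freely estimated baseline slope is clearly smaller than the slope an equal-influence model would impose, which is the sense in which method 2 lets the data decide.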

There is a second reason for treating baseline measurements separately.
In an intervention trial, it is common to find higher 
variance in the outcome for patients on treatment than at baseline or 
on placebo, due in large part to non-compliance (not taking the tablets).  
I have no reference to hand for this, but I have found it true in several 
studies I have worked on.
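A toy simulation (hypothetical numbers throughout) shows the mechanism: if a fraction p of treated patients do not comply and so respond like placebo patients, the treated arm is a mixture of responders and non-responders, and its variance is inflated by roughly p*(1-p)*delta^2, where delta is the effect among compliers.

```python
import random
import statistics

# Toy mixture sketch (all numbers hypothetical): non-compliers in the treated
# arm respond like placebo, so the treated arm's variance exceeds the placebo
# arm's by about p*(1-p)*delta^2.
random.seed(7)
sig, delta, p = 1.5, 3.0, 0.3   # outcome SD, complier effect, non-compliance rate
n = 200_000

placebo = [random.gauss(0, sig) for _ in range(n)]
treated = [random.gauss(0, sig) + (0.0 if random.random() < p else delta)
           for _ in range(n)]

var_placebo = statistics.pvariance(placebo)  # ~ sig^2 = 2.25
var_treated = statistics.pvariance(treated)  # ~ sig^2 + p*(1-p)*delta^2 = 4.14
print(round(var_placebo, 2), round(var_treated, 2))
```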

If baseline measurements are not included in the outcome (and only then), 
all the measurements with the higher SD share the same subject IDs;  
the unequal variance can then be accommodated simply by using robust 
standard errors, clustered by subject ID.  

Both these real complications can perhaps be accommodated within 
Clyde's preferred -xtmixed- framework, but perhaps not simply.


Reference:
Frison L, Pocock SJ (1992). Repeated measures in clinical trials: analysis using mean 
summary statistics and its implications for design. Statistics in Medicine 11: 1685-1704.

Best wishes, 


Paul T Seed, Senior Lecturer in Medical Statistics, 
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners 
(+44) (0) 20 7188 3642.



*****************************************
Clyde Schechter wrote:
*****************************************

Giulio Formoso raises a question that comes up from time to time on Statalist: he plans to do a linear mixed model analysis of repeated observations on a sample of units of observation, and asks whether it is appropriate to include the baseline outcome value as a covariate.

Back to basics.  Let's think about a very simple statistical model that could be analyzed with the command:

-xtmixed y || participant:-

with no independent variables.  And let's assume that there are 2 observations for each participant.  In equation form, this model is:

y_ij = mu + u_i + eps_ij, where i indexes participants and j = 1,2 indexes observations.  The standard assumptions are that u_i ~ N(0, sig_u) and eps_ij ~ N(0, sig_e), all independent.  From this, we can deduce that y_i1 and y_i2 have a joint bivariate normal distribution with common mean mu, common variance V = sig_u^2 + sig_e^2, and correlation r = sig_u^2/(sig_u^2 + sig_e^2).


