
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Re: Correlation of repeated baseline measures in sampsi


From   "Seed, Paul" <[email protected]>
To   "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Subject   st: Re: Correlation of repeated baseline measures in sampsi
Date   Thu, 24 Mar 2011 11:16:30 +0000

Helen Connolly wrote:

I'm using the sampsi command to calculate sample size for a two-sample
study with 12 monthly baseline measurements and 12 monthly follow-up
measurements.  I have data from a previous study and can calculate mean
and standard deviations for both samples.

My question is, how do I calculate correlation of baseline measurements
(r0), correlation of follow-up measurements (r1), and correlation
between baseline and follow-up measurements (r01)?  These require a
single measure (not a covariance matrix).  In all the references I have
seen, there are statements like "correlation of baseline measurements
calculated from previous study", but I see no reference as to how this
is done.  Can someone please help?

*********************************************

This portion of -sampsi- is derived from Frison L & Pocock SJ (1992).
The original paper assumed that only limited data were available from the 
sample dataset: means, SDs, and correlations.  There are more choices 
when you have the full data set to work with.

I wrote this part of the command in 1997 as -sampsi2-, and it was 
incorporated as official Stata shortly afterwards. 

The correlations between repeated measurements can be found 
by setting the data in wide format and carrying out correlations
between the repeated measures.  

First, get the full correlation matrix.
Divide this into 3 sets of correlations. Set R00 contains only correlations 
between two baseline measures. Set R01 contains correlations between one baseline 
& one follow-up measure, and set R11 contains correlations between follow-up measures only.
If a set has only one member, use that correlation directly.  If a set is empty, 
do not specify the corresponding option.  Otherwise, Frison & Pocock advise using the 
simple arithmetic mean of the correlations in the set, provided they are not too far apart.

An alternative approach is to average all the baseline measures and all the follow-up measures 
in your example data set, and use only the one correlation between baseline & follow-up.
Your Standard Deviation for the outcome measure should then also be adjusted, 
since the SD of an average differs from the SD of a single measurement.

References:
As in the Stata manual entry for -sampsi-


**************** Example *************************

* Create a suitable dataset
use http://www.stata-press.com/data/r11/nlswork, clear
xtdes
tab year

keep if year <= 73
keep if ln_wage <.
bys idcode: keep if _N == 6
tab year

keep ln_wage idcode year
reshape wide ln_wage, i(idcode) j(year)

* Obtain the full matrix
corr ln*
* Assuming 2 baseline & 4 follow-up measures, 
* R00 = { 0.6677}, R01 = { 0.6320 0.8301 0.5863 0.7513 0.5517 0.6687}
* and R11 = { 0.7981   0.7143   0.8117   0.6504   0.7622   0.8779}
* (2 baseline x 4 follow-up measures give 8 pairs in R01; six are listed here.)
di (0.6320 + 0.8301 + 0.5863 + 0.7513 + 0.5517 + 0.6687)/6
di (0.7981 + 0.7143 + 0.8117 + 0.6504 + 0.7622 + 0.8779)/6
* So, r00 = 0.6677, r01 is about 0.670, and r11 is about 0.769
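
* A sketch only: the set averages can also be computed from the stored
* matrix r(C), which avoids retyping correlations by hand.  It assumes the
* first 2 variables listed are the baseline measures.  The means 1.60 and
* 1.70 and the SD 0.40 passed to -sampsi- are hypothetical placeholders,
* not estimates from these data.
quietly corr ln_wage68 ln_wage69 ln_wage70 ln_wage71 ln_wage72 ln_wage73
matrix C = r(C)
* p = number of baseline measures, q = number of follow-up measures
local p = 2
local k = rowsof(C)
local q = `k' - `p'
scalar s00 = 0
scalar s01 = 0
scalar s11 = 0
local n00 = 0
local n01 = 0
local n11 = 0
* Walk the lower triangle of C and add each correlation to its set
forvalues i = 2/`k' {
    local jmax = `i' - 1
    forvalues j = 1/`jmax' {
        if `i' <= `p' {
            scalar s00 = s00 + C[`i',`j']
            local ++n00
        }
        else if `j' <= `p' {
            scalar s01 = s01 + C[`i',`j']
            local ++n01
        }
        else {
            scalar s11 = s11 + C[`i',`j']
            local ++n11
        }
    }
}
local r0  = s00/`n00'
local r01 = s01/`n01'
local r1  = s11/`n11'
di "r0 = " %6.4f `r0' "  r01 = " %6.4f `r01' "  r1 = " %6.4f `r1'
* Pass the averaged correlations to -sampsi- (placeholder means and SD)
sampsi 1.60 1.70, sd1(0.40) pre(`p') post(`q') r0(`r0') r1(`r1') r01(`r01') method(ancova)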

* Or, you can work with the averages of the individual values, 
* so that you have only one correlation to find.

egen ln_wage_bl  = rmean(ln_wage68 ln_wage69)
egen ln_wage_fup = rmean(ln_wage70 ln_wage71 ln_wage72 ln_wage73)
corr  ln_wage_bl ln_wage_fup
su ln_wage_fup
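
* A sketch of how the single correlation and the adjusted SD might be
* passed to -sampsi-.  With averaged measures there is one baseline and
* one follow-up value per subject, hence pre(1) post(1).  The means 1.60
* and 1.70 are hypothetical placeholders, not estimates from these data.
quietly corr ln_wage_bl ln_wage_fup
local rho = r(rho)
quietly su ln_wage_fup
local sd = r(sd)
sampsi 1.60 1.70, sd1(`sd') pre(1) post(1) r01(`rho') method(ancova)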

exit
********************************************************

