# st: correlation - intraclass vs others

 From "Jacki Buros" <[email protected]> To <[email protected]> Subject st: correlation - intraclass vs others Date Fri, 8 Jul 2005 14:15:02 -0400

```
Hi...

I'm relatively new to Stata and to statistical analysis in general, and am
rapidly attempting to educate myself. I have a question on correlations,
autocorrelation and intra-class correlation. The dataset is large, with more
than 5000 subjects, each from one of 5 studies. We have measured 15-30
biomarkers per subject in a longitudinal study design (5-6 timepoints). For
now we are considering only one timepoint (baseline), which is relatively
equivalent across the studies.

The research question concerns the auto-correlation of biomarkers, to what
extent it occurs and in which patients. We have numerous hypotheses that
could operate within this context. From my admittedly limited experience, I
know of at least a few approaches. Problem is, I don't know enough to
differentiate between them! I also have a number of other issues ... (listed
below). I hope that someone on this list can assist.

Approach 1: Set up two logistic MV models, one for prognosis and the other
for diagnosis (using the logistic command). Generate a correlation matrix
for each model.

Approach 2: Run two sets (one for prognosis and one for diagnosis) of GLMIC
models (using loneway), each with one model per pair of biomarkers. Generate
for each set a matrix of significant intra-class correlation coefficients.

Issues/questions:
1) Comparing the 2 tables (approach 1 vs approach 2), they look
different. Why would this be? Which approach is more accurate?
2) How can I stratify approach 2 by study? Is it roughly equivalent to
rank-order each biomarker value by study, using the ranks rather than the
values themselves?
3) In approach 2, given that the coefficient is roughly a measure of
within-group/across-group variance, is this comparison telling me what I
want to know (within-group of one vs within-group of another)? Would it be
better to compare the R-square values, or some other parameter?
4) How should I interpret the results of approach 1 in light of significant
auto-correlation between the markers? Can they 'knock one another out' of
the model? How would this affect the correlation matrix?
5) How should I deal with interactions (e.g., glucose with diabetes, or
concomitant treatment with supplemental insulin)?
6) How can I compare 2 test statistics (correlation coefficients) and
determine whether the difference between them is sufficiently non-random to
have value? Specifically, is it fair to use a t-test comparison as follows:
. ttesti 1 ccoef1 sterr1 1 ccoef2 sterr2, unequal

Any advice, references, etc for the issues listed above would be much
appreciated!! Sorry if this is wholly inappropriate for this forum...

Jacki

```
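
As a hedged illustration of the arithmetic behind question 6: with both sample sizes set to 1, the unequal-variance t statistic reduces to the difference between the two estimates divided by the square root of the sum of their squared standard errors. The sketch below computes that quantity directly and refers it to a standard normal distribution rather than a t distribution, because the Satterthwaite degrees-of-freedom formula divides by n - 1, which is zero when n = 1. It assumes the two estimates are independent and approximately normally distributed. The numeric values and the scalar names (`c1`, `se1`, `c2`, `se2`) are hypothetical placeholders, not results from the data described in the post.

```stata
* Hypothetical placeholder values: substitute your own estimates.
* c1, c2   = the two correlation coefficients being compared
* se1, se2 = their standard errors
scalar c1  = 0.42
scalar se1 = 0.05
scalar c2  = 0.30
scalar se2 = 0.06

* Difference divided by the standard error of the difference,
* assuming the two estimates are independent
scalar z = (c1 - c2) / sqrt(se1^2 + se2^2)

* Two-sided p-value from the standard normal distribution
scalar p = 2 * (1 - normal(abs(z)))

display "z = " %6.3f z "   two-sided p = " %6.4f p
```

If the two coefficients are estimated from the same subjects, the independence assumption does not hold, and the variance of the difference would also need a covariance term.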