Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: st: xtmixed syntax


From   David Airey <david.airey@vanderbilt.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: xtmixed syntax
Date   Thu, 22 May 2008 22:11:13 -0500

.

Thank you very much for this level of detail, Yulia. This is a terrific explanation. I now have your article too. I was far from the last syntax, but now with your explanations, I might be able to figure out similar syntax de novo. I will try out both approaches and report back (another person is tackling the same model with SAS Proc Mixed) so it will make a good comparison.

-Dave


On May 22, 2008, at 2:51 PM, ymarchenko@stata.com wrote:


David Airey <david.airey@vanderbilt.edu> asks about how to obtain variance
components by using -xtmixed- for an experimental design with both nested and
crossed factors:


I have a variance components mixed model that I am have trouble with
syntax using xtmixed. I'm interested in a genetic polymorphism effect
(think of this as simple treatment effect) on gene expression, where
gene expression is measured on set of genes where genes are actually
measured by multiple probes.

two group allelic fixed effect A
random sample effect S, one sample per subject, subjects nested in
allelic group, 80 subjects
random gene effect G, measured on all samples, so crossed with group,
typically 20-50 genes
random probe effect P, nested in gene, < 5 probes per gene, different
probes per gene

...
Before demonstrating the actual syntax of -xtmixed- for David's example, let
me first briefly discuss a more general idea behind using -xtmixed- with
random-effects experimental designs. A more detailed discussion is given in
Marchenko (2006) (in fact, David's example is similar to the example described
in section 4.6 of the article).

There are several ways of using -xtmixed- to obtain variance components from a
random-effects experimental design. One is what we call the brute- force way
of fitting the design. It is straightforward. There are also other
alternative ways of obtaining the same results. These approaches are more
efficient in terms of memory usage and speed but require using the alternate
formulation of the design as a multilevel model that may be difficult to find
for certain designs.

With the brute-force way (or design-matrix-based approach), the first step is
to construct the design matrix for random effects, corresponding to the
design. Such a random-effects design matrix will be comprised of columns
containing indicator variables corresponding to the levels of all random
effects. -xtmixed- accommodates the specification of this matrix via the
special group identifier -_all- and the factor notation - R._varname_- (see
[XT] xtmixed for details). Thus, all we need to do prior to using - xtmixed-
is to identify all random effects (or, equivalently, all variance components)
associated with the design.

Returning to David's example, probes are nested within genes, and groups are
crossed with genes and probes. Since the group effect is fixed, we have the
following variance components for this design: variability due to genes,
variability due to probes (nested within genes), variability due to the
interaction of groups and genes, variability due to the interaction of groups
with probes (nested within genes), and subject variability (residual
variance). Our four random effects are genes, probes (nested within genes),
group-gene interaction, and group-probe-(nested-within-genes) interaction.
Note that when a fixed factor is interacted with a random factor, the
interaction term is random.

Suppose that variables Group, Gene, and Probe contain information about
groups, genes, and probes, respectively. For this design, we also need to
create the following interaction terms:

. egen GeXPr = group(Gene Probe)
. egen GrXGe = group(Group Gene)
. egen GrXGeXPr = group(Group Gene Probe)

Now, the brute-force way of using -xtmixed- for this design is
straightforward: we simply list all of the random-effects variables as
separate equations using the -_all: R._varname_- notation. The corresponding
syntax is:

. xi: xtmixed depvar i.Group || _all: R.Gene || _all: R.GeXPr || _all: R.GrXGe ///
|| _all: R.GrXGeXPr, variance

In the above syntax, -_all: R.Gene-, for example, tells -xtmixed- to include
indicators corresponding to levels of variable Gene into the design matrix.
The variability within subjects (the lowest-level variability) is estimated by
the residual variance, reported by -xtmixed-.

The advantage of the brute-force way is that it is straightforward once the
relevant random-effects terms are identified. However, for David's example,
the above syntax requires creating a matrix with 50+50*5+2*50+2*50*5 = 900
columns (50 is the number of genes, 5 is the number of probes, and 2 is the
number of groups). We can avoid this by using the more efficient way of
obtaining the same results by using an alternative syntax of - xtmixed-.

First, consider the terms Gene, GrXGe, and GrXGeXPr. Since interaction terms
can be viewed as nested terms, we can obtain variance components for these
three terms more efficiently from a three-level model with genes defining the
first level, groups defining the second level, and probes defining the third
level. The corresponding syntax of -xtmixed- is

. xi: xtmixed depvar i.Group || Gene: || Group: || Probes:, variance

What we did not take into account in the above syntax is the fact that probes
are nested within genes. As described in example 7 of [XT] xtmixed and
section 4.4 of the cited article, the nesting can be accommodated by viewing
the levels of the nested effects (Probes) as random coefficients for nesting
level (Genes) and specifying the exchangeable covariance structure for these
random coefficients. Taking this into account, the final syntax for -xtmixed-
is

. xi: xtmixed depvar i.Group || Gene: R.Probes, cov(exchangeable) ///
|| Group: || Probes:, variance

By using the above syntax, we also significantly reduced the column dimension
of the design matrix (from 900 in the brute-force method to 5+1+1=7).

Reference:

Marchenko, Y. 2006. Estimating variance components in Stata. The Stata
Journal, 6(1): 1-22.

The link to the article is
http://www.stata-journal.com/article.html?article=st0095


-- Yulia
ymarchenko@stata.com
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index