Re: st: xtmixed syntax
Thu, 22 May 2008 14:51:42 -0500
David Airey <email@example.com> asks how to obtain variance components
by using -xtmixed- for an experimental design with both nested and
crossed effects:
> I have a variance components mixed model that I am having trouble with
> syntax using xtmixed. I'm interested in a genetic polymorphism effect
> (think of this as simple treatment effect) on gene expression, where
> gene expression is measured on set of genes where genes are actually
> measured by multiple probes.
> two group allelic fixed effect A
> random sample effect S, one sample per subject, subjects nested in
> allelic group, 80 subjects
> random gene effect G, measured on all samples, so crossed with group,
> typically 20-50 genes
> random probe effect P, nested in gene, < 5 probes per gene, different
> probes per gene
Before demonstrating the actual syntax of -xtmixed- for David's example, let
me first briefly discuss a more general idea behind using -xtmixed- with
random-effects experimental designs. A more detailed discussion is given in
Marchenko (2006) (in fact, David's example is similar to the example described
in section 4.6 of the article).
There are several ways of using -xtmixed- to obtain variance components from a
random-effects experimental design. One is what we call the brute-force way
of fitting the design. It is straightforward. There are also alternative
ways of obtaining the same results. These are more efficient in memory usage
and speed but require reformulating the design as a multilevel model, and
such a formulation may be difficult to find for certain designs.
With the brute-force way (or design-matrix-based approach), the first step is
to construct the design matrix for random effects, corresponding to the
design. Such a random-effects design matrix consists of columns
containing indicator variables corresponding to the levels of all random
effects. -xtmixed- accommodates the specification of this matrix via the
special group identifier -_all- and the factor notation -R._varname_- (see
[XT] xtmixed for details). Thus, all we need to do prior to using -xtmixed-
is to identify all random effects (or, equivalently, all variance components)
associated with the design.
Returning to David's example, probes are nested within genes, and groups are
crossed with genes and probes. Since the group effect is fixed, we have the
following variance components for this design: variability due to genes,
variability due to probes (nested within genes), variability due to the
interaction of groups and genes, variability due to the interaction of groups
with probes (nested within genes), and subject variability (residual
variance). Our four random effects are genes, probes (nested within genes),
group-gene interaction, and group-probe-(nested-within-genes) interaction.
Note that when a fixed factor is interacted with a random factor, the
interaction term is random.
Suppose that variables Group, Gene, and Probe contain information about
groups, genes, and probes, respectively. For this design, we also need to
create the following interaction terms:
. egen GeXPr = group(Gene Probe)
. egen GrXGe = group(Group Gene)
. egen GrXGeXPr = group(Group Gene Probe)
Now, the brute-force way of using -xtmixed- for this design is
straightforward: we simply list all of the random-effects variables as
separate equations using the -_all: R._varname_- notation. The corresponding
syntax is
. xi: xtmixed depvar i.Group || _all: R.Gene || _all: R.GeXPr || _all: R.GrXGe ///
|| _all: R.GrXGeXPr, variance
In the above syntax, -_all: R.Gene-, for example, tells -xtmixed- to include
indicators for the levels of variable Gene in the design matrix.
The variability within subjects (the lowest-level variability) is estimated by
the residual variance, reported by -xtmixed-.
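To make the role of -_all: R.Gene- more concrete, the indicator columns it stands for can be built by hand on a toy dataset. This is only an illustrative sketch: the toy data and the stub name g are hypothetical, and -xtmixed- constructs these columns internally, so none of this is needed to fit the model.

```stata
* Hypothetical toy data: 3 genes, 4 observations each
clear
set obs 12
generate Gene = ceil(_n/4)

* One 0/1 indicator per level of Gene: g1, g2, g3
tabulate Gene, generate(g)
list Gene g1 g2 g3, sepby(Gene)

* Each observation belongs to exactly one gene
assert g1 + g2 + g3 == 1
```

Each random-effects equation of the form -_all: R._varname_- contributes one such block of columns, with a single variance shared by all columns in the block.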
The advantage of the brute-force way is that it is straightforward once the
relevant random-effects terms are identified. However, for David's example,
the above syntax requires creating a matrix with 50+50*5+2*50+2*50*5 = 900
columns (50 is the number of genes, 5 is the number of probes, and 2 is the
number of groups). We can avoid this and obtain the same results more
efficiently with an alternative syntax of -xtmixed-.
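The column count for the brute-force approach can be checked on a skeleton dataset. The sketch below assumes a balanced layout of 2 groups, 50 genes, and exactly 5 probes per gene, with Probe numbered 1 to 5 within each gene; the real data have fewer than 5 probes for some genes, so this is an upper bound rather than the exact dimension.

```stata
* Hypothetical balanced layout: 2 groups x 50 genes x 5 probes per gene
clear
set obs 2
generate Group = _n
expand 50
bysort Group: generate Gene = _n
expand 5
bysort Group Gene: generate Probe = _n

egen GeXPr    = group(Gene Probe)
egen GrXGe    = group(Group Gene)
egen GrXGeXPr = group(Group Gene Probe)

* All four variables are coded 1..K here, so the maximum is the number of levels
local total = 0
foreach v of varlist Gene GeXPr GrXGe GrXGeXPr {
    quietly summarize `v'
    display "`v': " r(max) " levels"
    local total = `total' + r(max)
}
display "Total random-effects columns: `total'"
```

With this layout the total is 900, matching the count given above.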
First, consider the terms Gene, GrXGe, and GrXGeXPr. Since interaction terms
can be viewed as nested terms, we can obtain variance components for these
three terms more efficiently from a three-level model with genes defining the
first level, groups defining the second level, and probes defining the third
level. The corresponding syntax of -xtmixed- is
. xi: xtmixed depvar i.Group || Gene: || Group: || Probe:, variance
What we did not take into account in the above syntax is the fact that probes
are nested within genes. As described in example 7 of [XT] xtmixed and
section 4.4 of the cited article, the nesting can be accommodated by viewing
the levels of the nested effect (Probe) as random coefficients for the nesting
level (Gene) and specifying the exchangeable covariance structure for these
random coefficients. Taking this into account, the final syntax for -xtmixed-
is
. xi: xtmixed depvar i.Group || Gene: R.Probe, cov(exchangeable) ///
|| Group: || Probe:, variance
By using the above syntax, we also significantly reduced the column dimension
of the design matrix (from 900 in the brute-force method to 5+1+1=7).
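The count of 5 columns for the probe term appears to presume that probes are numbered 1 to 5 within each gene rather than carrying identifiers that are unique across genes. If the raw data use unique identifiers, they can be renumbered first. The sketch below is a minimal illustration under that assumption; the variable name ProbeID and the toy data are hypothetical.

```stata
* Hypothetical raw data: 3 genes, 2 probes each, probe IDs unique across genes
clear
set obs 6
generate Gene    = ceil(_n/2)
generate ProbeID = 100 + _n        // 101..106, never repeated across genes
expand 4                           // 4 samples measured on every probe

* Flag the first observation of each probe, then number probes within gene
bysort Gene ProbeID: generate byte first = (_n == 1)
by Gene: generate Probe = sum(first)
drop first

list Gene ProbeID Probe, sepby(Gene)
```

After this step Probe takes only the values 1 and 2 in the toy data, so a term such as R.Probe would contribute 2 columns instead of 6.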
Marchenko, Y. 2006. Estimating variance components in Stata. The Stata
Journal, 6(1): 1-22.