Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: Re: Mysterious Output Given by GLLAMM for Multiple-Equation Generalized Linear Model


From   "Joseph Coveney" <jcoveney@bigplanet.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Mysterious Output Given by GLLAMM for Multiple-Equation Generalized Linear Model
Date   Sun, 15 Jul 2012 16:30:20 +0900

Weihao Yin wrote:

I am learning to use GLLAMM to estimate a multiple-equation Generalized
Linear Model, which jointly models binary and count data. The  dataset I use
is about the condition and length of hospital stay for 32 herniorrhaphy
patients. The dataset is freely available online. The response variables are
[leave] and [los], which denote the condition of the patient upon leaving
the operating room and the length of hospital stay after the operation. One
of the two covariate is [OKstatus] which distinguishes patients based on
their post-operative physical status with "1" indicating better status. The
other is patient's age. 

The model consists of two equations: one is logit model for [leave] and the
other is a Poisson model for [los]. The correlation between the two is
captured using a common random effect. Mathematically, it is written as:

g1([leave]) = b01+b11*[age]+b12*[OKstatus]+u+epsilon1
g2([los])   = b02+b12*[age]+b22*[OKstatus]+u+epsilon2

where u is the random effect and epsilon1&2 are errors. This example is also
used in the SAS GLIMMIX procedure documentation (pp. 203-209). I just try to
reproduce the results using GLLAMM. Since the random effect is used as an
intercept, I use a constraint to make the factor loadings for both equations
on the random effect equal to 1. The GLLAMM gives the following output, in
which [d1] is the dummy variable for logit and [d2] is Poisson. 

[output redacted]

The output generally is the same as given by the SAS GLIMMIX. However, when
I rerun the model using "allc" option to list all the estimated parameters,
here is what I got. 

[output redacted]

The last parameter [pat1_1]d1 = 0.5281017. Does anyone know what this
parameter is? The first loading "d1" is supposed to be 1 and it is according
to the previous output. By the name of it, it seems to be the factor
loading. Does GLLAMM estimate the variance of the two epsilons? If it does
not, how can I calculate the correlation between the two responses, which is
the whole point of this?

I know it is a long post but I really need some help. Any input is greatly
appreciated. Thanks!

--------------------------------------------------------------------------------

I'll try to answer your three questions in turn:

1. [pat1_1]d1 is the variance of the random effect (latent factor) in the form
in which it is parameterized during the fitting procedure.  For numerical
stability reasons and to assure that it is at least positive semidefinite,
-gllamm- parameterizes the level-2 variance matrix as a Cholesky decomposition.
Because you have only a single variance component, this reduces to a scalar
instead of a matrix, and [pat1_1]d1 is therefore the square root of the variance
of the random effect.  (See the do-file below for illustration.)

2. I believe that -gllamm- doesn't estimate variances for any epsilons inside
the linear predictor equation for one-parameter distributions, such as binomial
and Poisson.  It won't estimate separate variances of level-1 epsilons in linear
models, either, unless you use the -s()- option.

3. You don't really compute the intraclass correlation coefficient when you have
a Poisson response variable.  You can point to the nonzero variance of the
random effect as evidence that the binary and count response variables are
substantially correlated.

As an aside, based upon what I infer your command line must have been (see first
-gllamm- syntax in the do-file below), you could have specified the same model a
bit more simply.  With this model, you can omit the factor-loading equation and
the subsequently necessitated constraint on the second loading.  (See the second
syntax in the do-file below.)

Joseph Coveney

drop _all
input byte patient byte age str1 gender ///
    byte OKstatus byte leave byte los

[dataset omitted for brevity--see
http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm
#statug_glimmix_sect017.htm
for its listing]

end

rename leave response1
rename los response2
quietly reshape long response, i(patient) j(dist)

quietly tabulate dist, generate(d)

generate byte d1Xage = d1 * age
generate byte d2Xage = d2 * age
generate byte d1XOKstatus = d1 * OKstatus
generate byte d2XOKstatus = d2 * OKstatus

*
* What I assume your model is
*
eq manifest: d1 d2
constraint define 1 [pat1_1l]d2 = 1

gllamm response d1 d2 d1Xage d2Xage ///
    d1XOKstatus d2XOKstatus, ///
    noconstant i(patient) ///
    eqs(manifest) constraints(1) ///
    family(binomial poisson) fv(dist) ///
    link(logit log) lv(dist) /// <- optional
    adapt allc nolog

// What [pat1_1]d1 is . . .
display in smcl as text _b[pat1_1:d1]^2
// cf. "level2 (patient) var(1): .27889147"

*
* Same model, slightly simpler syntax
*
gllamm response d1 d2 d1Xage d2Xage ///
    d1XOKstatus d2XOKstatus, ///
    noconstant i(patient) ///
    family(binomial poisson) fv(dist) ///
    link(logit log) lv(dist) /// <- optional
    adapt allc nolog

exit


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index