Jonathan Bartlett <Jonathan.Bartlett@lshtm.ac.uk> asks about fitting the
following model:
> y(i12) = u(i2) - u(i1) + e(i12)
> y(i23) = u(i3) - u(i2) + e(i23)
where each subject (indexed by variable -id-) has two observations, modeled
as shown above.
He attempts to fit the model with
> xtmixed y || id: u1 u2 u3, cov(identity) noconstant
with appropriately formed dummy variables u1, u2, and u3. In doing so, he
runs into problems:
> but Stata then drops one of u1, u2, u3 because it says they are collinear
> (which is true). But the u(ij) terms in the model are not collinear (I don't
> think), which is why the model can be fitted in SAS.
Jonathan's model can be reparameterized into a form not requiring an
over-identified set of indicators. To see this, note that
y(i12) = u(i2) - u(i1) + e(i12) = u(i2) + v(i12)
y(i23) = -u(i2) + u(i3) + e(i23) = -u(i2) + v(i23)
where
v(i12) = -u(i1) + e(i12)
v(i23) = u(i3) + e(i23)
and so this model (or at least an equivalent model under an alternate
parameterization, yet producing the same log-likelihood) can be fitted with
xtmixed y || id: u2, nocons
where -u2- is as Jonathan has defined it.
We generated some data from Jonathan's model and tried the above solution,
producing
. xtmixed y || id: d2, nocons
Performing EM optimization:
Performing gradient-based optimization:
[...]
Computing standard errors:
Mixed-effects REML regression Number of obs = 2000
Group variable: id Number of groups = 1000
Obs per group: min = 2
avg = 2.0
max = 2
Wald chi2(0) = .
Log restricted-likelihood = -4659.3386 Prob > chi2 = .
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_cons | 1.0501 .0492242 21.33 0.000 .953622 1.146577
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity |
sd(d2) | 1.227254 .0841195 1.072978 1.403713
-----------------------------+------------------------------------------------
sd(Residual) | 2.201373 .0492488 2.106932 2.300046
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 57.83 Prob >= chibar2 = 0.0000
The variance component -sd(d2)- is the common random effect standard deviation
for the u's that Jonathan seeks. However, the -sd(Residual)- term is for the
v terms I defined, and not the e's from Jonathan's original model. Since
Var(v) = Var(u) + Var(e)
implies that
SD(e) = sqrt{Var(v) - Var(u)}
we can use -nlcom- to get this, namely
. nlcom sd_resid: sqrt(exp(2*[lnsig_e]_cons) - exp(2*[lns1_1_1]_cons))
sd_resid: sqrt(exp(2*[lnsig_e]_cons) - exp(2*[lns1_1_1]_cons))
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sd_resid | 1.827536 .1011408 18.07 0.000 1.629304 2.025769
------------------------------------------------------------------------------
Thus, Jonathan can obtain both variance components he requires.
That stated, the above also shows that the above model is identified, and
so it would be nice if -xtmixed- did not drop the collinear u-variable but
instead just allowed the estimation to go through. As a result, we plan to
add a -collinear- option to -xtmixed- that will allow one to bypass the
removal of collinear variables within random-effects equations.
--Bobby --Vince
rgutierrez@stata.com vwiggins@stata.com
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/