
st: xtmixed versus spss mixed; random intercept only model

From: "Ploutz-Snyder, Robert (JSC-SK)[USRA]"
To: "statalist@hsphsun2.harvard.edu"
Subject: st: xtmixed versus spss mixed; random intercept only model
Date: Thu, 23 Sep 2010 12:49:13 -0500

I'm trying to understand why Stata and SPSS give slightly (or not so slightly) different results when analyzing the same data with (I believe) the same model.  I'm trying to tie out the results from the two packages, beginning with a very SIMPLE model and then working up to what I'd really like to do.  Since I've already run into differences on a very simple model, I thought I'd ping the list to see whether I'm going down the wrong path.

Machine: 64 bit Windows XP 2003 (service pack 2)
Stata: 11.1, completely up to date
SPSS: 19 (latest version, and the first version with the IBM logo)

Model:  very simple xtmixed model with continuous outcome (y).  All subjects participate in each of two different treatments (trmt), and y is recorded at several different time points during treatment--but for now let's assume time is continuous/linear (i.e. treat time effect as linear on y).

For starters, let us also consider only the random intercept model:

xtmixed y c.time##i.trmt||subj:, cov(id)

Performing EM optimization:

Iteration 0:   log restricted-likelihood =   267.6765
Iteration 1:   log restricted-likelihood =   267.6765

Computing standard errors:

Mixed-effects REML regression                   Number of obs      =       242
Group variable: subject                         Number of groups   =        11

                                                Obs per group: min =        22
                                                               avg =      22.0
                                                               max =        22

                                                Wald chi2(3)       =    315.17
Log restricted-likelihood =   267.6765          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        time |   -.000838    .000067   -12.50   0.000    -.0009694   -.0007066
      1.trmt |  -.0700826   .0168249    -4.17   0.000    -.1030587   -.0371065
             |
 trmt#c.time |
          1  |   .0000997   .0000948     1.05   0.293    -.0000861    .0002855
             |
       _cons |   1.344298   .0156008    86.17   0.000     1.313721    1.374874
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
subject: Identity            |
                   sd(_cons) |   .0334711   .0089756      .0197884    .0566147
-----------------------------+------------------------------------------------
                sd(Residual) |   .0699512   .0032758      .0638166    .0766754
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =    27.71 Prob >= chibar2 = 0.0000

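Given only a random intercept, the above Stata results do not change when I switch the cov() option, which makes sense: with a single random effect the covariance matrix is 1x1, so every structure reduces to one variance.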

In SPSS, with what I believe to be the same model, my results are as follows (FYI, lots of defaults are left in the command syntax; they are not needed to run the model):

MIXED y BY trmt WITH time
/FIXED=trmt time trmt*time | SSTYPE(3)
/METHOD=REML
/PRINT=CPS G  SOLUTION TESTCOV
/RANDOM=INTERCEPT | SUBJECT(subject) COVTYPE(id).

The SPSS code above shows that I'm treating trmt as a "Factor" and time as a "Covariate," which is the SPSS equivalent of Stata's i.trmt and c.time.  Note that I am also assuming an identity covariance structure in the random statement, as I did with Stata.  Both models use REML.  Here are parts of the SPSS results, minus the pretty formatting and some output that cannot be compared directly with the Stata results above:

-2 Restricted Log Likelihood	-535.353

Estimates of Fixed Effects(b)
                                                                      95% Confidence Interval
Parameter          Estimate   Std. Error        df         t   Sig.   Lower Bound   Upper Bound
Intercept          1.274215      .015601    38.104    81.676   .000      1.242636      1.305794
[trmt=0]            .070083      .016825   228.000     4.165   .000       .036931       .103235
[trmt=1]                 0a            0         .         .      .             .             .
time               -.000738      .000067   228.000   -11.014   .000      -.000870      -.000606
[trmt=0] * time    -.000100      .000095   228.000    -1.052   .294      -.000287       .000087
[trmt=1] * time          0a            0         .         .      .             .             .

a. This parameter is set to zero because it is redundant.
b. Dependent Variable: y.

Estimates of Covariance Parameters(a)
                                                                                  95% Confidence Interval
Parameter                                 Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                                   .004893      .000458   10.677   .000       .004073       .005879
Intercept [subject = subject] Variance     .001120      .000601    1.865   .062       .000392       .003205

a. Dependent Variable: y.
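
The covariance parameters are not printed on the same scale in the two outputs: Stata reports standard deviations and the log restricted-likelihood, while SPSS reports variances and -2 times the restricted log likelihood. As a minimal sketch (Python is used here only as a calculator, and the variable names and the agrees() helper are my own, not from either package), squaring the Stata values puts them on the SPSS scale:

```python
# Values copied from the Stata output above.
stata_sd_cons = 0.0334711       # sd(_cons)
stata_sd_resid = 0.0699512      # sd(Residual)
stata_log_reml = 267.6765       # log restricted-likelihood

# Values copied from the SPSS output above.
spss_var_intercept = 0.001120   # Intercept [subject = subject] Variance
spss_var_resid = 0.004893       # Residual
spss_minus2_reml = -535.353     # -2 Restricted Log Likelihood

# Convert the Stata numbers to the SPSS scale.
var_intercept_from_stata = stata_sd_cons ** 2
var_resid_from_stata = stata_sd_resid ** 2
minus2_reml_from_stata = -2 * stata_log_reml


def agrees(a, b, tol):
    """Hypothetical helper: True when a and b differ by at most tol."""
    return abs(a - b) <= tol


# Tolerances are half a unit in the last digit SPSS prints.
print("intercept variance:", var_intercept_from_stata, "vs", spss_var_intercept,
      agrees(var_intercept_from_stata, spss_var_intercept, 5e-7))
print("residual variance: ", var_resid_from_stata, "vs", spss_var_resid,
      agrees(var_resid_from_stata, spss_var_resid, 5e-7))
print("-2 restricted LL:  ", minus2_reml_from_stata, "vs", spss_minus2_reml,
      agrees(minus2_reml_from_stata, spss_minus2_reml, 5e-4))
```

On these numbers the variance components and the restricted likelihood match to within display rounding, so any remaining differences are confined to the fixed-effects table.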

Comparing the Stata and SPSS results, the differences include:

Estimates for trmt effects are the same:	-.0700826   versus 	.070083 (Stata, SPSS)
(they differ only by the sign--clearly due to how each treats 1/0 indicators...)

The estimates for the Y-intercept: 		1.344298	versus	1.274215
The estimate for the time effect:		-.000838    versus	-.000738
The treatment by time interaction coeff: 	.0000997	versus	-.000100

If we consider only the tests of significance, both packages obviously tell the same story. But I'm curious why we get these differences, particularly for the time effect, where the first non-zero digits differ and the SEs of those estimates are similarly very small numbers.  The Y-intercept differences may be non-trivial?
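
One thing that can be checked by hand: the SPSS table sets [trmt=1] to zero, so trmt = 1 is its reference level, while Stata's 1.trmt coefficient implies that trmt = 0 is the reference there. Assuming that is the only difference in parameterization (an assumption on my part, not something either output states), the sketch below rebuilds the intercept and time slope within each treatment from both sets of estimates. Python is used purely as a calculator, and the dictionary and variable names are my own.

```python
# Estimates copied from the two outputs above.
stata = {
    "_cons": 1.344298,
    "time": -0.000838,
    "1.trmt": -0.0700826,
    "1.trmt#c.time": 0.0000997,
}
spss = {
    "Intercept": 1.274215,
    "time": -0.000738,
    "[trmt=0]": 0.070083,
    "[trmt=0] * time": -0.000100,
}

# ASSUMPTION: Stata's reference level is trmt = 0 and SPSS's is trmt = 1.
# Each entry maps a treatment level to (intercept, time slope) for that level.
stata_lines = {
    0: (stata["_cons"], stata["time"]),
    1: (stata["_cons"] + stata["1.trmt"],
        stata["time"] + stata["1.trmt#c.time"]),
}
spss_lines = {
    0: (spss["Intercept"] + spss["[trmt=0]"],
        spss["time"] + spss["[trmt=0] * time"]),
    1: (spss["Intercept"], spss["time"]),
}

# Largest gap between the packages across both levels and both quantities.
max_gap = max(
    abs(s - p)
    for level in (0, 1)
    for s, p in zip(stata_lines[level], spss_lines[level])
)

for level in (0, 1):
    (s_int, s_slope), (p_int, p_slope) = stata_lines[level], spss_lines[level]
    print(f"trmt={level}: intercept {s_int:.6f} vs {p_int:.6f}, "
          f"slope {s_slope:.7f} vs {p_slope:.7f}")
print(f"largest absolute difference: {max_gap:.1e}")
```

If the per-treatment intercepts and slopes agree to within display rounding, the two fits would be the same model expressed against different reference levels. Refitting in Stata with trmt = 1 as the base level (for example with ib1.trmt in factor-variable notation) would be one way to confirm that.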

Any thoughts/ideas?  Have I misspecified the model in either SPSS or Stata?

Rob
