Re: st: Stata (gllamm) benchmarks for different platforms?

From   Buzz Burhans
Subject   Re: st: Stata (gllamm) benchmarks for different platforms?
Date   Sat, 24 Apr 2004 19:26:51 -0400


Buzz Burhans or Fred Wolfe:
Would you be willing to post the small dataset and the Stata code
(perhaps just Program 1) that you used in these experiments so that
other Stata users could compare the speed of their Stata setups with the
above data?  This might provide some helpful information to users
contemplating hardware upgrades.

John Hennen

Dear John,
I will send you a copy of the .do file and the data we used. However, I have some reservations about posting them for general use, for the following reasons:

1. The dataset Fred and I used is a small extract from a similar dataset I worked on. Much of my work involves such small datasets of repeated-measures observations generated from animal science / nutrition experiments; these datasets are far smaller than what most Statalisters work with. The model was similar to some models I had previously run, but the extract was not a complete dataset, and the model was not an exact replica of any model I actually ran. It was intended as an informal comparison for Fred and me. I did not spend a great deal of time considering whether this data or the model was really an optimum basis for comparison; nonetheless, because it is a reasonable mimic of other work I had done, we went ahead and used it, again, as an informal comparison.

If one were to establish a dataset for benchmarking, more thought should probably be given to what would make a really appropriate dataset and sample model(s), what other information the log of such an exercise should capture, and what one should report back from it. I offer this to you without having spent a lot of time on these considerations. I found similar informal comparisons useful when I was considering whether and how to invest in hardware so that I could continue working with this data in Stata using -gllamm-. I very much appreciated the help I received (similar "benchmark" comparisons) in making those decisions, and I think having a way to make such comparisons is indeed very useful for any of us when we are examining our own needs. However, if such a "comparison" dataset and/or .do file were to be used as more than just an "informal" comparison, more effort should probably be made to ensure that the data and the .do file are as appropriate and useful as possible. I think this is a very worthwhile thing to have available, and perhaps Stata Corp or someone else can suggest a protocol and/or a repository for such datasets, exercises, and resulting information. Nonetheless, I offer the dataset and .do file Fred and I used, with the above reservations. They are obviously most useful for observing relative comparisons, so please let us know what you find out.

2. I indeed found the comparative information useful, in fact necessary, for making my last hardware decision. I'd like to reciprocate for the assistance and information I received and offer whatever similar help I can to others. I am concerned, however, that the resulting information not be misused as a basis or fuel for bashing, trashing, or uncivil attacks on any hardware or software providers. I have a minor concern that, in our passion and enthusiasm, or sometimes frustration, over our own preferences for hardware, software, perceived needs, etc., we Statalisters are sometimes a bit too vehement in our assertions and declarations about how things should be. On the condition that it will not be so misused, and subject to my reservations above about its appropriateness, I am happy to make it available. Given Statalist's policy of not permitting attachments, I will send both the .do file and the dataset to you privately in a separate email.

By the way, I am travelling for the first half of this week, and will be unavailable from very early Monday AM until the end of the week should you have questions about the data or the .do file.

Buzz Burhans

Following this exchange, we both ran a small dataset and the same .do
file, which made two almost identical calls to -gllamm-. The exact same
model is run in both cases; the single difference is that Program 1 has
two level-2 random effects, while in Program 2 an additional random
effect was added, so there are three random effects.
And then later:

These were our results:
1) Both machines produced exactly the same numeric results for the
GLLAMM programs.
2) The Linux machine was faster.

Program 1 (run time in minutes)
Linux                   Windows XP                  % Faster
16.75                   25.51                       29.0%
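
The post does not say how the "% Faster" column was derived, so that figure may rest on a different convention or on timings not shown here. As a rough sketch for anyone repeating the comparison on their own setup, the Python snippet below applies two common definitions to the Program 1 run times reported above. The function names are hypothetical and chosen only for illustration; they are not taken from the .do file.

```python
def time_reduction_pct(fast_minutes, slow_minutes):
    """Percent less wall-clock time the faster machine needed,
    relative to the slower machine's run time."""
    return (slow_minutes - fast_minutes) / slow_minutes * 100.0


def speedup_pct(fast_minutes, slow_minutes):
    """Percent more work per unit time done by the faster machine,
    relative to the slower machine's rate."""
    return (slow_minutes / fast_minutes - 1.0) * 100.0


if __name__ == "__main__":
    # Program 1 run times in minutes, as reported in the post.
    linux = 16.75
    windows_xp = 25.51

    print(f"Time reduction: {time_reduction_pct(linux, windows_xp):.1f}%")
    print(f"Speedup:        {speedup_pct(linux, windows_xp):.1f}%")
```

When reporting results from your own machine, it helps to state which of these definitions you used, since the two can differ considerably for the same pair of timings.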

Buzz Burhans

