Statalist



Re: st: AW: gllamm (poisson) execution time


From   "Keith Dear (home)" <keith.dear@anu.edu.au>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: AW: gllamm (poisson) execution time
Date   Fri, 26 Jun 2009 10:49:53 +1000

Thanks to all who have pointed this out, including Roberto G. Gutierrez who was first, but off list. You are not wrong about the speed: 8 hours in gllamm, 4 minutes in xtpoisson!! (in MP4)

But it's disturbing how different the results can be. In this example (suggested by RGG), the variance estimates don't agree to even one figure on what I think are equivalent models, or aren't they?

webuse ships, clear
gen logserv=ln(service)
glo X op_75_79 co_65_69 co_70_74 co_75_79
xtset ship
xtpoisson accident $X, offset(logserv) normal              // takes 0.14 seconds on my pc
gllamm accident $X, fam(poisson) offset(logserv) i(ship)   // takes 5.36 seconds on my pc


~~~~~~~~~ xtpoisson results ~~~~~~~~~
------------------------------------------------------------------------------
    accident |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    op_75_79 |   .3830105    .118253     3.24   0.001     .1512389    .6147821
    co_65_69 |   .7093762    .149593     4.74   0.000     .4161794    1.002573
    co_70_74 |   .8576789   .1693625     5.06   0.000     .5257346    1.189623
    co_75_79 |   .4992132   .2317164     2.15   0.031     .0450574     .953369
       _cons |  -6.640989   .2067838   -32.12   0.000    -7.046278     -6.2357
     logserv |   (offset)
-------------+----------------------------------------------------------------
    /lnsig2u |  -2.352979   .8583287    -2.74   0.006    -4.035272   -.6706858
-------------+----------------------------------------------------------------
     sigma_u |   .3083593   .1323368                      .1329694    .7150928
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01) = 10.67 Pr>=chibar2 = 0.001


~~~~~~~~~ gllamm results ~~~~~~~~~
------------------------------------------------------------------------------
    accident |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    op_75_79 |   .3849786   .1182184     3.26   0.001     .1532747    .6166824
    co_65_69 |   .7058854   .1495483     4.72   0.000      .412776    .9989947
    co_70_74 |    .847284   .1692169     5.01   0.000     .5156249    1.178943
    co_75_79 |   .4940048   .2301141     2.15   0.032     .0429894    .9450201
       _cons |  -6.724426    .140161   -47.98   0.000    -6.999137   -6.449716
     logserv |   (offset)
------------------------------------------------------------------------------

Variances and covariances of random effects
-----------------------------------------------------------------------------
***level 2 (ship)
    var(1): .17662891 (.09378635)
-----------------------------------------------------------------------------

My (quite likely wrong) understanding of these results is that exp(-2.352979)= 0.095085 and .17662891 are estimates of the same variance parameter, which is a bit worrying. I take it the value (.09378635) is the SE of the variance estimate, and it's just a coincidence that it happens to be close to the xtpoisson variance estimate.
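
The back-transformation described above can be checked directly in Stata. This is a minimal sketch that uses only the numbers printed in the two outputs; nothing is re-estimated, and the only assumption is that /lnsig2u is the log of the random-intercept variance, as the sigma_u row suggests.

```stata
* xtpoisson reports ln(sigma_u^2) and sigma_u; gllamm reports the variance itself

* variance implied by /lnsig2u
display exp(-2.352979)

* the same variance, obtained by squaring sigma_u
display .3083593^2

* gllamm's var(1) expressed on the sigma_u (standard deviation) scale
display sqrt(.17662891)
```

The first two lines should agree with each other at about 0.0951; the third puts the gllamm estimate in the same units as sigma_u for a like-for-like comparison.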

Increasing the nip() parameter of gllamm from the default 8 to 19 changes the 0.1766.. value to 0.3529.., which suggests to me that the xtpoisson result is perhaps more reliable (it also doubles the execution time to 10.31 sec). Can someone more expert confirm and/or explain? We know that precise is not the same as accurate, so perhaps invariant is also not to the point.
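
One way to probe whether the disagreement comes from the numerical integration is sketched below. It is a suggestion, not a confirmed diagnosis: it refits gllamm with its adapt option (adaptive quadrature) and runs quadchk after xtpoisson to see how much that fit moves when the number of integration points changes. The choice of nip(12) is arbitrary, and quadchk is assumed to be available for this model in your Stata version.

```stata
* same setup as in the example above
webuse ships, clear
gen logserv = ln(service)
global X op_75_79 co_65_69 co_70_74 co_75_79
xtset ship

* fit the xtpoisson model, then refit with fewer and more
* integration points and compare the estimates
xtpoisson accident $X, offset(logserv) normal
quadchk

* same gllamm model, with adaptive instead of ordinary quadrature
gllamm accident $X, fam(poisson) offset(logserv) i(ship) adapt

* and again with more points than the default 8, to see whether var(1) has settled
gllamm accident $X, fam(poisson) offset(logserv) i(ship) adapt nip(12)
```

If var(1) stops changing between the two adaptive fits, it would be reasonable to compare that value with exp(/lnsig2u) from xtpoisson.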

Thanks
Keith



Jeph Herrin wrote:
If you have a single random effect, you may find -xtpoisson-
is even faster than -xtmepoisson-.

hth,
Jeph


Keith Dear (home) wrote:
Ummm ... no (well, NOW I have).
Except on the uni supercomputer, we only have Stata9, hence ignorance. Time to upgrade!
Many thanks Martin.
Keith

ps
http://www.stata.com/help.cgi?xtmepoisson
http://stata.com/stata10/mixedmodels.html



Martin Weiss wrote:
Have you looked into -xtmepoisson-?




HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Keith Dear
(work)
Sent: Wednesday, 24 June 2009 08:01
To: statalist@hsphsun2.harvard.edu
Cc: Ainslie Butler
Subject: st: gllamm (poisson) execution time

We are trying to model daily mortality by poisson regression, over 17 years, by postcode, with postcode as a single random intercept term. In Stata10/MP4 on a linux cluster our models each take 7 or 8 hours to fit, which is too long to be feasible for exploratory analyses.

The full dataset has >14 million rows of data: a row for every day for 1991-2007 for every postcode in Australia (~2200 postcodes), but to get things moving we are starting with smaller geographical regions of only 100 or 200 postcodes. Thus N=17*365*(100 or 200), about a half or one million. Also we are starting with fairly simple models, p=17 fixed-effect parameters just for trend and annual cycles. The models converge OK, eventually, in only a few iterations and with typical condition number about 2.

I found this in the list archives (from Sophia Rabe-Hesketh in 2003):
==> biggest gain is to reduce M, followed by n, p and N
Here we have M=1, n=5 (down from the default of 8), p=17, but N=6E5 or more. There does not seem to be much prospect of reducing any of those, indeed we will need to substantially increase p (for more interesting models) and N (to cover all of Australia at once).

Is there hope? Are there alternatives to gllamm for this? Or are we overlooking something basic here?
Keith


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


--
Dr Keith Dear
Senior Fellow
National Centre for Epidemiology and Population Health
ANU College of Medicine, Biology and Environment
Building 62, cnr Mills and Eggleston Roads
Australian National University
Canberra ACT 0200 Australia
T: 02 6125 4865
F: 02 6125 0740
M: 0424 450 396
W: nceph.anu.edu.au/Staff_Students/staff_pages/dear.php

CRICOS provider #00120C
http://canberragliding.org/



