# RE: st: autocorrelation in Poisson regression, follow-up

From: "Kieran McCaul" | Subject: RE: st: autocorrelation in Poisson regression, follow-up | Date: Wed, 6 Aug 2008 10:49:52 +0800

```
Stas has suggested a couple of models that might indicate
autocorrelation but, as he has commented, it's difficult to see how they
would occur.

Let's imagine that someone is modelling disease counts in a population
over time and they have arrived at a Poisson model that incorporates a
lagged variable - the count or the rate in the previous year.

This would indicate two things to me:
1. Obviously, the cohort is carrying with it some information that is
dependent on the rate in the previous year and is predictive of the rate
in the subsequent year.
2. The Poisson model is not appropriate.

Suppose in this population, only a small proportion are at risk of
getting the disease and the remainder are not at risk but these cannot
be identified.  If a high disease count in one year removed a
significant proportion of those susceptible from the cohort then, in the
next year, a lower count and hence a lower rate would be observed.

This would look like autocorrelation, but it's not.  It is caused by
extreme heterogeneity in risk within the population being followed.  The
Poisson model is wrong.  The Poisson model assumes that everyone is at
risk and that all the person-time accumulated by the cohort as it moves
through time is at-risk person-time.

In the absence of unexplained heterogeneity of risk, if you are
following a cohort over time and you split the total person-time at risk
into the person-time observed in the first year, in the second year,
etc, then these are all independent samples of the person-time at risk.
If disease counts are being generated by a Poisson process, then the
counts in each year will depend only on the characteristics of the
person-time accrued in each year.  They are all independent; there is no
autocorrelation.

So if the process you are modelling is truly Poisson, you shouldn't have
autocorrelation.

One thing that does worry me, however, is that in your follow-up email
you said you were modelling a count that was "the number of groups
founded".  What exactly does that mean?

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas
Kolenikov
Sent: Wednesday, 6 August 2008 8:19 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: autocorrelation in Poisson regression, follow-up

Ah, that was a teaser. The three ways I was thinking of were:

1. correlation of the Poisson counts themselves: poisson y x L.y -- I
cannot think much of any actual process that could generate that sort
of dependence

2. correlation of the Poisson rates: y|x is Poisson with rate lambda =
function of x and L.lambda. That is sort of weird, too, as it assumes
the Poisson rate jumps around exactly at midnight. Or on the 1st of
the month, or on the 1st of the year. Again, it should be a pretty
strange underlying process to have that sort of dependence.

3. Some sort of dependence in deviance residuals -- but those are
derived quantities rather than fundamentals, unlike the standard
Gaussian ARMA processes. You can sorta proceed in a two step way -- to
run the base regression -glm y x, family(poisson) link(log)- get the
residuals out -predict dev, dev-, and run -poisson y x L.dev- to see
if the coefficient of the latter is zero or not. But again that's a
stupid model.

In general, I hate when referees throw something like that without any
indication of how they want you to proceed. My referee reports are
always a month late, but they have a page of references for two pages
of comments. Can you get along by saying something like "There is no
established method for checking autocorrelations in Poisson
regression, since Poisson processes are intrinsically continuous, and
any discretization is arbitrary -- hence testing for autocorrelations
would require making arbitrary decisions about the time scales at
which the correlations will be present. I will however very much
appreciate it if the reviewer could provide a reference if there is
anything easily available."

On Tue, Aug 5, 2008 at 10:14 AM, Antonio Silva <asilva100@live.com>
wrote:
>
> Stas (and others):
>
>
> Thanks for responding to my question. To be honest, I am not sure how
> to define autocorrelation in this context. I sent out an article for
> review. The dependent variable is a yearly count variable--number of
> groups founded. The range is 0-5. I used a simple Poisson regression
> model with several independent variables. One of the reviewers said the
> analysis was flawed because I "did not test for autocorrelation on the
> dependent variable." Unfortunately, the reviewer did not give me any
> clue as to how to proceed. So I am kind of at a loss here. Essentially,
> I think the reviewer wanted to make sure the yearly counts were not
> related to each other in any meaningful way.
>
>
> Any further thoughts are appreciated, and I wish I could tell you more
>

You need a better reviewer :))

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
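The susceptible-depletion argument in the top message can be checked with a small simulation. What follows is only a sketch in Python, not anything taken from the thread: the function names, the attack probability of 0.8, the seed, and above all the constant yearly inflow of new susceptibles are assumptions made for illustration (the inflow is there only to keep the series stationary, so that a downward trend is not mistaken for autocorrelation). It compares the lag-1 correlation of yearly counts when cases are removed from a small hidden susceptible pool against counts drawn independently from a Poisson distribution with the same mean.

```python
import math
import random


def lag1_corr(y):
    """Pearson correlation between y[t] and y[t+1]."""
    a, b = y[:-1], y[1:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def poisson_draw(rng, lam):
    """One Poisson(lam) draw by Knuth's multiplication method (moderate lam only)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_depletion(rng, years, attack_prob, inflow):
    """Yearly counts when only a hidden susceptible pool is at risk.

    Assumption for illustration: every case leaves the pool and `inflow`
    new susceptibles join each year.
    """
    susceptible = round(inflow / attack_prob)  # start near the steady state
    counts = []
    for _ in range(years):
        # Each susceptible independently becomes a case with attack_prob.
        cases = sum(rng.random() < attack_prob for _ in range(susceptible))
        counts.append(cases)
        susceptible += inflow - cases
    return counts


def simulate_poisson(rng, years, rate):
    """Independent yearly counts from a constant-rate Poisson process."""
    return [poisson_draw(rng, rate) for _ in range(years)]


rng = random.Random(2008)  # arbitrary seed
depleted = simulate_depletion(rng, years=20000, attack_prob=0.8, inflow=20)
homogeneous = simulate_poisson(rng, years=20000, rate=20.0)

r_depleted = lag1_corr(depleted)
r_homogeneous = lag1_corr(homogeneous)
print(f"lag-1 correlation, depleting susceptibles: {r_depleted:.3f}")
print(f"lag-1 correlation, homogeneous Poisson:    {r_homogeneous:.3f}")
```

Under these assumptions the first correlation should come out clearly negative (a high count this year leaves fewer susceptibles for next year), while the second should sit near zero. The negative dependence comes entirely from the heterogeneity in risk, which is the point made above: it looks like autocorrelation, but the real problem is that the Poisson model is wrong for that population.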