Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.

# Re: st: "pooled" xtmepoisson with unconstrained error variance

From: Steve Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: "pooled" xtmepoisson with unconstrained error variance
Date: Sun, 25 Nov 2012 17:35:25 -0600

```
Bobby Gutierrez showed how to model unequal variances with -xtmixed- in
www.stata.com/meeting/fnasug08/gutierrez.pdf. I don't have time to try
this, but at first glance it looks like his trick for doing this (pp
14-16) will work with -xtmelogit-.

Steve

On Nov 25, 2012, at 4:59 PM, Rebecca Pope wrote:

Hello,
I need to estimate a Poisson model for two groups with unequal
variance where the data comes from observations on patients over
time nested within clinics (i.e. level 1 is time (measurement
occasion), level 2 is the patient, and level 3 is the clinic). I am
using Stata 12.1. I think -xtmepoisson- is a natural choice for the
analysis, except that for the time being I'm stuck estimating separate
equations for each group.

In the interest of fixing terms, by "pooled" I mean that I've taken a
separate equation for each group and written them as one "master"
equation.

Bill Gould discusses something similar to my problem in the linear
regression context at:
http://www.stata.com/support/faqs/statistics/pooling-data-and-chow-tests/.
aweights and -xtgls- are discussed, but neither is applicable in this
context so I'm turning to the Statalist for assistance.

Put as concisely as I can, my questions are:
1. Is unequal variance between groups as much of a problem in Poisson
models as in linear regression? (Clearly I think "yes" or I wouldn't
be posting, but I'd like to verify with more expert folks than me).
2a. Can I control for this in a "pooled" multilevel Poisson model (in Stata)?
2b. How do I control for unequal variance in a pooled multilevel
Poisson model in Stata?

Here is an example that resembles my problem. Assume for the sake of
argument that a group*age interaction is somehow meaningful and
interesting in this context.

*** begin example ***
use  http://www.stata-press.com/data/r12/epilepsy
/* create artificial groups, 1 for odd ID number, 0 for even */
gen foo = ceil((subject/2)-int(subject/2))
/* demonstrate baseline differences in variances by group */
by subject, sort: gen first=_n==1
sdtest seizures if first, by(foo)  /* significant at alpha=0.10, in
actual data, p < 0.001 */
/* -xtmepoisson- model from manual for each group (1) */
by foo, sort : xtmepoisson seizures treat lage lbas lbas_trt v4, || subject:
/* -xtmepoisson- with interactions for covariate of interest (2) */
xtmepoisson seizures treat c.lage##i.foo lbas lbas_trt v4, || subject:
/* -xtmepoisson- fully interacted (3) (will switch to Laplace here
by default) */
gen cons0=foo==0
xtmepoisson seizures cons0 i.foo##i.treat c.lage##i.foo c.lbas##i.foo ///
    c.lbas_trt##i.foo c.v4##i.foo, nocons || subject: R.foo

*** end example ***

(3) seems to me to be clearly preferred to (2) because it recovers all
FEs from (1) though the estimates are not exact. I tried Laplace in
both and it didn't make a difference, which from the manual should
have been expected. Am I on the right track with this progression? How
do I accommodate the fact that the variance in number of seizures
differs by "foo"?

In case the following is relevant to anyone's recommendations:
- The example above only has 59 patients; I have several thousand.
- I do not have an equal number of patients in each group; there is
about a 3:1 ratio of 0s to 1s for my comorbidity indicator.
- The data is observational. It comes from medical records review.
- There are about 30 coefficients to be estimated before any interactions/REs.
- There is no randomly assigned treatment, just a set of 3 covariates
whose effects I want to test for a joint difference between the two
groups.
- The example data doesn't have a natural level 3 variable, but I have
a random intercept for the clinic also.

Related econometric references are welcomed just as much as Stata tips.
I have searched with the terms "pooling Poisson multilevel mixed effects"
and various combinations thereof and haven't found anything that
addresses the use of pooled data in a Poisson regression, let alone the
issue of unequal variances.

* I'm not sure if the use of R.foo is correct for the RE in model (3).
It is my best guess for now & I intend to do more reading on that
later.

Thanks,
Rebecca
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
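One possible way to let the subject-level variance differ by group in the pooled model is sketched below. It is not taken from the thread or from the Gutierrez slides. It assumes that replacing `R.foo` with two group indicators as random coefficients (no random constant, independent covariance) is an acceptable parameterization. The indicator names `foo0` and `foo1` are hypothetical, and `foo` is rebuilt with `mod()` rather than the `ceil()` expression in the post. The final `testparm` line illustrates a joint Wald test on an arbitrary subset of the group interactions and is likewise only an example.

```stata
* Sketch only: foo0 and foo1 are hypothetical helper variables
use http://www.stata-press.com/data/r12/epilepsy, clear

* artificial groups as in the post: 1 for odd subject ID, 0 for even
gen byte foo  = mod(subject, 2)
gen byte foo0 = foo == 0
gen byte foo1 = foo == 1

* Each subject belongs to exactly one group, so the two random
* coefficients act as group-specific random intercepts:
* var(foo0) applies to subjects with foo==0, var(foo1) to foo==1
xtmepoisson seizures i.foo##i.treat c.lage##i.foo c.lbas##i.foo ///
    c.lbas_trt##i.foo c.v4##i.foo                               ///
    || subject: foo0 foo1, noconstant covariance(independent)

* Joint Wald test that selected covariate effects differ by group
testparm i.foo#c.lage i.foo#c.lbas i.foo#c.v4
```

If the two estimated variance components come out similar, the simpler `|| subject:` random intercept is probably adequate. With real data that also has a clinic level, the same equation would sit below a `|| clinic:` equation, which is untested here.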