Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

st: GLM with spatial correlation among error terms?

From   Clyde B Schechter <>
To   "" <>
Subject   st: GLM with spatial correlation among error terms?
Date   Tue, 10 Jan 2012 14:47:09 +0000

A colleague and I are trying to develop a research project in which we will collect information on certain health outcomes in adolescents. We expect these outcomes to be influenced by personal attributes, by attributes of the school they attend, and by attributes of the neighborhood in which they live. Some of the outcomes are dichotomous, some are counts, and some are "continuous" variables. We would like to estimate the effects of several of the individual, school, and neighborhood attributes on these outcomes, and we are thinking of a multi-level GLM with individuals nested in school X neighborhood.

But, for a number of reasons, we don't want to define "neighborhood" to be a postal zone, census tract, or other administrative area. Rather, for our purposes, the most salient definition of neighborhood is the geographic area within walking distance of the person's home (we have a working definition of walking distance; I don't think the details are relevant here). There are a number of attributes of the "neighborhood" that we can measure, but they will not comprehensively explain all "neighborhood" effects, so we want to include a random effect at the neighborhood level.

The problem is that, except for the relatively infrequent circumstance of two people who live at the same address, the "neighborhood" effect is completely confounded with the individual-level error term, so the model will not be identifiable. Fair enough, and so the individual-level error term will have to serve for both. But my real concern is this: people who live near each other will have overlapping neighborhoods, often extensively overlapping. From this, I infer that these individual-level error terms cannot reasonably be considered independent. Rather, there will be correlation among the error terms that is some decreasing (or at least non-increasing) function of the distance between the corresponding homes.
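
To make the kind of error structure described here concrete, the following is a minimal sketch in Python (used only for illustration; it is not a Stata solution and not an estimation method). It assumes an exponential decay, corr(e_i, e_j) = exp(-d_ij / range), which is just one of many non-increasing functions of distance. The function name, the `range_param` parameter, and the home coordinates are all hypothetical.

```python
import math


def exponential_correlation(coords, range_param):
    """Return the matrix of assumed error correlations exp(-d_ij / range_param).

    coords      : list of (x, y) home locations, in any consistent distance unit
    range_param : positive decay scale (assumption); larger values mean the
                  correlation falls off more slowly with distance
    """
    if range_param <= 0:
        raise ValueError("range_param must be positive")
    n = len(coords)
    return [
        [math.exp(-math.dist(coords[i], coords[j]) / range_param) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical homes: the first two are close together, the third is far away.
homes = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
corr = exponential_correlation(homes, range_param=1.0)

# Print the correlation matrix, one row per home.
for row in corr:
    print(" ".join(f"{value:.3f}" for value in row))
```

Under this assumed form, the two nearby homes have errors that are almost perfectly correlated, while the distant home is nearly independent of both. Two people at the same address would have a correlation of exactly 1, which corresponds to the confounding case mentioned above.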

Can anyone point me to how one might estimate such a model?  Stata solutions are preferred, of course, but I'm willing to use other platforms to solve the problem.

Thanks in advance for any advice.

Clyde Schechter
Department of Family & Social Medicine
Albert Einstein College of Medicine
Bronx, NY, USA
