Geographically weighted regression in Stata
Speaker: Mark S. Pearce, University of Newcastle upon Tyne
Geographically weighted regression is a method for exploring spatial
nonstationarity, a condition in which a simple "global" regression model
cannot adequately explain the relationships between a set of variables over
a geographical area. Instead, the nature of the model should alter over
space to reflect the structure within the data. For example, does the risk
of disease in relation to a risk factor remain constant across a
geographical area, or is the relationship stronger at certain points within
the area?
Brunsdon et al. (1996) developed geographically weighted regression, which
attempts to capture this spatial variation by calibrating a multiple
regression model that allows different relationships between variables to
exist at different points in space.
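In symbols, the contrast between a global model and a geographically weighted one can be written roughly as follows. The notation is an assumption made here for illustration and is not taken from the abstract: (u_i, v_i) denotes the coordinates of observation i, and p the number of explanatory variables.

```latex
% Global model: a single coefficient vector for the whole study area
\[
  y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i
\]

% Geographically weighted model: each coefficient is a function of location
\[
  y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i
\]
```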
The basic idea of geographically weighted regression is that a regression
model is fitted at each point in the data, weighting all observations by a
function of distance from that point. This corresponds to the idea that
observations sampled near the point where the regression is centred
have more influence on the resulting regression parameters at that point
than observations further away. This then produces a set of parameter
estimates at each point in the defined geographical area. These parameter
estimates can then be mapped using GIS software to identify where the
relationships between variables vary, providing a useful form of exploratory
analysis. Using Monte Carlo methods, two hypothesis tests can be carried out:
- whether the data may be described by a global model rather than a
nonstationary one;
- whether individual regression coefficients are stable over geographic
space.
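The local fitting and the second Monte Carlo test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the gwr ado file: the Gaussian kernel, the fixed bandwidth, the single explanatory variable, the use of the variance of the local slopes as the test statistic, and all function names and synthetic data are hypothetical choices made here.

```python
import math
import random

def gaussian_weight(distance, bandwidth):
    """Kernel weight that decays with distance (Gaussian form is an assumption)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(cx, cy, coords, x, y, bandwidth):
    """Weighted least-squares fit of y = a + b*x centred at (cx, cy)."""
    w = [gaussian_weight(math.hypot(px - cx, py - cy), bandwidth)
         for px, py in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def gwr_slopes(coords, x, y, bandwidth):
    """One local slope per data point, each regression centred on that point."""
    return [local_fit(cx, cy, coords, x, y, bandwidth)[1] for cx, cy in coords]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def monte_carlo_p(coords, x, y, bandwidth, reps=99, seed=1):
    """Permutation p-value for 'the slope is stable over space'.

    Locations are randomly reassigned to observations; the variance of the
    local slopes is recomputed each time and compared with the observed one.
    """
    rng = random.Random(seed)
    observed = variance(gwr_slopes(coords, x, y, bandwidth))
    shuffled = list(coords)
    exceed = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if variance(gwr_slopes(shuffled, x, y, bandwidth)) >= observed:
            exceed += 1
    return (exceed + 1) / (reps + 1)

# Synthetic data (hypothetical): a 6 x 6 lattice of points whose true slope
# increases from west (easting 0) to east (easting 5).
coords = [(float(e), float(n)) for e in range(6) for n in range(6)]
x = [float((7 * i) % 10 + 1) for i in range(len(coords))]
y = [2.0 + (1.0 + 0.5 * e) * xi for (e, _), xi in zip(coords, x)]

slopes = gwr_slopes(coords, x, y, bandwidth=1.0)
p_value = monte_carlo_p(coords, x, y, bandwidth=1.0)
```

The list `slopes` holds one estimate per location and is the kind of output that could be exported for mapping; `p_value` is the proportion of permutations whose slope variance is at least as large as the observed one.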
I will present how this method can be carried out in Stata using the ado
files gwr and gwrgrid, both of which apply geographically weighted
regression to a dataset containing geographical reference points. The only
difference between the two ado files is that gwrgrid places a grid
over the geographical area and carries out regressions centred at each grid
centroid, whereas gwr carries out regressions centred at each point
in the data.
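The difference between centring regressions on data points and on grid centroids can be illustrated with a short Python sketch. It does not reproduce gwrgrid: the square cells, the Gaussian kernel, the single explanatory variable, and all names and data below are assumptions made for the example.

```python
import math

def grid_centroids(coords, cell_size):
    """Centroids of square cells covering the bounding box of the data."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    x0, y0 = min(xs), min(ys)
    ncols = max(1, math.ceil((max(xs) - x0) / cell_size))
    nrows = max(1, math.ceil((max(ys) - y0) / cell_size))
    return [(x0 + (i + 0.5) * cell_size, y0 + (j + 0.5) * cell_size)
            for i in range(ncols) for j in range(nrows)]

def local_slope(cx, cy, coords, x, y, bandwidth):
    """Slope of a Gaussian-weighted fit of y = a + b*x centred at (cx, cy)."""
    w = [math.exp(-0.5 * (math.hypot(px - cx, py - cy) / bandwidth) ** 2)
         for px, py in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return sxy / sxx

# Synthetic data (hypothetical): 36 points whose true slope rises with easting.
coords = [(float(e), float(n)) for e in range(6) for n in range(6)]
x = [float((7 * i) % 10 + 1) for i in range(len(coords))]
y = [2.0 + (1.0 + 0.5 * e) * xi for (e, _), xi in zip(coords, x)]

# Point-centred variant: one regression per observation (36 fits).
point_slopes = [local_slope(cx, cy, coords, x, y, 1.0) for cx, cy in coords]

# Grid-centred variant: one regression per cell centroid (4 fits here).
centroids = grid_centroids(coords, cell_size=2.5)
grid_slopes = [local_slope(cx, cy, coords, x, y, 1.0) for cx, cy in centroids]
```

In both variants every observation contributes to every fit; only the set of regression centres changes, so a coarse grid needs far fewer regressions than one per data point.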
The code in these ado files is based on both the paper and a FORTRAN
program by Brunsdon et al., and it has been extended to any form of
generalized linear model by relying heavily on the existing glm
command in Stata.
The technique and programs, and the options included with them, will be
demonstrated on the example given by Brunsdon et al. — a ward-level
dataset from the 1991 UK census relating car ownership rates to social class
and male unemployment in the county of Tyne & Wear in north-east
England.
Reference
Brunsdon, C., A. S. Fotheringham, and M. E. Charlton. 1996.
Geographically weighted regression: A method for exploring spatial nonstationarity.
Geographical Analysis 28: 281–298.