The spdep
package has always been careful about
disconnected graphs, especially where the disconnected observations are
graph nodes with no neighbours, that is no incoming or outgoing edges.
In nb
neighbour objects, they are encoded as integer
vectors of length 1 containing integer 0
, which is an
invalid index on \([1, N]\), where
\(N\) is the observation count.
Functions taking neighbour objects as arguments use the
zero.policy
argument to guide how to handle no-neighbour
observations.
spdep
has also had n.comp.nb
to find the
number of disjoint connected subgraphs in an nb
object,
contributed by Nicholas Lewin-Koh in 2001 and using depth-first search
for symmetric neighbours, showing in addition which observations belong
to which subgraph. Obviously, no-neighbour observations are singleton
graph nodes, but subgraphs are also troubling for spatial analysis,
because there is no connection between the spatial processes in those
subgraphs. The ripples in one pond cannot cross into a separate
pond if they are not connected.
From spdep
1.3-1, steps began to raise awareness of the
possibility that neighbour objects might be created that are
disconnected in some way, mostly through warnings, and through the
computation of subgraph measures by default. This vignette is intended
to provide some background to these steps.
From the start, nb
objects have recorded no-neighbour
observations as an integer vector of unit length and value
0
, where neighbours are recorded as ID values between
1
and N
, where N
is the
observation count. print
and summary
methods
have always reported the presence of no-neighbour observations, and
listed their IDs (or region.id
values). If an
nb
object contains no-neighbour observations, the user has
to decide whether to drop those observations, or if retained, what value
to give its weights. The zero.policy
argument uses zero as
the value if TRUE, but if FALSE causes nb2listw
to fail.
The value of zero.policy
in a call to functions like
nb2listw
, subset.listw
or
mat2listw
creating listw
objects representing
sparse spatial weights matrices is added to the created object as an
attribute, and used subsequently to pass through that choice to other
functions. For example, moran.test
takes the value of this
attribute as default for its zero.policy
argument:
library(spdep)
args(moran.test)
## function (x, listw, randomisation = TRUE, zero.policy = attr(listw,
## "zero.policy"), alternative = "greater", rank = FALSE, na.action = na.fail,
## spChk = NULL, adjust.n = TRUE, drop.EI2 = FALSE)
## NULL
If observation \(i\) has no neighbours, its weights sum \(\sum_{j=1}^N w_{ij} = 0\), as \(w_{ij} = 0, \forall j\) (see discussion in Bivand and Portnov (2004)). Its eigenvalue will also be zero, with consequences for analytical inference:
eigen(0)$values
## [1] 0
The adjust.n
argument to measures of spatial
autocorrelation is by default TRUE, and subtracts the count of singleton
nodes from \(N\) in an attempt to
acknowledge the reduction in information available.
This discussion will address problems arising when analysing areal/lattice data, and neighbours are defined as polygon features with contiguous boundaries. One way in which no-neighbour observations may occur is when they are islands. This is clearly the case in Freni-Sterrantino, Ventrucci, and Rue (2018), where Capraia and Giglio Isles are singleton nodes. Here we take Westminster constituencies for Wales used in the July 2024 UK general election. If GDAL is at least version 3.7.0, the driver supports compressed GeoPackage files, if not they must be decompressed first.
(GDAL37 <- as.numeric_version(unname(sf_extSoftVersion()["GDAL"])) >= "3.7.0")
## [1] TRUE
The boundaries are taken from the Ordnance Survey Boundary-Line site, https://osdatahub.os.uk/downloads/open/BoundaryLine, choosing the 2024 Westminster constituencies (https://www.os.uk/opendata/licence), simplified using a tolerance of 50m to reduce object size, and merged with selected voting outcomes for constituencies in Great Britain https://electionresults.parliament.uk/countries/1, (https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/). Here, the subset for Wales is useful as we will see:
file <- "etc/shapes/GB_2024_Wales_50m.gpkg.zip"
zipfile <- system.file(file, package="spdep")
if (GDAL37) {
w50m <- st_read(zipfile)
} else {
td <- tempdir()
bn <- sub(".zip", "", basename(file), fixed=TRUE)
target <- unzip(zipfile, files=bn, exdir=td)
w50m <- st_read(target)
}
## Reading layer `GB_2024_Wales_50m' from data source
## `/tmp/RtmpRJBEOV/Rinst3afc0726ca4663/spdep/etc/shapes/GB_2024_Wales_50m.gpkg.zip'
## using driver `GPKG'
## Simple feature collection with 32 features and 19 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 146597.1 ymin: 164536.5 xmax: 355287 ymax: 395993.5
## Projected CRS: OSGB36 / British National Grid
(w50m |> poly2nb(row.names=as.character(w50m$Constituency)) -> nb_W_50m)
## Warning in poly2nb(w50m, row.names = as.character(w50m$Constituency)): some observations have no neighbours;
## if this seems unexpected, try increasing the snap argument.
## Warning in poly2nb(w50m, row.names = as.character(w50m$Constituency)): neighbour object has 2 sub-graphs;
## if this sub-graph count seems unexpected, try increasing the snap argument.
## Neighbour list object:
## Number of regions: 32
## Number of nonzero links: 136
## Percentage nonzero weights: 13.28125
## Average number of links: 4.25
## 1 region with no links:
## Ynys Môn
## 2 disjoint connected subgraphs
The two subgraphs are the singleton Ynys Môn and all the other 31 constituencies:
attr(nb_W_50m, "ncomp")$comp.id |>table() |> table()
##
## 1 31
## 1 1
The left map shows that Ynys Môn can be shown selecting by name, as a
black border, and by the zero cardinality of its neighbour set, using
card
, filling the polygon. The right map shows the location
of the island, known in English as Anglesey, north-west of the Welsh
mainland, and with no neighbour links:
ynys_mon <- w50m$Constituency == "Ynys Môn"
pts <- st_point_on_surface(st_geometry(w50m))
opar <- par(mfrow=c(1, 2))
plot(st_geometry(w50m), border="grey75")
plot(st_geometry(w50m)[ynys_mon], add=TRUE)
plot(st_geometry(w50m)[card(nb_W_50m) == 0L], add=TRUE, border="transparent", col="wheat1")
plot(st_geometry(w50m), border="grey75")
plot(nb_W_50m, pts, add=TRUE)