Introduction

The spdep package has always been careful about disconnected graphs, especially where the disconnected observations are graph nodes with no neighbours, that is no incoming or outgoing edges. In nb neighbour objects, they are encoded as integer vectors of length 1 containing integer 0, which is an invalid index on \([1, N]\), where \(N\) is the observation count. Functions taking neighbour objects as arguments use the zero.policy argument to guide how to handle no-neighbour observations.

spdep has also had n.comp.nb to find the number of disjoint connected subgraphs in an nb object, contributed by Nicholas Lewin-Koh in 2001 and using depth-first search for symmetric neighbours, showing in addition which observations belong to which subgraph. Obviously, no-neighbour observations are singleton graph nodes, but subgraphs are also troubling for spatial analysis, because there is no connection between the spatial processes in those subgraphs. The ripples in one pond cannot cross into a separate pond if they are not connected.

From spdep 1.3-1, steps began to raise awareness of the possibility that neighbour objects might be created that are disconnected in some way, mostly through warnings, and through the computation of subgraph measures by default. This vignette is intended to provide some background to these steps.

No-neighbour observations

From the start, nb objects have recorded no-neighbour observations as an integer vector of unit length and value 0, where neighbours are recorded as ID values between 1 and N, where N is the observation count. print and summary methods have always reported the presence of no-neighbour observations, and listed their IDs (or region.id values). If an nb object contains no-neighbour observations, the user has to decide whether to drop those observations, or if retained, what value to give its weights. The zero.policy argument uses zero as the value if TRUE, but if FALSE causes nb2listw to fail. The value of zero.policy in a call to functions like nb2listw, subset.listw or mat2listw creating listw objects representing sparse spatial weights matrices is added to the created object as an attribute, and used subsequently to pass through that choice to other functions. For example, moran.test takes the value of this attribute as default for its zero.policy argument:

library(spdep)
args(moran.test)
## function (x, listw, randomisation = TRUE, zero.policy = attr(listw, 
##     "zero.policy"), alternative = "greater", rank = FALSE, na.action = na.fail, 
##     spChk = NULL, adjust.n = TRUE, drop.EI2 = FALSE) 
## NULL

If observation \(i\) has no neighbours, its weights sum \(\sum_{j=1}^N w_{ij} = 0\), as \(w_{ij} = 0, \forall j\) (see discussion in Bivand and Portnov (2004)). Its eigenvalue will also be zero, with consequences for analytical inference:

eigen(0)$values
## [1] 0

The adjust.n argument to measures of spatial autocorrelation is by default TRUE, and subtracts the count of singleton nodes from \(N\) in an attempt to acknowledge the reduction in information available.

This discussion will address problems arising when analysing areal/lattice data, and neighbours are defined as polygon features with contiguous boundaries. One way in which no-neighbour observations may occur is when they are islands. This is clearly the case in Freni-Sterrantino, Ventrucci, and Rue (2018), where Capraia and Giglio Isles are singleton nodes. Here we take Westminster constituencies for Wales used in the July 2024 UK general election. If GDAL is at least version 3.7.0, the driver supports compressed GeoPackage files, if not they must be decompressed first.

(GDAL37 <- as.numeric_version(unname(sf_extSoftVersion()["GDAL"])) >= "3.7.0")
## [1] TRUE

The boundaries are taken from the Ordnance Survey Boundary-Line site, https://osdatahub.os.uk/downloads/open/BoundaryLine, choosing the 2024 Westminster constituencies (https://www.os.uk/opendata/licence), simplified using a tolerance of 50m to reduce object size, and merged with selected voting outcomes for constituencies in Great Britain https://electionresults.parliament.uk/countries/1, (https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/). Here, the subset for Wales is useful as we will see:

file <- "etc/shapes/GB_2024_Wales_50m.gpkg.zip"
zipfile <- system.file(file, package="spdep")
if (GDAL37) {
    w50m <- st_read(zipfile)
} else {
    td <- tempdir()
    bn <- sub(".zip", "", basename(file), fixed=TRUE)
    target <- unzip(zipfile, files=bn, exdir=td)
    w50m <- st_read(target)
}
## Reading layer `GB_2024_Wales_50m' from data source 
##   `/tmp/RtmpRJBEOV/Rinst3afc0726ca4663/spdep/etc/shapes/GB_2024_Wales_50m.gpkg.zip' 
##   using driver `GPKG'
## Simple feature collection with 32 features and 19 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 146597.1 ymin: 164536.5 xmax: 355287 ymax: 395993.5
## Projected CRS: OSGB36 / British National Grid
(w50m |> poly2nb(row.names=as.character(w50m$Constituency)) -> nb_W_50m)
## Warning in poly2nb(w50m, row.names = as.character(w50m$Constituency)): some observations have no neighbours;
## if this seems unexpected, try increasing the snap argument.
## Warning in poly2nb(w50m, row.names = as.character(w50m$Constituency)): neighbour object has 2 sub-graphs;
## if this sub-graph count seems unexpected, try increasing the snap argument.
## Neighbour list object:
## Number of regions: 32 
## Number of nonzero links: 136 
## Percentage nonzero weights: 13.28125 
## Average number of links: 4.25 
## 1 region with no links:
## Ynys Môn
## 2 disjoint connected subgraphs

The two subgraphs are the singleton Ynys Môn and all the other 31 constituencies:

attr(nb_W_50m, "ncomp")$comp.id |>table() |> table()
## 
##  1 31 
##  1  1

The left map shows that Ynys Môn can be shown selecting by name, as a black border, and by the zero cardinality of its neighbour set, using card, filling the polygon. The right map shows the location of the island, known in English as Anglesey, north-west of the Welsh mainland, and with no neighbour links:

ynys_mon <- w50m$Constituency == "Ynys Môn"
pts <- st_point_on_surface(st_geometry(w50m))
opar <- par(mfrow=c(1, 2))
plot(st_geometry(w50m), border="grey75")
plot(st_geometry(w50m)[ynys_mon], add=TRUE)
plot(st_geometry(w50m)[card(nb_W_50m) == 0L], add=TRUE, border="transparent", col="wheat1")
plot(st_geometry(w50m), border="grey75")
plot(nb_W_50m, pts, add=TRUE)