This vignette indicates how to perform the analyses described in
Dray and Jombart (2011) of data derived from André-Michel
Guerry’s (1833)
*Essai sur la Statistique Morale de la France*. It illustrates
some classical methods for analysis of multivariate spatial data that
focus *either* on the multivariate aspect or on the spatial one,
as well as some more modern methods that attempt to integrate
geographical and multivariate aspects *simultaneously*.

Several packages are required to run the different analyses and
should be loaded. The `ade4`

package for multivariate
analysis is supplemented by `adegraphics`

(for associated
graphical methods) and `adespatial`

for multivariate spatial
analysis. For further information on this spatial analysis approach, see
`vignette(package="adespatial", "tutorial")`

.

```
library(Guerry) # Guerry's data
library(sp) # management of spatial data
library(ade4) # multivariate analysis
library(adegraphics) # graphical representation
library(spdep) # spatial dependency
library(adespatial) # multivariate spatial analysis
```

Guerry gathered data on crimes, suicide, literacy and other moral
statistics for various départements (i.e., counties) in France. He
provided the first real social data analysis, using graphics and maps to
summarize this georeferenced multivariate dataset. We use the dataset
`gfrance85`

and consider six key quantitative variables
(shown in the table below) for each of the 85 départements of France in
1830 (Corsica, an island and often an outlier, was excluded).

Data are recorded on aligned scales so that **larger
numbers** consistently reflect “morally better”. Thus, four of
the “bad” variables are recorded in the inverse form, as “Population per
…”. With this scaling, it would be expected that all correlations be
\(\ge 0\).

Name | Description |
---|---|

`Crime_pers` |
Population per crime against persons |

`Crime_prop` |
Population per crime against property |

`Literacy` |
Percent of military conscripts who can read and write |

`Donations` |
Donations to the poor |

`Infants` |
Population per illegitimate birth |

`Suicides` |
Population per suicide |

The dataset `gfrance85`

is actually a
`SpatialPolygonsDataFrame`

object created with the
`sp`

package. It contains the polygon boundaries of the map
of France in 1830, as well the variables in the `Guerry`

data
frame. As an ordinary data.frame, it has these components

```
names(gfrance85)
#> [1] "CODE_DEPT" "COUNT" "AVE_ID_GEO" "dept"
#> [5] "Region" "Department" "Crime_pers" "Crime_prop"
#> [9] "Literacy" "Donations" "Infants" "Suicides"
#> [13] "MainCity" "Wealth" "Commerce" "Clergy"
#> [17] "Crime_parents" "Infanticide" "Donation_clergy" "Lottery"
#> [21] "Desertion" "Instruction" "Prostitutes" "Distance"
#> [25] "Area" "Pop1831"
```

To simplify analyses, we extract the components to be used below.

```
data(gfrance85)
df <- data.frame(gfrance85)[, 7:12] # the 6 variables
france.map <- as(gfrance85, "SpatialPolygons") # the map
xy <- coordinates(gfrance85) # spatial coordinates
dep.names <- data.frame(gfrance85)[, 6] # departement names
region.names <- data.frame(gfrance85)[, 5] # region names
col.region <- colors()[c(149, 254, 468, 552, 26)] # colors for region
```

In this section, we focus on classical approaches that consider either the multivariate or the spatial aspect of the data.

Here we consider \(p=6\) variables measured for \(n=85\) individuals (départements of France). As only quantitative variables have been recorded, principal component analysis (PCA, Hotelling 1933) is well adapted. PCA summarizes the data by maximizing simultaneously the variance of the projection of the individuals onto the principal axes and the sum of the squared correlations between the principal component and the variables.

The biplot is simply obtained by

The first two PCA dimensions account for 35.7% and 20% ,respectively, of the total variance.

Correlations between variables and principal components can be represented on a correlation circle. The first axis is negatively correlated to literacy and positively correlated to property crime, suicides and illegitimate births. The second axis is aligned mainly with personal crime and donations to the poor.

Spatial information can be added on the factorial map representing the projections of départements on principal axes by coloring according to colors representing the different regions of France.

```
s.Spatial(france.map, col = col.region[region.names], plabel.cex = 0)
s.class(xy, region.names, col = col.region, add = TRUE, ellipseSize = 0, starSize = 0)
```

For the first axis, the North and East are characterized by negative scores, corresponding to high levels of literacy and high rates of suicides, crimes against property and illegitimate births. The second axis mainly contrasts the West (high donations to the the poor and low levels of crime against persons) to the South.

Spatial autocorrelation statistics, such as Moran (1948) Coefficient (MC) and Geary (1954) Ratio, aim to measure and analyze the degree of dependency among observations in a geographical context (Cliff and Ord 1973).

The first step of spatial autocorrelation analysis is to define a
spatial weighting matrix \(\mathbf{W}=[w_{ij}]\) . In the case of
Guerry’s data, we simply defined a binary neighborhood where two
départements are considered as neighbors if they share a common border.
The spatial weighting matrix is then obtained after row-standardization
(`style = "W"`

):

We can represent this neighborhood on the geographical map:

Once the spatial weights have been defined, the spatial autocorrelation statistics can then be computed. Let us consider the \(n\)-by-1 vector \(\mathbf{x}=\left[ {x_1 \cdots x_n } \right]^\textrm{T}\) containing measurements of a quantitative variable for \(n\) spatial units. The usual formulation for Moran’s coefficient of spatial autocorrelation (Cliff and Ord 1973; Moran 1948) is \[\begin{equation} \label{eq1} MC(\mathbf{x})=\frac{n\sum\nolimits_{\left( 2 \right)} {w_{ij} (x_i -\bar {x})(x_j -\bar {x})} }{\sum\nolimits_{\left( 2 \right)} {w_{ij} } \sum\nolimits_{i=1}^n {(x_i -\bar {x})^2} }\mbox{ where }\sum\nolimits_{\left( 2 \right)} =\sum\limits_{i=1}^n {\sum\limits_{j=1}^n } \mbox{ with }i\ne j. \end{equation}\]

MC can be rewritten using matrix notation: \[\begin{equation} \label{eq2} MC(\mathbf{x})=\frac{n}{\mathbf{1}^\textrm{T}\mathbf{W1}}\frac{\mathbf{z}^\textrm{T}{\mathbf{Wz}}}{\mathbf{z}^\textrm{T}\mathbf{z}}, \end{equation}\] where \(\mathbf{z}=\left ( \mathbf{I}_n-\mathbf{1}_n\mathbf{1}_n^\textrm{T} /n \right )\mathbf{x}\) is the vector of centered values (\(z_i=x_i-\bar{x}\)) and \(\mathbf{1}_n\) is a vector of ones (of length \(n\)).

The significance of the observed value of MC can be tested by a Monte-Carlo procedure, in which locations are permuted to obtain a distribution of MC under the null hypothesis of random distribution. An observed value of MC that is greater than that expected at random indicates the clustering of similar values across space (positive spatial autocorrelation), while a significant negative value of MC indicates that neighboring values are more dissimilar than expected by chance (negative spatial autocorrelation).

We computed MC for the Guerry’s dataset. A positive and significant autocorrelation is identified for each of the six variables. Thus, the values of literacy are the most covariant in adjacent departments, while illegitimate births (Infants) covary least.

```
moran.randtest(df, lw)
#> class: krandtest lightkrandtest
#> Monte-Carlo tests
#> Call: moran.randtest(x = df, listw = lw)
#>
#> Number of tests: 6
#>
#> Adjustment method for multiple comparisons: none
#> Permutation number: 999
#> Test Obs Std.Obs Alter Pvalue
#> 1 Crime_pers 0.4114597 5.797137 greater 0.001
#> 2 Crime_prop 0.2635533 4.160819 greater 0.002
#> 3 Literacy 0.7176053 10.357515 greater 0.001
#> 4 Donations 0.3533613 5.204476 greater 0.001
#> 5 Infants 0.2287241 3.628118 greater 0.002
#> 6 Suicides 0.4016812 6.081180 greater 0.001
```

If the spatial weighting matrix is row-standardized, we can define
the lag vector \(\mathbf{\tilde{z}} =
\mathbf{Wz}\) (i.e., \(\tilde{z}_i =
\sum\limits_{j=1}^n{w_{ij}x_j}\)) composed of the weighted (by
the spatial weighting matrix) averages of the neighboring values. Thus,
we have: \[\begin{equation}
\label{eq3}
MC(\mathbf{x})=\frac{\mathbf{z}^\textrm{T}{\mathbf{\tilde{z}}}}{\mathbf{z}^\textrm{T}\mathbf{z}},
\end{equation}\] since in this case \(\mathbf{1}^\textrm{T}\mathbf{W1}=n\). This
shows clearly that MC measures the autocorrelation by giving an
indication of the intensity of the linear association between the vector
of observed values \(\mathbf{z}\) and
the vector of weighted averages of neighboring values \(\mathbf{\tilde{z}}\). Anselin (1996) proposed
to visualize MC in the form of a bivariate scatterplot of \(\mathbf{\tilde{z}}\) against \(\mathbf{z}\). A linear regression can be
added to this *Moran scatterplot*, with slope equal to MC.

Considering the Literacy variable of Guerry’s data, the Moran scatterplot clearly shows strong autocorrelation. It also shows that the Hautes-Alpes département has a slightly outlying position characterized by a high value of Literacy compared to its neighbors.

The simplest approach considered a two-step procedure where the data
are first summarized with multivariate analysis such as PCA. In a second
step, univariate spatial statistics or mapping techniques are applied to
PCA scores for each axis separately. One can also test for the presence
of spatial autocorrelation for the first few scores of the analysis,
with univariate autocorrelation statistics such as MC. We mapped scores
of the départements for the first two axes of the PCA of Guerry’s data.
Even if PCA maximizes only the variance of these scores, there is also a
clear spatial structure, as the scores are highly autocorrelated. The
map for the first axis corresponds closely to the split between *la
France éclairée* (North-East characterized by an higher level of
Literacy) and *la France obscure*.

```
moran.randtest(pca$li, lw)
#> class: krandtest lightkrandtest
#> Monte-Carlo tests
#> Call: moran.randtest(x = pca$li, listw = lw)
#>
#> Number of tests: 3
#>
#> Adjustment method for multiple comparisons: none
#> Permutation number: 999
#> Test Obs Std.Obs Alter Pvalue
#> 1 Axis1 0.5506343 7.704990 greater 0.001
#> 2 Axis2 0.5614030 8.237265 greater 0.001
#> 3 Axis3 0.1805569 2.701054 greater 0.006
s.value(xy, pca$li[, 1:2], Sp = france.map, pSp.border = "white", symbol = "circle", pgrid.draw = FALSE)
```

Over the last decades, several approaches have been developed to consider both geographical and multivariate information simultaneously. The multivariate aspect is usually treated by techniques of dimensionality reduction similar to PCA. On the other hand, several alternatives have been proposed to integrate the spatial information.

One alternative is to consider a spatial partition of the study area. In this case, the spatial information is coded as a categorical variable, and each category corresponds to a region of the whole study area. For instance, Guerry’s data contained a partition of France into 5 regions.

We used the between-class analysis (BCA, Dolédec and Chessel 1987), to investigate differences between regions. BCA maximizes the variance between groups.

Here, 28.8 % of the total variance (sum of eigenvalues of PCA) corresponds to the between-regions variance (sum of the eigenvalues of BCA).

The main graphical outputs are obtained by the generic
`plot`

function:

The barplot of eigenvalues indicates that two axes should be interpreted. The first two BCA dimensions account for 59 % and 30.2 %, respectively, of the between-regions variance.

The coefficients used to construct the linear combinations of variables are represented:

The first axis opposed literacy to property crime, suicides and illegitimate births. The second axis is mainly aligned with personal crime and donations to the poor.

Projections of départements on the BCA axes can be represented on the factorial map:

```
s.label(bet$ls, as.character(dep.names), ppoint.cex = 0, plabel.optim = TRUE, plabel.col = col.region[region.names], plabel.cex = 0.5)
s.class(bet$ls, fac = region.names, col = col.region, ellipse = 0, add = TRUE)
```

The scores can be mapped to show the spatial aspects:

`s.value(xy, bet$ls, symbol = "circle", Sp = france.map, pSp.col = col.region[region.names], pSp.border = "transparent")`

The results are very close to those obtained by PCA: the first axis
contrasted the North and the East (*la France éclairée*) to the
other regions while the South is separated from the other regions by the
second axis. The high variability of the region Centre is also
noticeable. In contrast, the South is very homogeneous.

Principal component analysis with respect to the instrumental variables (PCAIV, Rao 1964), and related methods, have been often used in community ecology to identify spatial relationships. The spatial information is introduced in the form of spatial predictors and the analysis maximized the “spatial variance” (i.e., the variance explained by spatial predictors). Note that BCA can also be considered as a particular case of PCAIV, where the explanatory variables are dummy variables indicating group membership.

Student (1914) proposed to express observed values in time series as a polynomial function of time, and mentioned that this could be done for spatial data as well. Borcard, Legendre, and Drapeau (1992) extended this approach to the spatial and multivariate case by introducing polynomial functions of geographic coordinates as predictors in PCAIV. We call this approach PCAIV-POLY.

The centroids of départements of France were used to construct a second-degree orthogonal polynomial.

```
poly.xy <- orthobasis.poly(xy, degree = 2)
s.value(xy, poly.xy, Sp = france.map, plegend.drawKey = FALSE)
```

PCAIV is then performed using the `pcaiv`

function:

Here, 32.4 % of the total variance (sum of eigenvalues of PCA) is explained by the second-degree polynomial (sum of eigenvalues of PCAIV). The first two dimensions account for 51.4 % and 35.2 %, respectively, of the explained variance.

```
sum(pcaiv.xy$eig)/sum(pca$eig) * 100
#> [1] 32.36025
pcaiv.xy$eig/sum(pcaiv.xy$eig) * 100
#> [1] 51.423264 35.152362 5.954475 4.799075 2.670825
```

The outputs of PCAIV-POLY (coefficients of variables, maps of
départements scores, etc.) are very similar to those obtained by BCA.
They can be represented easily by the generic `plot`

function:

An alternative way to build spatial predictors is by the
diagonalization of the spatial weighting matrix **W**.
Moran’s eigenvector maps (MEM, Dray, Legendre, and Peres-Neto 2006) are
the \(n-1\) eigenvectors of the
doubly-centered matrix **W**. They are orthogonal vectors
with a unit norm maximizing MC (Griffith 1996). MEM associated with high
positive (or negative) eigenvalues have high positive (or negative)
autocorrelation. MEM associated with eigenvalues with small absolute
values correspond to low spatial autocorrelation, and are not suitable
for defining spatial structures.

We used the spatial weighting matrix defined above to construct MEM. The first ten MEM, corresponding to the highest levels of spatial autocorrelation, have been mapped:

We introduced the first ten MEM as spatial explanatory variables in PCAIV. We call this approach PCAIV-MEM.

Here, 44.1 % of the total variance (sum of eigenvalues of PCA) is explained by the first ten MEM (sum of eigenvalues of PCAIV). The first two dimensions account for 54.9 % and 26.3 %, respectively, of the explained variance.

```
sum(pcaiv.mem$eig)/sum(pca$eig) * 100
#> [1] 44.11597
pcaiv.mem$eig/sum(pcaiv.mem$eig) * 100
#> [1] 54.928640 26.300082 9.042300 4.574408 3.532962 1.621607
```

The outputs of PCAIV-MEM (coefficients of variables, maps of
départements scores, etc.) are very similar to those obtained by BCA.
They can be represented easily by the generic `plot`

function:

The MEM framework introduced the spatial information into multivariate analysis through the eigendecomposition of the spatial weighting matrix. Usually, we consider only a part of the information contained in this matrix because only a subset of MEM are used as regressors in PCAIV. In this section, we focus on multivariate methods that consider the spatial weighting matrix under its original form.

Wartenberg (1985) was the first to develop a multivariate analysis based on MC. His work considered only normed and centered variables (i.e., normed PCA) for the multivariate part and a binary symmetric connectivity matrix for the spatial aspect. Dray, Saïd, and Débias (2008) generalized Wartenberg’s method by introducing a row-standardized spatial weighting matrix in the analysis of a statistical triplet. This approach is very general and allows us to define spatially-constrained versions of various methods (corresponding to different triplets) such as correspondence analysis or multiple correspondence analysis. MULTISPATI finds coefficients to obtain a linear combination of variables that maximizes a compromise between the classical multivariate analysis and a generalized version of Moran’s coefficient.

The main outputs of MULTISPATI can be represented easily by the
generic `plot`

function:

The barplot of eigenvalues suggests two main spatial structures.
Eigenvalues of MULTISPATI are the product between the variance and the
spatial autocorrelation of the scores, while PCA maximizes only the
variance. The differences between the two methods are computed by the
`summary`

function:

```
summary(ms)
#>
#> Multivariate Spatial Analysis
#> Call: multispati(dudi = pca, listw = lw, scannf = FALSE)
#>
#> Scores from the initial duality diagram:
#> var cum ratio moran
#> RS1 2.140470 2.140470 0.3567451 0.5506343
#> RS2 1.200820 3.341290 0.5568817 0.5614030
#> RS3 1.102047 4.443337 0.7405562 0.1805569
#>
#> Multispati eigenvalues decomposition:
#> eig var moran
#> CS1 1.2858683 2.017171 0.6374612
#> CS2 0.6939887 1.176551 0.5898499
```

Hence, there is a loss of variance compared to PCA (2.14 versus 2.017 for axis 1; 1.201 versus 1.177 for axis 2) but a gain of spatial autocorrelation (0.551 versus 0.637 for axis 1; 0.561 versus 0.59 for axis 2).

Coefficients of variables allow to interpret the structures:

The first axis opposes literacy to property crime, suicides and illegitimate births. The second axis is aligned mainly with personal crime and donations to the poor. The maps of the scores show that the spatial structures are very close to those identified by PCA. The similarity of results between PCA and its spatially optimized version confirm that the main structures of Guerry’s data are spatial.

Spatial autocorrelation can be seen as the link between one variable and the lagged vector. This interpretation is used to construct the Moran scatterplot and can be extended to the multivariate case in MULTISPATI by analyzing the link between scores and lagged scores:

```
s.match(ms$li, ms$ls, plabel.cex = 0)
s.match(ms$li[c(10, 41, 27), ], ms$ls[c(10, 41, 27), ], label = dep.names[c(10,
41, 27)], plabel.cex = 0.8, add = TRUE)
```

Each département can be represented on the factorial map by an arrow (the bottom corresponds to its score, the head corresponds to its lagged score. A short arrow reveals a local spatial similarity (between one plot and its neighbors) while a long arrow reveals a spatial discrepancy. This viewpoint can be interpreted as a multivariate extension of the local index of spatial association (Anselin 1995). For instance: * Aude has a very small arrow, indicating that this département is very similar to its neighbors. * Haute-Loire has a long horizontal arrow which reflects its high values for the variables Infants (31017), Suicides (163241) and Crime_prop (18043) compared to the average values over its neighbors (27032.4, 60097.8 and 10540.8 for these three variables). * Finistère corresponds to an arrow with a long vertical length which is due to its high values compared to its neighbors for Donations (23945 versus 12563) and Crime_pers (29872 versus 25962).

The link between the scores and the lagged scores (averages of neighbors weighted by the spatial connection matrix) can be mapped in the geographical space. For the first two axes, we have:

Even if the methods presented are quite different in their theoretical and practical viewpoints, their applications to Guerry’s dataset yield very similar results. We provided a quantitative measure of this similarity by computing Procrustes statistics [Peres-Neto and Jackson (2001);SD161] between the scores of the départements onto the first two axes for the different analyses. All the values of the statistic are very high and significant; this confirms the high concordance between the outputs of the different methods.

```
mat <- matrix(NA, 4, 4)
mat.names <- c("PCA", "BCA", "PCAIV-POLY", "PCAIV-MEM", "MULTISPATI")
colnames(mat) <- mat.names[-5]
rownames(mat) <- mat.names[-1]
mat[1, 1] <- procuste.randtest(pca$li[, 1:2], bet$ls[, 1:2])$obs
mat[2, 1] <- procuste.randtest(pca$li[, 1:2], pcaiv.xy$ls[, 1:2])$obs
mat[3, 1] <- procuste.randtest(pca$li[, 1:2], pcaiv.mem$ls[, 1:2])$obs
mat[4, 1] <- procuste.randtest(pca$li[, 1:2], ms$li[, 1:2])$obs
mat[2, 2] <- procuste.randtest(bet$ls[, 1:2], pcaiv.xy$ls[, 1:2])$obs
mat[3, 2] <- procuste.randtest(bet$ls[, 1:2], pcaiv.mem$ls[, 1:2])$obs
mat[4, 2] <- procuste.randtest(bet$ls[, 1:2], ms$li[, 1:2])$obs
mat[3, 3] <- procuste.randtest(pcaiv.xy$ls[, 1:2], pcaiv.mem$ls[, 1:2])$obs
mat[4, 3] <- procuste.randtest(pcaiv.xy$ls[, 1:2], ms$li[, 1:2])$obs
mat[4, 4] <- procuste.randtest(pcaiv.mem$ls[, 1:2], ms$li[, 1:2])$obs
mat
#> PCA BCA PCAIV-POLY PCAIV-MEM
#> BCA 0.9788594 NA NA NA
#> PCAIV-POLY 0.9791939 0.9897463 NA NA
#> PCAIV-MEM 0.9885944 0.9936217 0.9954116 NA
#> MULTISPATI 0.9868799 0.9954034 0.9951133 0.9986471
```

Anselin, L. 1995. “Local Indicators of Spatial
Association.” *Geographical Analysis* 27: 93–115.

———. 1996. “The Moran Scatterplot as an
ESDA Tool to Assess Local Instability in Spatial
Association.” In *Spatial Analytical Perspectives on GIS*,
edited by M. M. Fischer, H. J. Scholten, and D. Unwin, 111–25. London:
Taylor; Francis.

Borcard, D., P. Legendre, and P. Drapeau. 1992. “Partialling Out
the Spatial Component of Ecological Variation.” *Ecology*
73: 1045–55.

Cliff, A. D., and J. K. Ord. 1973. *Spatial Autocorrelation*.
London: Pion.

Dolédec, S., and D. Chessel. 1987. “Rythmes Saisonniers Et
Composantes Stationnelles En Milieu Aquatique I-
Description d’un Plan d’observations Complet Par Projection
de Variables.” *Acta Oecologica - Oecologia Generalis* 8
(3): 403–26.

Dray, S., and T. Jombart. 2011. “Revisiting Guerry’s Data:
Introducing Spatial Constraints in Multivariate Analysis.”
*Annals of Applied Statistics* 5 (4): 2278–99.

Dray, S., P. Legendre, and P. R. Peres-Neto. 2006. “Spatial
Modeling: A Comprehensive Framework for Principal Coordinate Analysis of
Neighbor Matrices (PCNM).” *Ecological
Modelling* 196: 483–93.

Dray, S., S. Saïd, and F. Débias. 2008. “Spatial Ordination of
Vegetation Data Using a Generalization of Wartenberg’s
Multivariate Spatial Correlation.” *Journal of Vegetation
Science* 19: 45–56.

Geary, R. C. 1954. “The Contiguity Ratio and Statistical
Mapping.” *The Incorporated Statistician* 5 (3): 115–45.

Griffith, D. A. 1996. “Spatial Autocorrelation and Eigenfunctions
of the Geographic Weights Matrix Accompanying Geo-Referenced
Data.” *Canadian Geographer* 40 (4): 351–67.

Guérry, A. M. 1833. *Essai Sur La Statistique Morale de La
France*. Paris: Crochard.

Hotelling, H. 1933. “Analysis of a Complex of Statistical
Variables into Principal Components.” *Journal of Educational
Psychology* 24: 417–41.

Moran, P. A. P. 1948. “The Interpretation of Statistical
Maps.” *Journal of the Royal Statistical Society Series
B-Methodological* 10: 243–51.

Peres-Neto, P. R., and D. A. Jackson. 2001. “How Well Do
Multivariate Data Sets Match? The Advantages of a
Procrustean Superimposition Approach over the
Mantel Test.” *Oecologia* 129: 169–78.

Rao, C. R. 1964. “The Use and Interpretation of Principal
Component Analysis in Applied Research.” *Sankhya A* 26:
329–59.

Student, W. S. 1914. “The Elimination of Spurious Correlation Due
to Position in Time or Space.” *Biometrika* 10: 179–80.

Wartenberg, D. 1985. “Multivariate Spatial Correlation: A Method
for Exploratory Geographical Analysis.” *Geographical
Analysis* 17 (4): 263–83.