Morphoscape

library(Morphoscape)

The following provides a guide on the workflow of performing an adaptive landscape analysis, from the philosophies of morphospaces, and types of data required, to the details of how to use the functions in this “Morphoscape” package. This guide will cover the following topics:

1. Defining a Morphospace
  - Phenotype
  - Performance data: Specimens or Warps
2. Using Morphoscape
  - Creating Performance Surfaces
  - Calculate Landscapes

Defining a morphospace

Phenotype

Adaptive landscape analyses require two types of data to be married together: phenotypic data and performance data. Phenotype data can take many forms, and the only requirement of phenotype data to the construction of adaptive landscapes is the definition of a morphospace. The simplest type of morphospace can be a 2D plot with the axes defined by two numeric measurements of phenotype, such as the length and width of a skull, though any quantification of phenotype is valid.

Geometric morphometrics has become the quintessential method for quantifying detailed shape variation in bony structures, and the final outcome in many studies is the visualization of a PCA of shape variation - which is a morphospace. However, how one defines the morphospace can drastically impact the outcomes of these adaptive landscape analyses. A PCA and between-group PCA are both valid methods to ordinate morphological data, yet will produce dramatically different morphospaces depending on the goals of the analysis. PCA will produce a (largely) unbiased ordination of the major axes of variation, while bgPCA will find axes of variation that maximize differences in groups. Caution should be applied when using constrained ordination like bgPCA, as they can create ecological patterns where none may actually exist (see Bookstein 2019, Cardini et al 2019 and Cardini & Polly 2020 for the cautionary debate). However, in many ways bgPCA, or other constrained ordinations are ideal for questions regarding functional and adaptive differences between ecological groups when actual morphological differences are present.

For now the adaptive landscape methods in this package are limited to 2D morphospaces, with two axes of phenotypic variation. This largely done out of the complexities involved in analyzing and visualizing multivariate covariance in more than 3 dimensions. However, if more than two axes of phenotypic variation are desired to be analyzed, these can be done in separate analyses.

Collecting performance data: Specimens or Warps

Once an ordination and 2D morphospace of the phenotypic data is defined, one must then collect data on performance of these phenotypes. There are two approaches one can take to collecting performance data: collect data directly from specimens, or collect data from phenotypic warps in morphospace. The former is the simplest and most direct as long as your dataset is not too large, and allows you to to know actual performance of actual specimens. However, error can creep in if the performance traits do not strongly covary with the axes of your morphospace, and can produce uneven and inconsistent surfaces. In addition, specimens may not evenly occupy morphospace resulting in regions of morphospace not defined by existing phenotypes, and thus will not have measured performance data and may produce erroneous interpolations or extrapolations of performance. Finally, collecting performance data can be time consuming, and may be impractical for extremely large datasets.

The alternative is to collect performance data from hypothetical warps across morphospace, which eliminates these issues. As long as the method used to define your morphospace is reversible (such as prcomp, geomorph::gm.prcomp, Morpho::prcompfast, Morpho::groupPCA, Morpho::CVA), it is possible to extract the phenotype at any location in morphospace. In fact many of these ordination functions come with predict methods for this very task. As such, using prediction, it is possible to define phenotypes evenly across all of morphospace called warps. These warps are also useful in they can be defined to represent phenotypic variation for ONLY the axes of the morphospace, and ignore variation in other axes. These warps can however form biologically impossible phenotypes (such as the walls of a bone crossing over one another), which may or may not help your interpretations of why regions of morphospace might be occupied or not.

Both these approaches are valid as long as morphospace is reasonably well covered. For an in-depth analysis of morphospace sampling see Smith et al (2021).

This package comes with the turtle humerus dataset from Dickson and Pierce (2019), which uses performance data collected from warps.

data("turtles")
data("warps")

str(turtles)
#> 'data.frame':    40 obs. of  4 variables:
#>  $ x      : num  0.03486 -0.07419 -0.07846 0.00972 -0.00997 ...
#>  $ y      : num  -0.019928 -0.015796 -0.010289 -0.000904 -0.029465 ...
#>  $ Group  : chr  "freshwater" "softshell" "softshell" "freshwater" ...
#>  $ Ecology: chr  "S" "S" "S" "S" ...
str(warps)
#> 'data.frame':    24 obs. of  6 variables:
#>  $ x    : num  -0.189 -0.189 -0.189 -0.189 -0.134 ...
#>  $ y    : num  -0.05161 -0.00363 0.04435 0.09233 -0.05161 ...
#>  $ hydro: num  -1839 -1962 -2089 -2371 -1754 ...
#>  $ curve: num  8.07 6.3 9.7 15.44 10.21 ...
#>  $ mech : num  0.185 0.193 0.191 0.161 0.171 ...
#>  $ fea  : num  -0.15516 -0.06215 -0.00435 0.14399 0.28171 ...

turtles is a dataset of coordinate data for 40 turtle humerus specimens that have been ordinated in a bgPCA morphospace, the first two axes maximizing the differences between three ecological groups: Marine, Freshwater and Terrestrial turtles. This dataset also includes these and other ecological groupings.

warps is a dataset of 4x6 evenly spaced warps predicted from this morphospace and 4 performance metrics.

Using Morphoscape

Once a morphospace is defined and performance data collected, the workflow of using Morphoscape is fairly straightforward. Using the warp and turtles datasets the first step is to make a functional dataframe using as_fnc_df(). The input to this function is a dataframe containing both coordinate data and performance data (and also grouping factors if desired). The first two columns must be coordinates, while the other columns can be defined as performance data, or as grouping factors. It is best to have your performance data named at this point to keep track.

library(Morphoscape)

data("turtles")
data("warps")

str(turtles)
#> 'data.frame':    40 obs. of  4 variables:
#>  $ x      : num  0.03486 -0.07419 -0.07846 0.00972 -0.00997 ...
#>  $ y      : num  -0.019928 -0.015796 -0.010289 -0.000904 -0.029465 ...
#>  $ Group  : chr  "freshwater" "softshell" "softshell" "freshwater" ...
#>  $ Ecology: chr  "S" "S" "S" "S" ...
str(warps)
#> 'data.frame':    24 obs. of  6 variables:
#>  $ x    : num  -0.189 -0.189 -0.189 -0.189 -0.134 ...
#>  $ y    : num  -0.05161 -0.00363 0.04435 0.09233 -0.05161 ...
#>  $ hydro: num  -1839 -1962 -2089 -2371 -1754 ...
#>  $ curve: num  8.07 6.3 9.7 15.44 10.21 ...
#>  $ mech : num  0.185 0.193 0.191 0.161 0.171 ...
#>  $ fea  : num  -0.15516 -0.06215 -0.00435 0.14399 0.28171 ...

# Create an fnc_df object for downstream use
warps_fnc <- as_fnc_df(warps, func.names = c("hydro", "curve", "mech", "fea"))
str(warps_fnc)
#> Classes 'fnc_df' and 'data.frame':   24 obs. of  6 variables:
#>  $ x    : num  -0.189 -0.189 -0.189 -0.189 -0.134 ...
#>  $ y    : num  -0.05161 -0.00363 0.04435 0.09233 -0.05161 ...
#>  $ hydro: num  0.763 0.627 0.487 0.174 0.858 ...
#>  $ curve: num  0.0544 0 0.1045 0.281 0.1202 ...
#>  $ mech : num  0.359 0.473 0.446 0 0.149 ...
#>  $ fea  : num  0.372 0.458 0.512 0.65 0.777 ...
#>  - attr(*, "func.names")= chr [1:4] "hydro" "curve" "mech" "fea"

Creating Performance Surfaces

It is then a simple process to perform surface interpolation by automatic Kriging using the krige_surf() function. This will autofit a kriging function to the data. This is performed by the automap::autoKrige() function. For details on the fitting of variograms you should read the documentation of automap. All the autoKrige fitting data is kept in the kriged_surfaces object along with the output surface.

By default the krige_surf() function will interpolate within an alpha hull wrapped around the inputted datapoints. This is to avoid extrapolation beyond measured datapoints. This can be defined using the resample_grid() function, which will supply a grid object defining the points to interpolate, and optionally plot the area to be reconstructed. The strength of wrapping can be controlled using the alpha argument, with smaller values producing a stronger wrapping.

# Create alpha-hulled grid for kriging
grid <- resample_grid(warps, hull = "concaveman", alpha = 3, plot = TRUE)

kr_surf <- krige_surf(warps_fnc, grid = grid)
#> [using ordinary kriging]
#> [using ordinary kriging]
#> [using ordinary kriging]
#> [using ordinary kriging]
kr_surf
#> A kriged_surfaces object
#> - functional characteristics:
#>  hydro, curve, mech, fea
#> - surface size:
#>  70 by 70
#>  α-hull applied (α = 3)
#> - original data:
#>  24 rows
plot(kr_surf)