The goal of GREMLINS is to perform statistical analysis of multipartite networks through a block model approach.
Multipartite networks consist of the joint observation of several networks that share some individuals. The individuals (or entities represented by nodes) are partitioned into groups defined by their nature. In what follows, these groups are referred to as functional groups.
library(GREMLINS)
The model is introduced and described in Bar-Hen, Barbillon, and Donnet (2018).
Assume that Q functional groups of individuals are considered. A multipartite network is a collection of networks, each involving one or two functional groups. Thus, each network may be
- *simple* if it represents the relations inside a functional group
- *bipartite* if it represents the relations between individuals of two functional groups.
We index the collection of networks by pairs of functional groups (q,q′). The set E denotes the list of pairs of functional groups for which we observe an interaction network.
For any pair $(q,q') \in E$, the interaction network is encoded in a matrix $X^{qq'}$ such that $X^{qq'}_{ii'} \neq 0$ if there is an edge from unit $i$ of functional group $q$ to unit $i'$ of functional group $q'$, and $X^{qq'}_{ii'} = 0$ otherwise.
For any $(q,q')$, $X^{qq'}_{ii'}$ may take values in $\{0,1\}$ or be numeric for weighted networks.
Note that, if $q \neq q'$, $X^{qq'}$ is said to be an incidence matrix (corresponding to a bipartite network). If $q = q'$, $X^{qq}$ is an adjacency matrix. Moreover, if the relation inside functional group $q$ is non-oriented, $X^{qq}$ is symmetric.
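To make the encoding concrete, here is a minimal sketch in base R of a toy multipartite network with two functional groups. The group sizes, object names and edges are purely illustrative assumptions and do not come from the package.

```r
# Toy multipartite network with Q = 2 functional groups.
# All names and values below are hypothetical, for illustration only.
n1 <- 4  # number of individuals in functional group 1
n2 <- 3  # number of individuals in functional group 2

# Simple network inside group 1: an n1 x n1 adjacency matrix.
X11 <- matrix(0, n1, n1)
X11[1, 2] <- 1
X11[2, 3] <- 1
X11[1, 4] <- 1
X11 <- X11 + t(X11)  # non-oriented relation, so the matrix is made symmetric

# Bipartite network between groups 1 and 2: an n1 x n2 incidence matrix.
X12 <- matrix(c(1, 0, 0,
                1, 1, 0,
                0, 1, 0,
                0, 0, 1), nrow = n1, ncol = n2, byrow = TRUE)

# The collection is indexed by pairs (q, q'); E lists the observed pairs.
networks <- list("1-1" = X11, "1-2" = X12)
E <- names(networks)

isSymmetric(X11)  # adjacency matrix of a non-oriented relation
dim(X12)          # rows index group 1, columns index group 2
```

Rows of an incidence matrix index the individuals of the first functional group of the pair and columns those of the second, so the two matrices above share their row individuals.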
Let $n_q$ be the number of individuals in the $q$-th functional group. Assume that each functional group $q$ is divided into $K_q$ blocks, or equivalently clusters. For all $q$ and all $i$, let $Z^q_i$ be the latent random variable such that $Z^q_i = k$ if individual $i$ of functional group $q$ belongs to cluster $k$. The random variables $Z^q_i$ are assumed to be independent and such that, for all $(i,k,q) \in \{1,\dots,n_q\} \times \{1,\dots,K_q\} \times \{1,\dots,Q\}$:
$$\mathbb{P}(Z^q_i = k) = \pi^q_k, \quad \text{with } \sum_{k=1}^{K_q} \pi^q_k = 1, \quad \forall q = 1,\dots,Q.$$
Conditionally on the clustering, the entries of the matrices $(X^{qq'}_{ii'})$ are assumed to be independent and distributed as follows: for all $(i,i') \in \{1,\dots,n_q\} \times \{1,\dots,n_{q'}\}$,
$$X^{qq'}_{ii'} \mid Z^q_i = k,\ Z^{q'}_{i'} = k' \ \sim_{\text{i.i.d.}}\ F_{qq'}(\theta^{qq'}_{kk'}),$$
meaning that the probability of connection from $i$ of functional group $q$ to $i'$ of functional group $q'$ depends only on the clusters to which they belong.
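The generative mechanism can be illustrated with a short base R simulation of one bipartite Bernoulli network. This is only a sketch under assumed values: the group sizes, block proportions and connection probabilities are hypothetical, and the code does not use any function of the package.

```r
set.seed(1)

# Hypothetical setting: Q = 2 functional groups.
n <- c(30, 20)                       # group sizes n_1, n_2
K <- c(2, 3)                         # numbers of blocks K_1, K_2
pi_list <- list(c(0.6, 0.4),         # block proportions of group 1
                c(0.5, 0.3, 0.2))    # block proportions of group 2

# Latent block memberships: independent draws with P(Z = k) = pi_k.
Z <- lapply(1:2, function(q) {
  sample(K[q], n[q], replace = TRUE, prob = pi_list[[q]])
})

# Connection probabilities between blocks of group 1 (rows)
# and blocks of group 2 (columns).
theta12 <- matrix(c(0.90, 0.10, 0.50,
                    0.05, 0.80, 0.30), nrow = 2, byrow = TRUE)

# Entry (i, i') has connection probability theta12[Z1[i], Z2[i']].
P12 <- theta12[Z[[1]], Z[[2]]]

# Conditionally on the memberships, entries are independent Bernoulli draws.
X12 <- matrix(rbinom(length(P12), size = 1, prob = P12), nrow = n[1])
```

Indexing `theta12` by the two membership vectors expands the block-level parameters into an individual-level probability matrix, which is exactly the statement that an entry depends only on the blocks of its two endpoints.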
For any pair $(q,q')$, $F_{qq'}(\cdot)$ is either:
- Bernoulli, resulting in binary interactions
- Poisson for weighted networks of counts
- Gaussian or Laplace for continuous weighted networks.
As a consequence, the collection of networks may contain weighted and/or binary networks.
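As an illustration of the emission families listed above, the following base R helper draws entries from each of them. The function name `draw_entries` and the parametrizations (mean and variance for the Gaussian, location and scale for the Laplace) are assumptions made for this sketch and may differ from the conventions used inside the package.

```r
# Hypothetical helper: draw n entries from one emission family.
# theta is a scalar for "bernoulli" and "poisson"; for "gaussian" it is
# assumed to be c(mean, variance), for "laplace" c(location, scale).
draw_entries <- function(family, n, theta) {
  switch(family,
    bernoulli = rbinom(n, size = 1, prob = theta),
    poisson   = rpois(n, lambda = theta),
    gaussian  = rnorm(n, mean = theta[1], sd = sqrt(theta[2])),
    # The difference of two independent Exp(1) variables is a standard
    # Laplace variable; it is then scaled and shifted.
    laplace   = theta[1] + theta[2] * (rexp(n) - rexp(n)),
    stop("unknown family: ", family)
  )
}

set.seed(42)
binary_edges <- draw_entries("bernoulli", 5, 0.3)      # values in {0, 1}
count_edges  <- draw_entries("poisson", 5, 2.5)        # non-negative counts
real_edges   <- draw_entries("gaussian", 5, c(0, 1))   # real-valued weights
```

Because each network of the collection has its own family, one pair of functional groups can be binary while another carries counts or continuous weights.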
Inference consists of selecting the numbers of clusters $(K_q)_{q=1,\dots,Q}$ and estimating the parameters $(\theta^{qq'})$. Model selection is performed with the ICL, a penalized likelihood criterion. The parameters are estimated with a variational version of the EM algorithm. The estimation procedure also provides a clustering of the entities at stake.
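To give an intuition for the parameter estimation step, here is a simplified base R sketch for a single bipartite Bernoulli network. It assumes a known hard clustering, whereas the variational EM algorithm works with approximate posterior membership probabilities; it is therefore not the package's estimation procedure, and the function name `estimate_block_params` is hypothetical.

```r
# Given a hard clustering of rows and columns, estimate the row block
# proportions and the block connection probabilities by empirical frequencies.
estimate_block_params <- function(X, z_row, z_col, K_row, K_col) {
  pi_row <- tabulate(z_row, nbins = K_row) / length(z_row)
  theta <- matrix(NA_real_, K_row, K_col)
  for (k in seq_len(K_row)) {
    for (l in seq_len(K_col)) {
      block <- X[z_row == k, z_col == l, drop = FALSE]
      # Mean of the entries in block (k, l); left as NA if the block is empty.
      if (length(block) > 0) theta[k, l] <- mean(block)
    }
  }
  list(pi_row = pi_row, theta = theta)
}

# Small incidence matrix with an obvious two-block structure.
X <- matrix(c(1, 1, 0,
              1, 1, 0,
              0, 0, 1), nrow = 3, byrow = TRUE)
est <- estimate_block_params(X, z_row = c(1, 1, 2), z_col = c(1, 1, 2),
                             K_row = 2, K_col = 2)
est$pi_row
est$theta
```

For a simple network (the same functional group on rows and columns), the diagonal entries would additionally have to be excluded from the block averages; the sketch above only covers the bipartite case.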