family
objects: exponential family of distributionsThe enrichwith R package provides the enrich
method to enrich list-like R objects with new, relevant components. The resulting objects preserve their class, so all methods associated with them still apply.
This vignette is a demo of the available enrichment options for family
objects, focusing on objects that correspond to members of the exponential family of distributions.
family
objectsfamily
objects specify characteristics of the models used by functions such as glm
. The families implemented in the stats
package include binomial
, gaussian
, Gamma
, inverse.gaussian
, and poisson
, which obvious corresponding distributions. These distributions are all special cases of the exponential family of distributions with probability mass or density function of the form fY(y)=exp{yθ−b(θ)−c1(y)ϕ/m−12a(−mϕ)+c2(y)} for some sufficiently smooth functions b(.), c1(.), a(.) and c2(.), and a fixed weight m. The expected value and the variance of Y are then E(Y)=μ=b′(θ)Var(Y)=ϕmb″ where V(\mu) and \phi are the variance function and the dispersion parameter, respectively. Below we list the characteristics of the distributions supported by family
objects.
\theta = \mu, \displaystyle b(\theta) = \frac{\theta^2}{2}, \displaystyle c_1(y) = \frac{y^2}{2}, \displaystyle a(\zeta) = -\log(-\zeta), \displaystyle c_2(y) = -\frac{1}{2}\log(2\pi)
\displaystyle \theta = \log\frac{\mu}{1- \mu}, \displaystyle b(\theta) = \log(1 + e^\theta), \displaystyle \phi = 1, \displaystyle c_1(y) = 0, \displaystyle a(\zeta) = 0, \displaystyle c_2(y) = \log{m\choose{my}}
\displaystyle \theta = \log\mu, \displaystyle b(\theta) = e^\theta, \displaystyle \phi = 1, \displaystyle c_1(y) = 0, \displaystyle a(\zeta) = 0, \displaystyle c_2(y) = -\log\Gamma(y + 1)
\displaystyle \theta = -\frac{1}{\mu}, \displaystyle b(\theta) = -\log(-\theta), \displaystyle c_1(y) = -\log y, \displaystyle a(\zeta) = 2 \log \Gamma(-\zeta) + 2 \zeta \log\left(-\zeta\right), \displaystyle c_2(y) = -\log y
\displaystyle \theta = -\frac{1}{2\mu^2}, \displaystyle b(\theta) = -\sqrt{-2\theta}, \displaystyle c_1(y) = \frac{1}{2y}, \displaystyle a(\zeta) = -\log(-\zeta), \displaystyle c_2(y) = -\frac{1}{2}\log\left(\pi y^3\right)
family
objectsfamily
objects provide functions for the variance function (variance
), a specification of deviance residuals (dev.resids
) and the Akaike information criterion (aic
). For example
## function (y, mu, wt)
## wt * ((y - mu)^2)/(y * mu^2)
## <bytecode: 0x7fbe690297d0>
## <environment: 0x7fbe6902f4c0>
## function (mu)
## mu^3
## <bytecode: 0x7fbe69029990>
## <environment: 0x7fbe5f0335b0>
## function (y, n, mu, wt, dev)
## sum(wt) * (log(dev/sum(wt) * 2 * pi) + 1) + 3 * sum(log(y) *
## wt) + 2
## <bytecode: 0x7fbe69029108>
## <environment: 0x7fbe5f08ea50>
family
objectsThe enrichwith R package provides methods for the enrichment of family
objects with a function that links the natural parameter \theta with \mu, the function b(\theta), the first two derivatives of V(\mu), a(\zeta) and its first four derivatives, and c_1(y) and c_2(y). To illustrate, let’s write a function that reconstructs the densities and probability mass functions from the components that result from enrichment
library("enrichwith")
dens <- function(y, m = 1, mu, phi, family) {
object <- enrich(family)
with(object, {
c2 <- if (family == "binomial") c2fun(y, m) else c2fun(y)
exp(m * (y * theta(mu) - bfun(theta(mu)) - c1fun(y))/phi -
0.5 * afun(-m/phi) + c2)
})
}
The following chunks test dens
for a few distributions against the standard d*
functions
## Normal
all.equal(dens(y = 0.2, m = 3, mu = 1, phi = 3.22, gaussian()),
dnorm(x = 0.2, mean = 1, sd = sqrt(3.22/3)))
## [1] TRUE
## Gamma
all.equal(dens(y = 3, m = 1.44, mu = 2.3, phi = 1.3, Gamma()),
dgamma(x = 3, shape = 1.44/1.3, 1.44/(1.3 * 2.3)))
## [1] TRUE
## Inverse gaussian
all.equal(dens(y = 0.2, m = 7.23, mu = 1, phi = 3.22, inverse.gaussian()),
SuppDists::dinvGauss(0.2, nu = 1, lambda = 7.23/3.22))
## [1] TRUE
## Binomial
all.equal(dens(y = 0.34, m = 100, mu = 0.32, phi = 1, binomial()),
dbinom(x = 34, size = 100, prob = 0.32))
## [1] TRUE