The ginormal
package provides the density function and
random variable generation from the generalized inverse normal (GIN)
distribution introduced by Robert (1991). The GIN
distribution is a way to generalize the distribution of the reciprocal
of a normal random variable. That is, the distribution generalizes the
distribution of the random variable \(Z =
1/X\) where \(X \sim \text{Normal}(\mu,
\sigma^2)\). This distribution is different from the
generalized inverse Gaussian (GIG) distribution (Jørgensen,
2012) despite the similarities in naming (see below).
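As a short worked step (a sketch, assuming \(\sigma > 0\) and writing \(f_X\) for the normal density), the change of variables \(x = 1/z\), with Jacobian \(|dx/dz| = z^{-2}\), gives the density of the reciprocal \(Z = 1/X\): \[f_Z(z) = f_X(1/z) \, z^{-2} = \frac{1}{\sqrt{2\pi}\sigma} |z|^{-2} \exp\left[-\frac{1}{2\sigma^2}\left(\frac{1}{z} - \mu\right)^2\right], \quad z \neq 0.\] This has the form of the GIN kernel given further below with \(\alpha = 2\) and \(\tau = \sigma\). Other values of \(\alpha\) change the power of \(|z|\), which is the sense in which the GIN distribution generalizes the reciprocal of a normal.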
The GIN distribution is supported on the entire real line \(z \in (-\infty, \infty)\) and takes three parameters: \(\alpha > 1\), the degrees-of-freedom parameter; \(\mu \in (-\infty, \infty)\); and \(\tau > 0\).
This package is the first to provide an efficient sampling algorithm for drawing from the GIN distribution. We provide similar routines for the GIN distribution truncated to the positive or negative reals. Further details of the distribution, theoretical guarantees and pseudo-code for the sampling algorithms, as well as an application to Bayesian estimation of network formation models can be found in the working paper Ding, Estrada and Montoya-Blandón (2023).
To install, type from within R:
install.packages("ginormal")
You can also install directly from the GitHub repository. To do so from within R, first install the devtools package and then type:
# install.packages("devtools") # Uncomment if devtools not installed
library(devtools)
install_github(repo = "smonto2/ginormal", subdir = "R_package", ref = "main")
Examples of how to use the ginormal package routines are available in the GitHub repository.
Provided with the package are four main routines:
dgin(z, alpha, mu, tau, log = TRUE, quasi = FALSE)
dtgin(z, alpha, mu, tau, sign, log = TRUE, quasi = FALSE)
rgin(size, alpha, mu, tau, algo)
rtgin(size, alpha, mu, tau, sign, algo)
The first two compute the densities and the last two are used for random number generation. Density routines take in the quantile z; parameters alpha, mu and tau; and two optional logical arguments:
- log, should the logarithm of the density be returned? Defaults to TRUE.
- quasi, should the value of the kernel (or quasi-density) be returned? Defaults to FALSE.
Generation routines take the same parameters but require a size argument determining the number of random variates to generate. These routines only admit a parameter alpha larger than 2. They take an additional argument algo, which can be either "hormann" or "leydold", and defaults to "hormann" as our preferred method. See below for details on both points.
Routines with "t" in their name work for the truncated variants. They take an additional logical argument sign, where sign = TRUE implies truncation to positive numbers \((z > 0)\) and sign = FALSE to negative numbers \((z < 0)\).
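The calls below are a minimal usage sketch based only on the signatures listed above. The parameter values are arbitrary, the scalar quantile is an assumption (whether z may be a vector is not stated here), and the block is skipped when the ginormal package is not installed.

```r
# Usage sketch; argument names follow the signatures listed above.
if (requireNamespace("ginormal", quietly = TRUE)) {
  library(ginormal)

  # Log-density (the default) and density of GIN(alpha = 3, mu = 1, tau = 2) at z = 0.5
  log_dens <- dgin(0.5, alpha = 3, mu = 1, tau = 2)
  dens <- dgin(0.5, alpha = 3, mu = 1, tau = 2, log = FALSE)

  # Kernel (quasi-density) instead of the normalized density
  kern <- dgin(0.5, alpha = 3, mu = 1, tau = 2, log = FALSE, quasi = TRUE)

  # 1,000 draws from the GIN distribution (alpha must be larger than 2)
  draws <- rgin(1000, alpha = 3, mu = 1, tau = 2, algo = "hormann")

  # Density and draws for the truncation to z > 0
  dens_pos <- dtgin(0.5, alpha = 3, mu = 1, tau = 2, sign = TRUE, log = FALSE)
  draws_pos <- rtgin(1000, alpha = 3, mu = 1, tau = 2, sign = TRUE, algo = "leydold")
}
```

Passing log = FALSE explicitly matters here because the density routines return the logarithm by default.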
Let \(Z \sim \text{GIN}(\alpha, \mu, \tau)\). The GIN density function is given by \[f_Z(z) = \frac{1}{C(\alpha, \mu, \tau)} |z|^{-\alpha}\exp\left[-\frac{1}{2\tau^2} \left( \frac{1}{z} - \mu \right)^2 \right] \equiv \frac{g(z; \alpha, \mu, \tau)}{C(\alpha, \mu, \tau)}\] where \(g(z; \alpha, \mu, \tau)\) is the kernel or quasi-density and the proportionality constant can be written in closed form as \[ C(\alpha, \mu, \tau) = (\sqrt{2} \tau)^{\alpha-1} \exp\left(- \frac{\mu^2}{2\tau^2} \right) \Gamma\left(\frac{\alpha-1}{2}\right) {}_1F_1 \left(\frac{\alpha-1}{2}; \frac{1}{2}; \frac{\mu^2}{2\tau^2}\right) \]
where \(\Gamma(x)\) is the Gamma function and \({}_1F_1(a; b; x)\) is the confluent hypergeometric function. In addition to the density and generation routines for the GIN distribution, we provide similar routines for the GIN distribution truncated to positive or negative numbers. These are denoted by \(\text{GIN}^{+}\) when truncated to \((0, \infty)\) and by \(\text{GIN}^{-}\) when truncated to \((-\infty, 0)\). Let \(Z^{+} \sim \text{GIN}^{+}(\alpha, \mu, \tau)\) and \(Z^{-} \sim \text{GIN}^{-}(\alpha, \mu, \tau)\). Their densities are given by \[f_{Z^{+}}(z) = \frac{g(z; \alpha, \mu, \tau)}{C^{+}(\alpha, \mu, \tau)} \mathbb{I}(z > 0)\] \[f_{Z^{-}}(z) = \frac{g(z; \alpha, \mu, \tau)}{C^{-}(\alpha, \mu, \tau)} \mathbb{I}(z < 0)\] with proportionality constants \[C^{+}(\alpha, \mu, \tau) = \tau^{\alpha-1} e^{-\frac{\mu^2}{4\tau^2}} \Gamma(\alpha - 1) D_{-(\alpha-1)}\left(-\frac{\mu}{\tau}\right)\] \[C^{-}(\alpha, \mu, \tau) = \tau^{\alpha-1} e^{-\frac{\mu^2}{4\tau^2}} \Gamma(\alpha - 1) D_{-(\alpha-1)}\left(\frac{\mu}{\tau}\right)\] where \(\mathbb{I}(\cdot)\) is the indicator function that is 1 when its argument is true and 0 otherwise, and \(D_\nu(x)\) is the parabolic cylinder function. [1]
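The normalizing constants can be checked numerically without any special functions. The sketch below is not the package's implementation: gin_constants is a hypothetical helper, and it assumes that base R's integrate is accurate enough for moderate parameter values. It integrates the kernel after the substitution \(x = 1/z\), which turns \(g(z; \alpha, \mu, \tau)\,dz\) into \(|x|^{\alpha-2} \exp[-(x-\mu)^2/(2\tau^2)]\,dx\).

```r
# Hypothetical helper (not part of ginormal): normalizing constants of the
# GIN kernel by numerical integration in x = 1/z.
gin_constants <- function(alpha, mu, tau) {
  stopifnot(alpha > 1, tau > 0)
  h <- function(x) abs(x)^(alpha - 2) * exp(-(x - mu)^2 / (2 * tau^2))
  c_pos <- integrate(h, 0, Inf)$value   # C^+: kernel mass on z > 0
  c_neg <- integrate(h, -Inf, 0)$value  # C^-: kernel mass on z < 0
  c(total = c_pos + c_neg, pos = c_pos, neg = c_neg)
}

consts <- gin_constants(alpha = 3, mu = 1, tau = 2)

# Check 1: for mu = 0 the hypergeometric factor equals 1, so
# C = (sqrt(2) * tau)^(alpha - 1) * gamma((alpha - 1) / 2)
c0 <- gin_constants(alpha = 3, mu = 0, tau = 2)
closed_form <- (sqrt(2) * 2)^(3 - 1) * gamma((3 - 1) / 2)

# Check 2: for alpha = 2 the kernel is that of the reciprocal of a normal,
# so C = sqrt(2 * pi) * tau whatever the value of mu
c2 <- gin_constants(alpha = 2, mu = 1, tau = 2)
```

Dividing the kernel by consts["total"], consts["pos"] or consts["neg"] then gives the GIN, \(\text{GIN}^{+}\) or \(\text{GIN}^{-}\) density, respectively.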
Ding, Estrada and Montoya-Blandón (2023) provide an efficient sampling algorithm for the GIN distribution and its truncated variants for the case of \(\alpha > 2\). This restriction is not of concern if the goal is to perform Bayesian estimation using this distribution (see below for more details and Remark 2 in the paper). Generation is done using the ratio-of-uniforms method with mode shift (Kinderman and Monahan, 1977), which requires the computation of the minimal bounding rectangle. We implement two alternatives found in the literature, selected through the algo argument: "hormann" (the default) and "leydold".
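To illustrate the idea, here is a minimal sketch of the basic ratio-of-uniforms method for the \(\text{GIN}^{+}\) case. It is not the package's algorithm: it omits the mode shift, rtgin_pos_sketch is a hypothetical name, and the bounding rectangle comes from our own maximization of \(g(z)\) and \(z^2 g(z)\) on \(z > 0\), each of which reduces to a quadratic equation. The second maximization is where \(\alpha > 2\) is needed in this sketch.

```r
# Hypothetical sketch (not the ginormal implementation): basic
# ratio-of-uniforms sampler, without mode shift, for GIN truncated to z > 0.
rtgin_pos_sketch <- function(n, alpha, mu, tau) {
  stopifnot(alpha > 2, tau > 0)
  # Log of the kernel g(z; alpha, mu, tau) for z > 0
  log_g <- function(z) -alpha * log(z) - (1 / z - mu)^2 / (2 * tau^2)
  # Maximizer of g(z): positive root of alpha * tau^2 * z^2 + mu * z - 1 = 0
  z_u <- (-mu + sqrt(mu^2 + 4 * alpha * tau^2)) / (2 * alpha * tau^2)
  # Maximizer of z^2 * g(z): same quadratic with alpha - 2 in place of alpha
  z_v <- (-mu + sqrt(mu^2 + 4 * (alpha - 2) * tau^2)) /
    (2 * (alpha - 2) * tau^2)
  u_max <- exp(0.5 * log_g(z_u))        # sup of sqrt(g(z))
  v_max <- z_v * exp(0.5 * log_g(z_v))  # sup of z * sqrt(g(z))
  out <- numeric(0)
  while (length(out) < n) {
    m <- 2 * (n - length(out)) + 10     # propose in batches
    u <- runif(m, 0, u_max)
    v <- runif(m, 0, v_max)
    z <- v / u
    keep <- 2 * log(u) <= log_g(z)      # accept when u <= sqrt(g(v / u))
    out <- c(out, z[keep])
  }
  out[seq_len(n)]
}

set.seed(1)
z <- rtgin_pos_sketch(20000, alpha = 3, mu = 0, tau = 1)
```

One way to sanity-check the draws: for \(\alpha = 3\), \(\mu = 0\), \(\tau = 1\), the reciprocal \(1/Z^{+}\) has density proportional to \(x e^{-x^2/2}\) on \(x > 0\), a Rayleigh distribution with mean \(\sqrt{\pi/2}\), so mean(1 / z) should be close to that value.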
While the kernels (and therefore the sampling techniques) for the GIN and GIG distributions are similar, the two distributions differ in important ways. The main difference is their conceptualization: both attempt to generalize the idea of an inverse normal distribution, but in different ways. The GIG distribution does so by choosing cumulants that are inverses to those of the normal distribution. The GIN distribution does so by directly using the density of the reciprocal after a change of variables. Another important difference comes from their use as conjugate priors in Bayesian analysis:
These are both mixture models with a similar structure but carry different interpretations and thus require different posterior sampling algorithms. This interpretation also shows why the restriction of \(\alpha > 2\) is not binding if the goal is to perform Bayesian analysis. A prior \(\theta \sim \text{GIN}(\alpha_0, \mu_0, \tau_0)\) with \(\alpha_0 = 1 + \varepsilon\) is non-informative when \(\varepsilon > 0\) is arbitrarily small. However, the posterior distribution will have degrees-of-freedom parameter \(\alpha_N = N + 1 + \varepsilon\) where \(N\) is the sample size. For example, a single observation (\(N = 1\)) already gives \(\alpha_1 = 2 + \varepsilon > 2\). As \(N \geq 1\) implies \(\alpha_N > 2\), for a conjugate Bayesian analysis we are always drawing from the GIN distribution with \(\alpha > 2\).
[1] In R, package BAS contains the confluent hypergeometric function. For the parabolic cylinder function, we use a Fortran subroutine provided in the SPECFUN library (Zhang and Jin, 1996) and our own R translation of this function.