\(\newcommand{\P}{\mathbb{P}}\)
\(\newcommand{\R}{\mathbb{R}}\)
\( \newcommand{\N}{\mathbb{N}} \)
\(\newcommand{\E}{\mathbb{E}}\)
\(\newcommand{\var}{\text{var}}\)
\(\newcommand{\sd}{\text{sd}}\)
\(\newcommand{\skw}{\text{skew}}\)
\(\newcommand{\kur}{\text{kurt}}\)
\( \newcommand{\bs}{\boldsymbol} \)

- General Exponential Families

Suppose that \(X\) is a random variable taking values in \(S\), and that the distribution of \(X\) depends on an unspecified parameter \(\theta\) taking values in a parameter space \(\Theta\). In general, both \(X\) and \(\theta\) may be vector-valued. That is, as usual, \( S \) is typically either a countable set or a subset of \( \R^n \) for some \( n \in \N \), and the same statement holds for \( \Theta \). So the distribution of \( X \) might be discrete or continuous. In either case, let \(f_\theta\) denote the probability density function of \(X\) on \(S\), corresponding to \(\theta \in \Theta\).

The distribution of \(X\) is a \(k\)-parameter exponential family if \(S\) does not depend on \(\theta\) and if the probability density function can be written as \[ f_\theta(x) = \alpha(\theta) \, g(x) \, \exp \left( \sum_{i=1}^k \beta_i(\theta) \, h_i(x) \right); \quad x \in S, \; \theta \in \Theta \] where \(\alpha\) and \(\left(\beta_1, \beta_2, \ldots, \beta_k\right)\) are real-valued functions on \(\Theta\), and where \(g\) and \(\left(h_1, h_2, \ldots, h_k\right)\) are real-valued functions on \(S\). Moreover, \(k\) is assumed to be the smallest such integer.

The parameters \(\left(\beta_1(\theta), \beta_2(\theta), \ldots, \beta_k(\theta)\right)\) are sometimes called natural parameters of the distribution, and the random variables \(\left(h_1(X), h_2(X), \ldots, h_k(X)\right)\) are sometimes called natural statistics of the distribution. Although the definition may look intimidating, exponential families are useful because many important theoretical results in statistics hold for exponential families, and because many special parametric families of distributions turn out to be exponential families.
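To make the definition concrete, here is a minimal numerical sketch, assuming the usual normal density with mean \(\mu\) and standard deviation \(\sigma\). Expanding the square in the exponent suggests the choices \(\alpha(\mu, \sigma) = e^{-\mu^2 / 2 \sigma^2} \big/ \sqrt{2 \pi} \sigma\), \(g(x) = 1\), natural parameters \(\left(\mu / \sigma^2, -1 / 2 \sigma^2\right)\), and natural statistics \(\left(x, x^2\right)\). The function names are hypothetical and are used only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density in its usual form."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def normal_pdf_exponential_form(x, mu, sigma):
    """Same density written as alpha(theta) * g(x) * exp(sum_i beta_i(theta) * h_i(x))."""
    alpha = math.exp(-mu ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    g = 1.0
    betas = (mu / sigma ** 2, -1 / (2 * sigma ** 2))  # natural parameters
    hs = (x, x ** 2)                                  # natural statistics at x
    return alpha * g * math.exp(sum(b * h for b, h in zip(betas, hs)))

if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.7, 3.5):
        print(x, normal_pdf(x, 1.0, 2.0), normal_pdf_exponential_form(x, 1.0, 2.0))
```

The two columns printed for each \(x\) should agree up to floating-point rounding, which is consistent with the normal distribution being a two-parameter exponential family.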

The next result shows that if we sample from a distribution in an exponential family, then the distribution of the random sample is itself an exponential family with the same natural parameters, and with natural statistics obtained by summing the original natural statistics over the sample.

Suppose that the distribution of random variable \(X\) is a \(k\)-parameter exponential family with natural parameters \((\beta_1(\theta), \beta_2(\theta), \ldots, \beta_k(\theta))\), and natural statistics \((h_1(X), h_2(X), \ldots, h_k(X))\). Let \(\boldsymbol{X} = (X_1, X_2, \ldots, X_n)\) be a sequence of \(n\) independent random variables, each with the same distribution as \(X\). Then the distribution of \(\boldsymbol{X}\) is a \(k\)-parameter exponential family with natural parameters \((\beta_1(\theta), \beta_2(\theta), \ldots, \beta_k(\theta))\), and natural statistics \[ u_j(\boldsymbol{X}) = \sum_{i=1}^n h_j(X_i), \quad j \in \{1, 2, \ldots, k\} \]
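The sampling result can be checked numerically. The sketch below assumes the gamma density \(x^{s-1} e^{-x/b} \big/ \Gamma(s) b^s\) for \(x \in (0, \infty)\), with shape \(s\) and scale \(b\) (the symbol \(s\) is used here only to avoid a clash with the number of parameters \(k\)). For a single observation one possible factorization has natural parameters \((s - 1, -1/b)\) and natural statistics \((\ln x, x)\), so for a sample the natural statistics should be \(\sum_i \ln x_i\) and \(\sum_i x_i\). The function names are hypothetical.

```python
import math

def gamma_pdf(x, shape, scale):
    """Gamma density for x > 0 with the given shape and scale."""
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

def joint_pdf_direct(xs, shape, scale):
    """Joint density of an independent sample: the product of the marginal densities."""
    p = 1.0
    for x in xs:
        p *= gamma_pdf(x, shape, scale)
    return p

def joint_pdf_exponential_form(xs, shape, scale):
    """Joint density written as alpha(theta)^n * exp(beta_1 * u_1 + beta_2 * u_2), with g = 1."""
    n = len(xs)
    alpha = 1.0 / (math.gamma(shape) * scale ** shape)
    beta1, beta2 = shape - 1, -1.0 / scale   # same natural parameters as for one observation
    u1 = sum(math.log(x) for x in xs)        # sum of h_1(x_i) = ln x_i
    u2 = sum(xs)                             # sum of h_2(x_i) = x_i
    return alpha ** n * math.exp(beta1 * u1 + beta2 * u2)

if __name__ == "__main__":
    sample = [0.4, 1.3, 2.2, 0.9]
    print(joint_pdf_direct(sample, 2.5, 1.5))
    print(joint_pdf_exponential_form(sample, 2.5, 1.5))
```

The joint density depends on the data only through the two sums, however large the sample is, which is the practical content of the result.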

Many of the special distributions studied in this chapter are general exponential families, at least with respect to some of their parameters. When a parametric family fails to be a general exponential family, it is most often because the support set depends on the parameter. The following theorems give a number of examples. Proofs will be provided in the individual sections.

The Bernoulli distribution is a one-parameter exponential family in the success parameter \( p \in (0, 1) \).

The beta distribution is a two-parameter exponential family in the shape parameters \( a \in (0, \infty) \), \( b \in (0, \infty) \).

The beta prime distribution is a two-parameter exponential family in the shape parameters \( a \in (0, \infty) \), \( b \in (0, \infty) \).

The binomial distribution is a one-parameter exponential family in the success parameter \( p \in (0, 1) \) for a fixed value of the trial parameter \( n \in \N_+ \).

The chi-square distribution is a one-parameter exponential family in the degrees of freedom \( n \in (0, \infty) \).

The exponential distribution is a one-parameter exponential family (appropriately enough), in the rate parameter \( r \in (0, \infty) \).

The gamma distribution is a two-parameter exponential family in the shape parameter \( k \in (0, \infty) \) and the scale parameter \( b \in (0, \infty) \).

The geometric distribution is a one-parameter exponential family in the success probability \( p \in (0, 1) \).

The half normal distribution is a one-parameter exponential family in the scale parameter \( \sigma \in (0, \infty) \).

The Laplace distribution is a one-parameter exponential family in the scale parameter \( b \in (0, \infty) \) for a fixed value of the location parameter \( a \in \R \).

The Lévy distribution is a one-parameter exponential family in the scale parameter \( b \in (0, \infty) \) for a fixed value of the location parameter \( a \in \R \).

The logarithmic distribution is a one-parameter exponential family in the shape parameter \( p \in (0, 1) \).

The lognormal distribution is a two-parameter exponential family in the shape parameters \( \mu \in \R \), \( \sigma \in (0, \infty) \).

The Maxwell distribution is a one-parameter exponential family in the scale parameter \( b \in (0, \infty) \).

The \( k \)-dimensional multinomial distribution is a \( k \)-parameter exponential family in the probability parameters \( (p_1, p_2, \ldots, p_k) \) for a fixed value of the trial parameter \( n \in \N_+ \).

The \( k \)-dimensional multivariate normal distribution is a \( \frac{1}{2}(k^2 + 3 k) \)-parameter exponential family with respect to the mean vector \( \bs{\mu} \) and the variance-covariance matrix \( \bs{V} \).
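The parameter count \( \frac{1}{2}(k^2 + 3 k) \) can be read as the \( k \) entries of the mean vector plus the \( \frac{1}{2} k (k + 1) \) distinct entries of a symmetric \( k \times k \) matrix. The short sketch below just carries out that arithmetic; the function name is hypothetical.

```python
def mvn_parameter_count(k):
    """Number of free parameters of a k-dimensional multivariate normal distribution."""
    mean_entries = k                        # one mean per coordinate
    covariance_entries = k * (k + 1) // 2   # diagonal and upper triangle of a symmetric matrix
    return mean_entries + covariance_entries

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, mvn_parameter_count(k), (k * k + 3 * k) // 2)
```

For \( k = 1 \) the count is 2, which agrees with the ordinary normal distribution being a two-parameter exponential family.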

The negative binomial distribution is a one-parameter exponential family in the success parameter \( p \in (0, 1) \) for a fixed value of the stopping parameter \( k \in \N_+ \).

The normal distribution is a two-parameter exponential family in the mean \( \mu \in \R \) and the standard deviation \( \sigma \in (0, \infty) \).

The Pareto distribution is a one-parameter exponential family in the shape parameter for a fixed value of the scale parameter.
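The Pareto case illustrates both the factorization and the role of the support set. The sketch below assumes the standard density \( a b^a / x^{a+1} \) for \( x \in [b, \infty) \), with shape \( a \) and scale \( b \) (these symbols are assumptions, since none are given here). For fixed \( b \), one possible factorization is \( \alpha(a) = a b^a \), \( g(x) = 1 / x \), \( \beta(a) = -a \), \( h(x) = \ln x \). The support \( [b, \infty) \) moves with \( b \), which is why the scale has to be held fixed. The function names are hypothetical.

```python
import math

def pareto_pdf(x, shape, scale):
    """Pareto density: positive on [scale, infinity) and zero elsewhere."""
    if x < scale:
        return 0.0
    return shape * scale ** shape / x ** (shape + 1)

def pareto_pdf_exponential_form(x, shape, scale):
    """Same density as alpha(a) * g(x) * exp(beta(a) * h(x)), with the scale held fixed."""
    if x < scale:
        return 0.0
    alpha = shape * scale ** shape   # depends on the shape parameter only (scale is fixed)
    g = 1.0 / x
    beta = -shape                    # natural parameter
    h = math.log(x)                  # natural statistic
    return alpha * g * math.exp(beta * h)

if __name__ == "__main__":
    print(pareto_pdf(3.0, 2.0, 1.0), pareto_pdf_exponential_form(3.0, 2.0, 1.0))
    # Changing the scale changes where the density is positive:
    print(pareto_pdf(1.5, 2.0, 1.0) > 0, pareto_pdf(1.5, 2.0, 2.0) > 0)
```

At the point \( x = 1.5 \) the density is positive when the scale is 1 and zero when the scale is 2, so no single set \( S \) serves as the support for every value of the scale parameter.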

The Poisson distribution is a one-parameter exponential family.

The Rayleigh distribution is a one-parameter exponential family.

The U-power distribution is a one-parameter exponential family in the shape parameter, for fixed values of the location and scale parameters.

The Weibull distribution is a one-parameter exponential family in the scale parameter for a fixed value of the shape parameter.

The zeta distribution is a one-parameter exponential family.

The Wald distribution is a two-parameter exponential family.