\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\upa}{\uparrow}\) \(\newcommand{\Upa}{\Uparrow}\) \(\newcommand{\var}{\text{var}}\)
Reliability
8. The Subset Semigroup

5. Exponential Distributions

In this section, we explore the existence of exponential distributions for the semigroup \((S, \cdot)\) and the various sub-semigroups that were introduced in Section 4. Here is the first main result:

There are no memoryless distributions for \((S, \cdot)\), and hence no exponential distributions.

Details:

Suppose that \(X\) is a random variable in \(S\), so that \(X\) is a random, finite subset of \(\N_+\). By our basic support assumption, note that \(\P(i \in X) = \P(\{i\} \subseteq X) \gt 0\) for every \(i \in \N_+\). Suppose that \(X\) has a memoryless distribution for \((S, \cdot)\). Then for every \(i \in \N_+\) we have \begin{align*} \P(\{i\} \{i\} \subseteq X) & = \P(i \in X) \P(i \in X)\\ \P(\{i + 1\} \{i\} \subseteq X) & = \P(i + 1 \in X) \P(i \in X) \end{align*} But \(\{i\} \{i\} = \{i + 1\} \{i\} = \{i, i + 1\}\) by definition of the semigroup operation, so we must have \[\P(i + 1 \in X) = \P(i \in X)\] for every \(i \in \N_+\). Next, note that if \(i_1, \, i_2, \, \ldots, i_n \in \N_+\) with \(i_1 \lt i_2 \lt \cdots \lt i_n\) then by another application of the memoryless property, \begin{align*} \P(i_1 \in X, i_2 \in X, \ldots, i_n \in X) &= \P(\{i_1, i_2, \ldots, i_n \} \subseteq X) \\ &= \P(\{i_n\} \{i_{n - 1}\} \cdots \{i_1\} \subseteq X) \\ &= \P(i_1 \in X) \P(i_2 \in X) \cdots \P(i_n \in X) \end{align*} It follows that the events \(\left\{\{i \in X\}: i \in \N_+\right\}\) are independent with a common positive probability. By the Borel-Cantelli lemma, infinitely many of the events must occur with probability 1, so \(X\) is infinite—a contradiction.

Here's another proof that there are no exponential distributions. Suppose that \(X\) has an exponential distribution for \((S, \cdot)\) with reliability function \(F\). Then \(X\) is memoryless, so \(F(x y) = F(x) F(y)\) for \(x, \, y \in S\). But then \(\varphi = \ln F\) is a homomorphism from \((S, \cdot)\) into \((\R, +)\), and so by the dimension result for \((S, \cdot)\) in Section 4, \(\ln F = -c \#\) for some \(c \in (0, \infty)\). Equivalently, \(F(x) = e^{-c \#(x)}\) for \(x \in S\). But then \(\sum_{x \in S} F(x) = \infty\), since \(S\) contains infinitely many elements of each cardinality \(n \in \N_+\), so \(F\) cannot be normalized into a probability density function.

Subsemigroups

From Section 4 recall that \((S_k, \cdot)\) is a complete, positive sub-semigroup of \((S, \cdot)\) for each \(k \in \N\) where \(S_k = \{x \in S: \max(x) - \#(x) = k\}\). Although there are no exponential distributions on \(S\), we will show that the sub-semigroup \((S_k, \cdot)\) has a one-parameter family of exponential distributions for each \(k \in \N\). The case \(k = 0\) is trivial since \((S_0, \cdot)\) is isomorphic to \((\N, +)\).

The exponential distribution for \((S_0, \cdot)\) with rate \(\alpha \in (0, 1)\) has reliability function \(F\) and probability density function \(f\) of the form \(F(x) = (1 - \alpha)^{\#(x)}\) and \(f(x) = \alpha (1 - \alpha)^{\#(x)}\) for \(x \in S_0\). Equivalently, \(\#(X)\) has the geometric distribution on \(\N\) with success parameter \(\alpha\).

The next result concerns the interesting case when \(k \in \N_+\). Recall from Section 1 the sets \(S_{n, n + k} = \{x \in S: \#(x) = n, \max(x) = n + k\} = \{x \in S_k: \#(x) = n\}\), defined for \(n \in \N_+\).

For \(k \in \N_+\), random variable \(X\) has an exponential distribution for \((S_k, \cdot)\) if and only if \(X\) has probability density function \(f\) of the following form, for some \(\beta \in (0, 1)\): \[ f(x) = \beta^{k + 1} (1 - \beta)^{\#(x) - 1}, \quad x \in S_k \] The rate constant is \(\alpha = \beta^{k + 1} / (1 - \beta)\).

Details:

The function \(F(x) = (1 - \beta)^{\#(x)}\) takes values in \((0, 1)\) and satisfies \(F(x y) = F(x)F(y)\) for all \(x, \, y \in S_k\), since \(\#(x y) = \#(x) + \#(y)\). Moreover \begin{align*} \sum_{x \in S_k} F(x) &= \sum_{n = 1}^\infty \sum_{x \in S_{n, n + k}} F(x) = \sum_{n = 1}^\infty \sum_{x \in S_{n, n + k}} (1 - \beta)^n \\ &= \sum_{n = 1}^\infty \binom{n + k - 1}{n - 1} (1 - \beta)^n = \frac{1 - \beta}{\beta^{k + 1}} \end{align*} It follows from general results in Section 2.5 that \(f\) given above is the probability density function of an exponential distribution with rate \(\beta^{k+1} / (1 - \beta)\).
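The normalizing sum above can be checked numerically. The following sketch (plain Python, standard library only; the function name and truncation point are ours) groups the elements of \(S_k\) by cardinality \(n\), using the count \(\#(S_{n, n+k}) = \binom{n+k-1}{n-1}\), and compares a partial sum against the closed form \((1 - \beta)/\beta^{k+1}\):

```python
from math import comb, isclose

def normalizing_sum(beta, k, terms=500):
    # Partial sum of sum_{n >= 1} C(n+k-1, n-1) (1 - beta)^n, which groups
    # the elements of S_k by cardinality n: there are C(n+k-1, n-1) of them.
    return sum(comb(n + k - 1, n - 1) * (1 - beta) ** n
               for n in range(1, terms + 1))

for beta in (0.2, 0.5, 0.8):
    for k in (1, 2, 3):
        # closed form derived above: (1 - beta) / beta^(k+1)
        assert isclose(normalizing_sum(beta, k),
                       (1 - beta) / beta ** (k + 1), rel_tol=1e-9)
```

The geometric decay of \((1 - \beta)^n\) makes 500 terms far more than enough for the tolerance used here.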

Conversely, suppose now that \(F\) is a reliability function on \(S_k\) with the memoryless property. Then \(\varphi = \ln F\) is a homomorphism from \((S_k, \cdot)\) into \((\R, +)\). By the dimension result for \((S_k, \cdot)\) in Section 4, \(\ln F = c \#\) where \(c = \ln[F(\{k + 1\})]\). Hence \(F(x) = (1 - \beta)^{\#(x)}\) where \(\beta = 1 - F(\{k + 1\}) \in (0, 1)\).

For \(k \in \N_+\) and \(\beta \in (0, 1)\), we will refer to the distribution above as the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta\). The density function of \(X\) depends on \(x \in S_k\) only through \(N = \#(x)\), so it is natural to study the distribution of \(N\). Of course, by definition \(\max(X) = N + k\) on \(S_k\), so the distribution of \(N\) determines the distribution of \(\max(X)\). Once again, the case \(k = 0\) is trivial. If \(X\) has an exponential distribution on \((S_0, \cdot)\) then \(N\) has a geometric distribution on \(\N\), that is, an exponential distribution on \((\N, +)\).

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Then \(N - 1\) has the negative binomial distribution on \(\N\) with stopping parameter \(k + 1\) and success parameter \(\beta\): \begin{align*} \P(N = n) &= \binom{n + k - 1}{k} \beta^{k + 1} (1 - \beta)^{n - 1}, \quad n \in \N_+ \\ \E(N) &= \frac{1 + k (1 - \beta)}{\beta} \\ \var(N) &= (k + 1) \frac{(1 - \beta)}{\beta^2} \end{align*}

Details:

Note that \(N = n\) if and only if \(X \in S_{n, n + k}\). Hence the formula for the probability density function follows from the form of the density function of \(X\) and the fact that \(\#(S_{n, n + k}) = \binom{n + k - 1}{k}\). So \(N - 1\) has the negative binomial distribution on \(\N\) with stopping parameter \(k + 1\) and success parameter \(\beta\). The formulas for the mean and variance then follow from standard results.
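As a numerical check on the displayed density and moments (a standard-library Python sketch; the helper name and truncation point are ours), one can sum the density of \(N\) directly:

```python
from math import comb

def pmf_N(n, beta, k):
    # P(N = n) = C(n+k-1, k) beta^(k+1) (1 - beta)^(n-1), n = 1, 2, ...
    return comb(n + k - 1, k) * beta ** (k + 1) * (1 - beta) ** (n - 1)

beta, k = 0.3, 2
ns = list(range(1, 2000))                      # truncation; tail is negligible
probs = [pmf_N(n, beta, k) for n in ns]
mean = sum(n * p for n, p in zip(ns, probs))
second = sum(n * n * p for n, p in zip(ns, probs))

assert abs(sum(probs) - 1) < 1e-9
assert abs(mean - (1 + k * (1 - beta)) / beta) < 1e-6                  # E(N)
assert abs(second - mean ** 2 - (k + 1) * (1 - beta) / beta ** 2) < 1e-6  # var(N)
```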

Suppose that \(k \in \N\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). For \(n \in \N_+\), the conditional distribution of \(X\) given \(N = n\) is uniform on \(S_{n, n + k}\).

Details:

From the probability density functions of \(X\) and \(N\) given above, we have \[\P(X = x \mid N = n) = \frac{\P(X = x)}{\P(N = n)} = \frac{1}{\binom{n + k - 1}{n - 1}}, \quad x \in S_{n, n + k}\]
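Taken together, the last two results give a two-stage way to simulate the exponential distribution on \((S_k, \cdot)\): draw \(N\) from the shifted negative binomial distribution, then draw \(X\) uniformly from \(S_{N, N + k}\). The following Python sketch is illustrative only (the function name and the Bernoulli-trials sampler are our own choices, not part of the text):

```python
import random

def sample_X(beta, k, rng=random):
    # Stage 1: N - 1 is negative binomial -- count the failures before
    # the (k+1)st success in Bernoulli(beta) trials.
    failures, successes = 0, 0
    while successes < k + 1:
        if rng.random() < beta:
            successes += 1
        else:
            failures += 1
    n = failures + 1
    # Stage 2: given N = n, X is uniform on S_{n, n+k}: the forced maximum
    # n + k together with a uniform (n-1)-subset of {1, ..., n+k-1}.
    rest = rng.sample(range(1, n + k), n - 1)
    return frozenset(rest) | {n + k}

rng = random.Random(17)
samples = [sample_X(0.5, 1, rng) for _ in range(20000)]
assert all(max(x) - len(x) == 1 for x in samples)      # every sample is in S_1
mean_N = sum(len(x) for x in samples) / len(samples)
assert abs(mean_N - (1 + 1 * (1 - 0.5)) / 0.5) < 0.1   # E(N) = 3 here
```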

The app below is a simulation of the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta\). The parameters \(k\) and \(\beta\) can be varied with scrollbars, and the rate constant \(\alpha\) is also shown. The probability density function of \(N\) is shown in the distribution plot and table.

It is easy to see from the Proposition that for each \(k \in \N_+\), \(\E(N)\) is a strictly decreasing function of \(\beta\) and maps \((0, 1)\) onto \((1, \infty)\). Thus, the exponential distribution on \(S_k\) can be re-parameterized by expected cardinality. Moreover, the exponential distribution maximizes entropy with respect to this parameter:

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Then \(X\) maximizes entropy over all random variables \(Y\) on \(S_k\) with \[\E[\#(Y)] = \E[\#(X)] = \frac{1 + k (1 - \beta)}{\beta}\]

Details:

We use the usual inequality for entropy in Section 1.3: if \(f\) and \(g\) are the probability density functions of \(X\) and \(Y\), respectively, then \[ -\sum_{x \in S_k} g(x) \ln[g(x)] \leq -\sum_{x \in S_k} g(x) \ln[f(x)] \] If \(X\) has the exponential distribution with parameter \(\beta\), and \(\E[\#(Y)] = \E[\#(X)]\), then substituting into the right side of the inequality above we see that the entropy of \(Y\) is bounded above by \[-(k + 1) \ln \beta + \ln (1 - \beta) - \ln(1 - \beta) \frac{1 + k (1 - \beta)}{\beta}\] Of course, the entropy of \(X\) achieves this upper bound.
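Since the density of \(X\) depends on \(x\) only through \(\#(x)\), the entropy of \(X\) can be computed by grouping \(S_k\) by cardinality. The following sketch (our own check; the truncation point is arbitrary but ample) compares that entropy with the upper bound displayed above:

```python
from math import comb, log

beta, k = 0.5, 1
H = 0.0
for n in range(1, 500):
    # common density value on S_{n, n+k}, which has C(n+k-1, n-1) elements
    f = beta ** (k + 1) * (1 - beta) ** (n - 1)
    H -= comb(n + k - 1, n - 1) * f * log(f)

bound = (-(k + 1) * log(beta) + log(1 - beta)
         - log(1 - beta) * (1 + k * (1 - beta)) / beta)
assert abs(H - bound) < 1e-9   # the exponential distribution attains the bound
```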

Here are a couple of open problems:

Suppose that \(k \in \N_+\) and that \(\bs{Y} = (Y_1, Y_2, \ldots)\) is the random walk on \((S_k, \cdot)\) associated with the exponential distribution with parameter \(\beta \in (0, 1)\). Find a closed form expression for the density function \(f_n\) of \(Y_n\) for \(n \in \{2, 3, \ldots\}\). This is equivalent to the problem in Section 2 of finding the path function of order \(n - 1\) for the graph \((S_k, \subset)\).

Suppose that \(k \in \N_+\) and that \(X\) has an exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Compute the hitting probability function on \(S\). That is, compute \(\P(X \cap x \neq \emptyset)\) for \(x \in S\).

For \(k \in \N_+\), the exponential distributions for \((S_k, \cdot)\) also have constant rate for the various graphs associated with the semigroup: the strict partial order graph \((S_k, \subset)\), the partial order graph \((S_k, \subseteq)\), the covering graph \((S_k, \upa)\) of \((S_k, \subseteq)\), and the reflexive closure \((S_k, \Upa)\) of \((S_k, \upa)\). We have seen this pattern before, with the standard discrete semigroup in Section 5.1 and the free semigroup in Section 5.3.

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\).

  1. For the graph \((S_k, \subset)\), \(X\) has constant rate \(\beta^{k+1} / (1 - \beta)\) and reliability function \(F\) given by \[F(x) = (1 - \beta)^{\#(x)}, \quad x \in S_k\]
  2. For the graph \((S_k, \subseteq)\), \(X\) has constant rate \(\beta^{k+1} / [1 - \beta + \beta^{k+1}]\) and reliability function \(F\) given by \[F(x) = [1 - \beta + \beta^{k+1}] (1 - \beta)^{\#(x) - 1}, \quad x \in S_k\]
  3. For the graph \((S_k, \upa)\), \(X\) has constant rate \(1 / [(1 - \beta) (2 - \beta)^k]\) and reliability function \(F\) given by \[F(x) = (2 - \beta)^k \beta^{k+1} (1 - \beta)^{\#(x)}, \quad x \in S_k\]
  4. For the graph \((S_k, \Upa)\), \(X\) has constant rate \(1 / [(1 - \beta) (2 - \beta)^k + 1]\) and reliability function \(F\) given by \[F(x) = [(1 - \beta) (2 - \beta)^k + 1] \beta^{k+1} (1 - \beta)^{\#(x) - 1}, \quad x \in S_k\]
Details:
  1. This follows immediately from the exponential property of \(X\), since the relation associated with the positive semigroup \((S_k, \cdot)\) is the strict partial order \(\subset\).
  2. This follows from (a) and standard results for reflexive closure in Section 1.6.
  3. As before, let \(M_k\) denote the set of irreducible elements of \((S_k, \cdot)\) and let \(M_{n, k} = \{y \in M_k: \#(y) = n\}\) for \(n \in \{1, 2, \ldots, k + 1\}\). As noted in Section 4, \(\{M_{n, k}: n \in \{1, 2, \ldots, k + 1\}\}\) partitions \(M_k\) and \(\#(M_{n, k}) = \binom{k}{n - 1}\). So the reliability function \(F\) of \(X\) for \((S_k, \upa)\) is given by \begin{align*} F(x) &= \sum_{y \in M_k} f(x \cdot y) = \sum_{n = 1}^{k + 1} \sum_{y \in M_{n, k}} f(x \cdot y) = \sum_{n = 1}^{k + 1} \beta^{k + 1} (1 - \beta)^{\#(x) + n - 1} \binom{k}{n - 1}\\ &= f(x) \sum_{n = 1}^{k + 1} (1 - \beta)^n \binom{k}{n - 1} = f(x) (1 - \beta) (2 - \beta)^k \end{align*}
  4. This follows from (c) and standard results for reflexive closure.
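The reliability functions and rate constants in parts (a) and (b) can be verified by brute-force enumeration. The sketch below uses our own helper names, enumerates \(S_k\) only up to a maximum cardinality (so the infinite sums are truncated and the tolerances are loose), and sums the density over strict supersets:

```python
from itertools import combinations
from math import isclose

def S_k_elements(k, n_max):
    # All x in S_k (that is, max(x) - #(x) = k) with #(x) <= n_max: the
    # maximum n + k is forced, the rest is an (n-1)-subset of {1,...,n+k-1}.
    out = []
    for n in range(1, n_max + 1):
        m = n + k
        for rest in combinations(range(1, m), n - 1):
            out.append(frozenset(rest) | {m})
    return out

def f(x, beta, k):
    # exponential density on (S_k, .)
    return beta ** (k + 1) * (1 - beta) ** (len(x) - 1)

beta, k = 0.5, 1
elements = S_k_elements(k, 25)
x0 = frozenset({2})                                    # an element of S_1
F_strict = sum(f(y, beta, k) for y in elements if x0 < y)   # P(x0 proper subset of X)
assert isclose(F_strict, (1 - beta) ** len(x0), rel_tol=1e-4)
# constant rate for (S_k, strict inclusion): f/F = beta^(k+1) / (1 - beta)
assert isclose(f(x0, beta, k) / F_strict,
               beta ** (k + 1) / (1 - beta), rel_tol=1e-4)
# reflexive closure: F(x) = P(x subset of X) = F_strict + f(x)
F_weak = F_strict + f(x0, beta, k)
assert isclose(f(x0, beta, k) / F_weak,
               beta ** (k + 1) / (1 - beta + beta ** (k + 1)), rel_tol=1e-4)
```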

For \(k \in \N\), recall that \((T_k, \cdot)\) is a positive semigroup where \(T_k = S_k \cup \{\emptyset\}\).

Random variable \(X\) has an exponential distribution on \((T_k, \cdot)\) if and only if \(X\) has probability density function \(f\) of the following form, for some \(\beta \in (0, 1)\): \[f(x) = \frac{\beta^{k+1}}{1 - \beta + \beta^{k+1}} (1 - \beta)^{\#(x)}, \quad x \in T_k\] The rate constant is \( \beta^{k + 1} / [1 - \beta + \beta^{k + 1}] \).

Details:

This result follows from a proof essentially like the one given above for \((S_k, \cdot)\).

Note that \((T_0, \cdot) = (S_0, \cdot)\) and in this case of course, \( f(x) = \beta (1 - \beta)^{\#(x)} \) for \(x \in S_0\) as noted earlier. When \(k \in \N_+\) note that \(S_k = T_k^+\). In this case, if \(X\) has the exponential distribution on \((T_k, \cdot)\) then the conditional distribution of \(X\) given \(X \in S_k\) is the exponential distribution on \((S_k, \cdot)\), as required by the results in Section 2.6.
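A quick numerical check of the normalization on \(T_k\) (standard-library Python; the helper name and truncation point are ours, with the empty set contributing the single cardinality-0 term):

```python
from math import comb, isclose

def pmf_card_Tk(n, beta, k):
    # Total probability that #(X) = n under the exponential law on (T_k, .):
    # there are C(n+k-1, n-1) elements of cardinality n >= 1, plus the empty set.
    count = 1 if n == 0 else comb(n + k - 1, n - 1)
    return count * beta ** (k + 1) * (1 - beta) ** n / (1 - beta + beta ** (k + 1))

for beta in (0.3, 0.6):
    for k in (1, 2, 3):
        total = sum(pmf_card_Tk(n, beta, k) for n in range(0, 800))
        assert isclose(total, 1.0, rel_tol=1e-9)
```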

For \(k \in \N_+\), determine whether the exponential distribution on \((T_k, \cdot)\) with parameter \(\beta \in (0, 1)\) is compound Poisson (and therefore infinitely divisible).

Almost Exponential Distributions

From our work above, there are no exponential distributions on \((S, \cdot)\), but the sub-semigroup \((S_k, \cdot)\) has a one-parameter family of exponential distributions for each \(k \in \N\). We can define distributions that are close to exponential on \((S, \cdot)\) by forming mixtures of distributions that are exponential for \((S_k, \cdot)\) over \(k \in \N\).

Suppose that \(X\) takes values in \(S\) with probability density function \(f\) given by \[ f(x) = \begin{cases} p_0 \beta_0 (1 - \beta_0)^{\#(x)}, &\quad x \in S_0 \\ p_k \beta_k^{k + 1} (1 - \beta_k)^{\#(x) - 1}, &\quad x \in S_k, \, k \in \N_+ \end{cases} \] where \(\beta_k, \, p_k \in (0, 1)\) for each \(k \in \N\) and \(\sum_{k = 0}^\infty p_k = 1\). Then the conditional distribution of \(X\) given \(X \in S_k\) is exponential on \((S_k, \cdot)\), with parameter \(\beta_k\), for each \(k \in \N\).

The distribution of \(X\) is as close to exponential as possible, in the sense that \(X\) is essentially exponential on each of the sub-semigroups \(S_k\), and these semigroups partition \(S\). We can construct random variable \(X\) as a mixture in the usual way.

For \(k \in \N\), suppose that \(Y_k\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta_k \in (0, 1)\). Next, suppose that \(K\) takes values in \(\N\) with probability density function given by \(\P(K = k) = p_k\) for \(k \in \N\), and that \(K\) is independent of \(\bs{Y} = (Y_0, Y_1, \ldots)\). Then \(X = Y_K\) has the almost exponential distribution in the proposition above.

There is not much that we can say about the general distribution defined above. However, the special class of distributions studied in detail in Section 2 turns out to be almost exponential. Recall again that \(S_{n, m} = \{x \in S: \#(x) = n, \max(x) = m\}\), defined for \(m = n = 0\) and for \(n \in \N_+\) and \(m \in \{n, n + 1, \ldots\}\).

Consider random variable \(X\) in \(S\) defined as follows, where \(N = \#(X)\) and \(M = \max(X)\).

  1. \(N\) has the geometric distribution on \(\N\) with success parameter \(q \in (0, 1)\).
  2. Given \(N = 0\), \(M = 0\) also. Given \(N = n \in \N_+\), \(M\) has the negative binomial distribution on \(\{n, n + 1, \ldots\}\) with stopping parameter \(n\) and success parameter \(p \in (0, 1)\).
  3. Given \((N = n, M = m)\) where \(m = n = 0\) or \(n \in \N_+\) and \(m \in \{n, n + 1, \ldots\}\), \(X\) is uniformly distributed on \(S_{n, m}\).

Then \(X\) has probability density function \(f\) on \(S\) given by \(f(x) = q[p(1 - q)]^{\#(x)} (1 - p)^{\max(x) - \#(x)}\) for \(x \in S\), and has the almost exponential distribution on \(S\) as defined above, with \(\beta_k = 1 - p + p q\) for all \(k \in \N\) and with \[p_0 = \frac{q}{1 - p + p q}, \quad p_k = \frac{p q (1 - q)}{1 - p + p q} \left(\frac{1 - p}{1 - p + p q}\right)^k, \; k \in \N_+\]

Details:

The probability density function \(f\) of \(X\) was derived in Section 2. That \(X\) has an almost exponential distribution with the parameters given above follows by comparison with the almost exponential density function given earlier.
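The comparison can be made explicit. The following sketch (our own check; the names are hypothetical) verifies, term by term, that \(f(x) = q[p(1-q)]^{\#(x)}(1-p)^{\max(x)-\#(x)}\) factors into the mixture form with \(\beta_k = 1 - p + pq\) and the stated mixing probabilities:

```python
from math import isclose

p, q = 0.3, 0.6
beta = 1 - p + p * q                                  # common parameter beta_k
p0 = q / beta                                         # mixing probability p_0
def p_mix(k):                                         # mixing probability p_k, k >= 1
    return (p * q * (1 - q) / beta) * ((1 - p) / beta) ** k

def f_direct(n, k):
    # f(x) for x in S_k with #(x) = n, i.e. max(x) - #(x) = k
    return q * (p * (1 - q)) ** n * (1 - p) ** k

# k = 0: mixture density p_0 * beta * (1 - beta)^n
assert all(isclose(f_direct(n, 0), p0 * beta * (1 - beta) ** n)
           for n in range(0, 12))
# k >= 1: mixture density p_k * beta^(k+1) * (1 - beta)^(n-1)
assert all(isclose(f_direct(n, k),
                   p_mix(k) * beta ** (k + 1) * (1 - beta) ** (n - 1))
           for k in range(1, 12) for n in range(1, 12))
```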

Note that the mixing density \(P = (p_k: k \in \N)\) defines a modified geometric distribution on \(\N\), with success parameters \(q / (1 - p + p q)\) and \(p q / (1 - p + p q)\), as defined in Section 4.2. In this chapter, Section 2 gives lots of additional information on this almost exponential distribution, including the reliability and rate functions of \(X\) for \((S, \subseteq)\), the distribution of \(M\) (also a modified geometric distribution on \(\N\)), the moments of \(N\) and \(M\), the covariance and correlation of \(M\) and \(N\), the hitting probability function of \(X\), an interesting Markov property of \(X\), and alternate ways to construct the distribution.

The app below is a simulation of the almost exponential distribution of \(X\) on \((S, \cdot)\), described above. The parameters \(p\) and \(q\) can be varied with scrollbars. The probability density functions of \(N = \#(X)\) and \(M = \max(X)\) are shown in the distribution plots and tables. The density function of the mixing distribution is also shown in a plot and table.