\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\upa}{\uparrow}\) \(\newcommand{\Upa}{\Uparrow}\) \(\newcommand{\var}{\text{var}}\)

3. Exponential Distributions

In this section, we explore the existence of exponential distributions for the semigroup \((S, \cdot)\) and the various sub-semigroups that were introduced in Section 2. Here is the first main result:

There are no memoryless distributions for \((S, \cdot)\), and hence no exponential distributions.

Details:

Suppose that \(X\) is a random variable in \(S\), so that \(X\) is a random, finite subset of \(\N_+\). By our basic support assumption, note that \(\P(i \in X) = \P(\{i\} \subseteq X) \gt 0\) for every \(i \in \N_+\). Suppose that \(X\) has a memoryless distribution for \((S, \cdot)\). Then for every \(i \in \N_+\) we have \begin{align*} \P(\{i\} \{i\} \subseteq X) & = \P(i \in X) \P(i \in X)\\ \P(\{i + 1\} \{i\} \subseteq X) & = \P(i + 1 \in X) \P(i \in X) \end{align*} But \(\{i\} \{i\} = \{i + 1\} \{i\} = \{i, i + 1\}\) by definition of the semigroup operation, so we must have \[\P(i + 1 \in X) = \P(i \in X)\] for every \(i \in \N_+\). Next, note that if \(i_1, \, i_2, \, \ldots, i_n \in \N_+\) with \(i_1 \lt i_2 \lt \cdots \lt i_n\) then by another application of the memoryless property, \begin{align*} \P(i_1 \in X, i_2 \in X, \ldots, i_n \in X) &= \P(\{i_1, i_2, \ldots, i_n \} \subseteq X) \\ &= \P(\{i_n\} \{i_{n - 1}\} \cdots \{i_1\} \subseteq X) \\ &= \P(i_1 \in X) \P(i_2 \in X) \cdots \P(i_n \in X) \end{align*} It follows that the events \(\left\{\{i \in X\}: i \in \N_+\right\}\) are independent with a common positive probability. By the Borel-Cantelli lemma, infinitely many of the events must occur with probability 1, so \(X\) is infinite—a contradiction.

Here's another proof that there are no exponential distributions. Suppose that \(X\) has an exponential distribution for \((S, \cdot)\) with reliability function \(F\). Then \(X\) is memoryless so \(F(x y) = F(x) F(y)\) for \(x, \, y \in S\). But then \(\varphi = \ln F\) is a homomorphism from \((S, \cdot)\) into \((\R, +)\) and so by the dimension result for \((S, \cdot)\) in Section 1, \(\ln F = -c \#\) for some \(c \in (0, \infty)\). Equivalently, \(F(x) = e^{-c \#(x)}\) for \(x \in S\). But then \(\sum_{x \in S} F(x) = \infty\), since for each \(n \in \N_+\) there are infinitely many \(x \in S\) with \(\#(x) = n\), and so \(F\) cannot be normalized into a probability density function.
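
The divergence can be made concrete numerically. Below is a minimal Python sketch (the value of \(c\) and the cutoffs \(M\) are arbitrary illustrative choices): restricting the sum to the nonempty subsets of \(\{1, 2, \ldots, M\}\) already produces partial sums that grow without bound as \(M\) increases.

```python
from math import comb, exp

# Partial sums of sum_{x in S} F(x) with F(x) = exp(-c * #(x)), restricted to
# nonempty subsets of {1, ..., M}: there are C(M, n) subsets of cardinality n,
# so the partial sum is sum_n C(M, n) exp(-c n) = (1 + exp(-c))^M - 1, which
# grows without bound as M increases. The value of c is an arbitrary choice.

c = 0.5
for M in (5, 10, 20, 40, 80):
    partial = sum(comb(M, n) * exp(-c * n) for n in range(1, M + 1))
    print(f"M={M}: partial sum = {partial:.2f}")
```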

From Section 2 recall that \((S_k, \cdot)\) is a complete, positive sub-semigroup of \((S, \cdot)\) for each \(k \in \N\) where \(S_k = \{x \in S: \max(x) - \#(x) = k\}\). Although there are no exponential distributions on \(S\), we will show that the sub-semigroup \((S_k, \cdot)\) has a one-parameter family of exponential distributions for each \(k \in \N\). The case \(k = 0\) is trivial since \((S_0, \cdot)\) is isomorphic to \((\N, +)\).

The exponential distribution for \((S_0, \cdot)\) with rate \(\alpha \in (0, 1)\) has reliability function \(F\) and probability density function \(f\) of the form \(F(x) = (1 - \alpha)^{\#(x)}\) and \(f(x) = \alpha (1 - \alpha)^{\#(x)}\) for \(x \in S_0\). Equivalently, \(\#(X)\) has the geometric distribution on \(\N\) with success parameter \(\alpha\).

The next result concerns the interesting case when \(k \in \N_+\).

For \(k \in \N_+\), random variable \(X\) has an exponential distribution for \((S_k, \cdot)\) if and only if \(X\) has probability density function \(f\) of the following form, for some \(\beta \in (0, 1)\): \[ f(x) = \beta^{k + 1} (1 - \beta)^{\#(x) - 1}, \quad x \in S_k \] The rate constant is \(\alpha = \beta^{k + 1} / (1 - \beta)\).

Details:

The function \(F(x) = (1 - \beta)^{\#(x)}\) takes values in \((0, 1)\) and satisfies \(F(x y) = F(x)F(y)\) for all \(x, \, y \in S_k\), since \(\#(x y) = \#(x) + \#(y)\). Moreover \begin{align*} \sum_{x \in S_k} F(x) &= \sum_{n = 1}^\infty \sum_{x \in S_{n, k}} F(x) = \sum_{n = 1}^\infty \sum_{x \in S_{n, k}} (1 - \beta)^n \\ &= \sum_{n = 1}^\infty \binom{n + k - 1}{n - 1} (1 - \beta)^n = \frac{1 - \beta}{\beta^{k + 1}} \end{align*} It follows from general results in Section 2.5 that \(f\) given above is the probability density function of an exponential distribution with rate \(\beta^{k+1} / (1 - \beta)\).
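
The binomial identity used in the normalization above can be checked numerically. Here is a minimal Python sketch; the parameter values and the truncation point are arbitrary illustrative choices.

```python
from math import comb

# Check of the identity
#   sum_{n >= 1} C(n+k-1, n-1) (1 - beta)^n = (1 - beta) / beta^(k+1)
# by truncated summation. Parameter values and the truncation point are arbitrary.

def truncated_sum(k: int, beta: float, n_max: int = 2000) -> float:
    return sum(comb(n + k - 1, n - 1) * (1 - beta) ** n for n in range(1, n_max + 1))

for k in (1, 2, 5):
    for beta in (0.3, 0.5, 0.8):
        lhs = truncated_sum(k, beta)
        rhs = (1 - beta) / beta ** (k + 1)
        print(f"k={k}, beta={beta}: truncated sum {lhs:.6f}, closed form {rhs:.6f}")
```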

Conversely, suppose now that \(F\) is a reliability function on \(S_k\) with the memoryless property. Then \(\varphi = \ln F\) is a homomorphism from \((S_k, \cdot)\) into \((\R, +)\). By the dimension result for \((S_k, \cdot)\) in Section 1, \(\ln F = c \#\) where \(c = \ln[F(\{k+1\})]\). Hence \(F(x) = (1 - \beta)^{\#(x)}\) where \(\beta = 1 - F(\{k + 1\}) \in (0, 1)\).

For \(k \in \N_+\) and \(\beta \in (0, 1)\), we will refer to the distribution above as the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta\). The density function of \(X\) depends on \(x \in S_k\) only through \(\#(x)\), so it is natural to study the distribution of \(N = \#(X)\). Of course, by definition \(\max(X) = N + k\) on \(S_k\), so the distribution of \(N\) determines the distribution of \(\max(X)\). Once again, the case \(k = 0\) is trivial: if \(X\) has an exponential distribution on \((S_0, \cdot)\) then \(N\) has a geometric distribution on \(\N\), that is, an exponential distribution on \((\N, +)\).

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Then \begin{align*} \P(N = n) &= \binom{n + k - 1}{k} \beta^{k + 1} (1 - \beta)^{n - 1}, \quad n \in \N_+ \\ \E(N) &= \frac{1 + k (1 - \beta)}{\beta} \\ \var(N) &= (k + 1) \frac{(1 - \beta)}{\beta^2} \end{align*}

Details:

Note that \(N = n\) if and only if \(X \in S_{n, k}\). Hence the formula for the probability density function follows from the form of \(f\) above and the fact that \(\#(S_{n, k}) = \binom{n + k - 1}{k}\). Note next that \(N - 1\) has the negative binomial distribution on \(\N\) with stopping parameter \(k + 1\) and success parameter \(\beta\). The formulas for the mean and variance then follow from standard results.
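
As a quick sanity check, the density, mean, and variance of \(N\) can be verified by truncated summation. A minimal Python sketch, with arbitrary illustrative parameter values:

```python
from math import comb

# Verify numerically that the density of N sums to 1 and that its mean and
# variance match the formulas above. Parameter values and the truncation
# point are arbitrary illustrative choices.

def check_N(k: int, beta: float, n_max: int = 2000) -> None:
    ns = range(1, n_max + 1)
    pmf = [comb(n + k - 1, k) * beta ** (k + 1) * (1 - beta) ** (n - 1) for n in ns]
    total = sum(pmf)
    mean = sum(n * p for n, p in zip(ns, pmf))
    var = sum(n * n * p for n, p in zip(ns, pmf)) - mean ** 2
    print(f"k={k}, beta={beta}: total {total:.6f}, "
          f"mean {mean:.4f} vs {(1 + k * (1 - beta)) / beta:.4f}, "
          f"var {var:.4f} vs {(k + 1) * (1 - beta) / beta ** 2:.4f}")

for k in (1, 3):
    for beta in (0.4, 0.7):
        check_N(k, beta)
```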

Suppose that \(k \in \N\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). For \(n \in \N_+\), the conditional distribution of \(X\) given \(N = n\) is uniform on \(S_{n,k}\).

Details:

From the probability density functions of \(X\) and \(N\) given above, we have \[\P(X = x \mid N = n) = \frac{\P(X = x)}{\P(N = n)} = \frac{1}{\binom{n + k - 1}{n - 1}}, \quad x \in S_{n,k}\]
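
Since a set \(x \in S_{n,k}\) has maximum \(n + k\) and so consists of \(n + k\) together with \(n - 1\) elements of \(\{1, \ldots, n + k - 1\}\), the count \(\#(S_{n,k}) = \binom{n + k - 1}{n - 1}\) can be confirmed by direct enumeration. A minimal Python sketch:

```python
from itertools import combinations
from math import comb

# Enumerate S_{n,k} = {x : #(x) = n, max(x) = n + k} directly: each such set is
# {n + k} together with n - 1 elements of {1, ..., n + k - 1}. Check that the
# count matches C(n+k-1, n-1), so the conditional distribution given N = n
# puts mass 1 / C(n+k-1, n-1) on each set.

def enumerate_S_nk(n: int, k: int):
    return [rest + (n + k,) for rest in combinations(range(1, n + k), n - 1)]

for k in (1, 2, 3):
    for n in (1, 2, 4):
        print(f"n={n}, k={k}: enumerated {len(enumerate_S_nk(n, k))}, "
              f"formula {comb(n + k - 1, n - 1)}")
```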

Open the simulation of the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta\). The parameters \(k\) and \(\beta\) can be varied with scrollbars, and the rate constant \(\alpha\) is also shown. The probability density function of \(N\) is shown in the distribution plot and table. Vary the parameters and note the shape and location of the probability density function. Run the simulation for various values of the parameters and note the random subsets generated. Compare the empirical density function to the probability density function.
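
Here is a minimal Python sketch of such a simulation, built from the results above: \(N - 1\) is drawn from the negative binomial distribution with stopping parameter \(k + 1\) and success parameter \(\beta\), and then \(X\) is drawn uniformly from \(S_{N, k}\). The parameter values, sample size, and seed are arbitrary illustrative choices.

```python
import numpy as np
from math import comb

# Simulate the exponential distribution on (S_k, ·) with parameter beta:
# draw N - 1 from the negative binomial distribution (stopping parameter k + 1,
# success parameter beta), then draw X uniformly from S_{N,k} by choosing
# N - 1 elements of {1, ..., N + k - 1} and adjoining the maximum N + k.
# Parameter values, sample size, and seed are arbitrary illustrative choices.

rng = np.random.default_rng(0)

def sample_X(k: int, beta: float) -> np.ndarray:
    n = rng.negative_binomial(k + 1, beta) + 1               # N = #(X)
    rest = rng.choice(np.arange(1, n + k), size=n - 1, replace=False)
    return np.sort(np.append(rest, n + k))

k, beta, reps = 2, 0.5, 10_000
cards = [len(sample_X(k, beta)) for _ in range(reps)]
counts = np.bincount(cards, minlength=8)
for n in range(1, 8):
    exact = comb(n + k - 1, k) * beta ** (k + 1) * (1 - beta) ** (n - 1)
    print(f"P(N = {n}): empirical {counts[n] / reps:.4f}, exact {exact:.4f}")
```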

It is easy to see from the formula for \(\E(N)\) above that for each \(k \in \N_+\), \(\E(N)\) is a strictly decreasing function of \(\beta\) and maps \((0, 1)\) onto \((1, \infty)\). Thus, the exponential distribution on \(S_k\) can be re-parameterized by expected cardinality. Moreover, the exponential distribution maximizes entropy with respect to this parameter:

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Then \(X\) maximizes entropy over all random variables \(Y\) on \(S_k\) with \[\E[\#(Y)] = \E[\#(X)] = \frac{1 + k (1 - \beta)}{\beta}\]

Details:

We use the usual inequality for entropy in Section 1.3: if \(f\) and \(g\) are the probability density functions of \(X\) and \(Y\), respectively, then \[ -\sum_{x \in S_k} g(x) \ln[g(x)] \leq -\sum_{x \in S_k} g(x) \ln[f(x)] \] If \(X\) has the exponential distribution with parameter \(\beta\), and \(\E[\#(Y)] = \E[\#(X)]\), then substituting into the right side of the inequality above we see that the entropy of \(Y\) is bounded above by \[-(k + 1) \ln \beta + \ln (1 - \beta) - \ln(1 - \beta) \frac{1 + k (1 - \beta)}{\beta}\] Of course, the entropy of \(X\) achieves this upper bound.
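
The upper bound, and the fact that the exponential distribution attains it, can be checked numerically by truncated summation over \(S_k\). A minimal Python sketch with arbitrary illustrative parameter values:

```python
from math import comb, exp, log

# Compute the entropy of the exponential distribution on (S_k, ·) by truncated
# summation over S_k and compare it with the closed-form bound above.
# Parameter values and the truncation point are arbitrary illustrative choices.

def entropy_exponential(k: int, beta: float, n_max: int = 2000) -> float:
    h = 0.0
    for n in range(1, n_max + 1):
        log_f = (k + 1) * log(beta) + (n - 1) * log(1 - beta)   # log f on S_{n,k}
        h -= comb(n + k - 1, k) * exp(log_f) * log_f
    return h

for k in (1, 3):
    for beta in (0.4, 0.7):
        bound = (-(k + 1) * log(beta) + log(1 - beta)
                 - log(1 - beta) * (1 + k * (1 - beta)) / beta)
        print(f"k={k}, beta={beta}: entropy {entropy_exponential(k, beta):.6f}, "
              f"bound {bound:.6f}")
```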

Here are a couple of open problems:

Suppose that \(k \in \N_+\) and that \(\bs{Y} = (Y_1, Y_2, \ldots)\) is the random walk on \((S_k, \cdot)\) associated with the exponential distribution with parameter \(\beta \in (0, 1)\). Find a closed form expression for the density function \(f_n\) of \(Y_n\) for \(n \in \{2, 3, \ldots\}\). This is equivalent to the problem in Section 2 of finding the path function of order \(n - 1\) for the graph \((S_k, \subset)\).

Suppose that \(k \in \N_+\) and that \(X\) has an exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\). Compute the hitting probability function on \(S\). That is, compute \(\P(X \cap x \neq \emptyset)\) for \(x \in S\).

For \(k \in \N_+\), the exponential distributions for \((S_k, \cdot)\) also have constant rate for the various graphs associated with the semigroup: the strict partial order graph \((S_k, \subset)\), the partial order graph \((S_k, \subseteq)\), the covering graph \((S_k, \upa)\) of \((S_k, \subseteq)\), and the reflexive closure \((S_k, \Upa)\) of \((S_k, \upa)\). We have seen this pattern before, with the standard discrete semigroup in Section 5.1 and the free semigroup in Section 5.3.

Suppose that \(k \in \N_+\) and that \(X\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta \in (0, 1)\).

  1. For the graph \((S_k, \subset)\), \(X\) has constant rate \(\beta^{k+1} / (1 - \beta)\) and reliability function \(F\) given by \[F(x) = (1 - \beta)^{\#(x)}, \quad x \in S_k\]
  2. For the graph \((S_k, \subseteq)\), \(X\) has constant rate \(\beta^{k+1} / [1 - \beta + \beta^{k+1}]\) and reliability function \(F\) given by \[F(x) = [1 - \beta + \beta^{k+1}] (1 - \beta)^{\#(x) - 1}, \quad x \in S_k\]
  3. For the graph \((S_k, \upa)\), \(X\) has constant rate \(1 / [(1 - \beta) (2 - \beta)^k]\) and reliability function \(F\) given by \[F(x) = (2 - \beta)^k \beta^{k+1} (1 - \beta)^{\#(x)}, \quad x \in S_k\]
  4. For the graph \((S_k, \Upa)\), \(X\) has constant rate \(1 / [(1 - \beta) (2 - \beta)^k + 1]\) and reliability function \(F\) given by \[F(x) = [(1 - \beta) (2 - \beta)^k + 1] \beta^{k+1} (1 - \beta)^{\#(x) - 1}, \quad x \in S_k\]
Details:
  1. This follows immediately from the exponential distribution result above, since the relation associated with the strict positive semigroup \((S_k, \cdot)\) is the strict partial order \(\subset\). (A numerical check of the rates in parts (a)–(d) is sketched after this list.)
  2. This follows from (a) and standard results for reflexive closure in Section 1.6.
  3. As before, let \(M_k\) denote the set of irreducible elements of \((S_k, \cdot)\) and let \(M_{n, k} = \{y \in M_k: \#(y) = n\}\) for \(n \in \{1, 2, \ldots, k + 1\}\). As noted in Section 2, \(\{M_{n, k}: n \in \{1, 2, \ldots, k + 1\}\}\) partitions \(M_k\) and \(\#(M_{n, k}) = \binom{k}{n - 1}\). So the reliability function \(F\) of \(X\) for \((S_k, \upa)\) is given by \begin{align*} F(x) &= \sum_{y \in M_k} f(x \cdot y) = \sum_{n = 1}^{k + 1} \sum_{y \in M_{n, k}} f(x \cdot y) = \sum_{n = 1}^{k + 1} \beta^{k + 1} (1 - \beta)^{\#(x) + n - 1} \binom{k}{n - 1}\\ &= f(x) \sum_{n = 1}^{k + 1} (1 - \beta)^n \binom{k}{n - 1} = f(x) (1 - \beta) (2 - \beta)^k \end{align*}
  4. This follows from (c) and standard results for reflexive closure.
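
Here is a minimal numerical consistency check of the rates in parts (a)–(d). It uses only the facts that \(\#(x y) = \#(x) + \#(y)\), the counts \(\#(S_{n, k}) = \binom{n + k - 1}{n - 1}\) and \(\#(M_{n, k}) = \binom{k}{n - 1}\), and the assumption (left cancellation) that distinct factors give distinct products, so that \(F(x) / f(x)\) for each graph reduces to a sum depending only on \(k\) and \(\beta\). The parameter values are arbitrary illustrative choices.

```python
from math import comb

# Consistency check of the constant rates in (a)-(d): compute F(x)/f(x) for each
# graph as a sum depending only on k and beta, then compare f/F with the stated
# rate. Assumes left cancellation, so the sums below count each successor once.
# Parameter values and the truncation point are arbitrary illustrative choices.

def rate_checks(k: int, beta: float, n_max: int = 2000) -> None:
    # (a) strict order: F(x)/f(x) = sum over z in S_k of (1 - beta)^#(z)
    ratio_strict = sum(comb(n + k - 1, n - 1) * (1 - beta) ** n
                       for n in range(1, n_max + 1))
    # (c) covering graph: F(x)/f(x) = sum over irreducible y of (1 - beta)^#(y)
    ratio_cover = sum(comb(k, n - 1) * (1 - beta) ** n for n in range(1, k + 2))
    print(f"k={k}, beta={beta}")
    print(f"  (a) {1 / ratio_strict:.6f} vs {beta ** (k + 1) / (1 - beta):.6f}")
    print(f"  (b) {1 / (ratio_strict + 1):.6f} vs "
          f"{beta ** (k + 1) / (1 - beta + beta ** (k + 1)):.6f}")
    print(f"  (c) {1 / ratio_cover:.6f} vs {1 / ((1 - beta) * (2 - beta) ** k):.6f}")
    print(f"  (d) {1 / (ratio_cover + 1):.6f} vs "
          f"{1 / ((1 - beta) * (2 - beta) ** k + 1):.6f}")

rate_checks(2, 0.4)
rate_checks(3, 0.7)
```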

For \(k \in \N\), recall that \((T_k, \cdot)\) is a positive semigroup where \(T_k = S_k \cup \{\emptyset\}\).

Random variable \(X\) has an exponential distribution on \((T_k, \cdot)\) if and only if \(X\) has probability density function \(f\) of the following form, for some \(\beta \in (0, 1)\): \[f(x) = \frac{\beta^{k+1}}{1 - \beta + \beta^{k+1}} (1 - \beta)^{\#(x)}, \quad x \in T_k\] The rate constant is \( \beta^{k + 1} / [1 - \beta + \beta^{k + 1}] \).

Details:

This result follows from a proof essentially like the one given in the details above for \((S_k, \cdot)\).

Note that \((T_0, \cdot) = (S_0, \cdot) \) and in this case of course, \( f(x) = \beta (1 - \beta)^{\#(x)} \) for \(x \in S_0\) as noted earlier. When \(k \in \N_+\) note that \(S_k = T_k^+\). In this case, if \(X\) has the exponential distribution on \((T_k, \cdot)\) then the conditional distribution of \(X\) given \(X \in S_k\) has the exponential distribution on \((S_k, \cdot)\), as required by the results in Section 2.6.
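
The normalization on \(T_k\) and the conditioning relationship with \((S_k, \cdot)\) can be checked numerically. A minimal Python sketch with arbitrary illustrative parameter values:

```python
from math import comb

# Check by truncated summation that the density on T_k = S_k ∪ {∅} sums to 1,
# and that conditioning on S_k recovers the exponential density on (S_k, ·).
# Parameter values, the truncation point, and the test cardinality are arbitrary.

def check_Tk(k: int, beta: float, n_max: int = 2000) -> None:
    c = beta ** (k + 1) / (1 - beta + beta ** (k + 1))   # normalizing constant
    total = c + sum(comb(n + k - 1, n - 1) * c * (1 - beta) ** n
                    for n in range(1, n_max + 1))        # includes the empty-set term
    n = 3                                                # test a set with #(x) = 3
    conditional = c * (1 - beta) ** n / (1 - c)          # P(X = x | X in S_k); P(X in S_k) = 1 - f(∅)
    exponential = beta ** (k + 1) * (1 - beta) ** (n - 1)
    print(f"k={k}, beta={beta}: total {total:.6f}, "
          f"conditional {conditional:.6f} vs S_k density {exponential:.6f}")

check_Tk(1, 0.4)
check_Tk(3, 0.6)
```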

For \(k \in \N_+\), determine whether the exponential distribution on \((T_k, \cdot)\) with parameter \(\beta \in (0, 1)\) is compound Poisson (and therefore infinitely divisible).

Our final results concern the sub-semigroup \((U_k, \cdot)\) where \(U_k = \{x \in S: \min(x) = k\}\) for \(k \in \N_+\). Recall that \((U_k, \cdot)\) is isomorphic to \((U_1, \cdot)\) for \(k \in \N_+\), so we really only need to consider \((U_1, \cdot)\).

There are no exponential distributions for the semigroup \((U_1, \cdot)\) (and hence no exponential distributions for the semigroup \((U_k, \cdot)\) for \(k \in \N_+\)).

Details:

Suppose that \(X\) is a random variable in \(U_1\) with reliability function \(F\) with respect to \((U_1, \cdot)\). If \(X\) has an exponential distribution, then the distribution is memoryless and so \(\ln F\) is a homomorphism from \((U_1, \cdot)\) into \((\R, +)\). By the dimension result in Section 1, \(F(x) = [F(\{1\})]^{\#(x)}\) for \(x \in U_1\). Let \(U_{1, n} = \{x \in U_1: \#(x) = n\}\) for \(n \in \N_+\). Then \[\sum_{x \in U_1} F(x) = \sum_{n = 1}^\infty \sum_{x \in U_{1, n}} F(x) = \sum_{n = 1}^\infty [F(\{1\})]^n \#(U_{1, n}) = \infty\] since \(\#(U_{1, n}) = \infty\) for \(n \in \{2, 3, \ldots\}\). Hence \(F\) cannot be normalized into a probability density function.
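
The divergence can be seen numerically as well: for any value of \(F(\{1\}) \in (0, 1)\), restricting the sum to sets \(x \in U_1\) with \(\max(x) \le M\) gives partial sums that grow without bound as \(M\) increases. A minimal Python sketch:

```python
from math import comb

# Partial sums of sum_{x in U_1} F(x) with F(x) = c^#(x), c = F({1}), restricted
# to sets with max(x) <= M. Such a set consists of 1 together with n - 1 elements
# of {2, ..., M}, so there are C(M-1, n-1) of cardinality n, and the partial sum
# equals c (1 + c)^(M-1), which grows without bound. The value of c is arbitrary.

c = 0.3
for M in (5, 10, 20, 40, 80):
    partial = sum(comb(M - 1, n - 1) * c ** n for n in range(1, M + 1))
    print(f"M={M}: partial sum = {partial:.4f}")
```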