\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\bs}{\boldsymbol}\) \(\newcommand{\upa}{\uparrow}\) \(\newcommand{\var}{\text{var}}\) \(\newcommand{\cor}{\text{cor}}\) \(\newcommand{\cov}{\text{cov}}\)

4. Almost Exponential Distributions

From Section 3, there are no exponential distributions on \((S, \cdot)\), but the sub-semigroup \((S_k, \cdot)\) has a one-parameter family of exponential distributions for each \(k \in \N\). Recall again that \(S\) is the collection of finite subsets of \(\N_+\) and that the semigroup operation \(\cdot\) generates the subset partial order \(\subseteq\). Recall also that \(S_k = \{x \in S: \max(x) - \#(x) = k\}\) for \(k \in \N\). We can define distributions that are close to exponential by forming mixtures of distributions that are exponential for \((S_k, \cdot)\) over \(k \in \N\).

Suppose that \(X\) takes values in \(S\) with probability density function \(f\) given by \[ f(x) = \begin{cases} p_0 \beta_0 (1 - \beta_0)^{\#(x)}, &\quad x \in S_0 \\ p_k \beta_k^{k + 1} (1 - \beta_k)^{\#(x) - 1}, &\quad x \in S_k, \, k \in \N_+ \end{cases} \] where \(\beta_k, \, p_k \in (0, 1)\) for each \(k \in \N\) and \(\sum_{k=0}^\infty p_k = 1\). Then the conditional distribution of \(X\) given \(X \in S_k\) is exponential on \((S_k, \cdot)\), with parameter \(\beta_k\), for each \(k \in \N\).

The distribution of \(X\) is as close to exponential as possible, in the sense that \(X\) is essentially exponential on each of the sub-semigroups \(S_k\), and these semigroups partition \(S\). We can construct random variable \(X\) as a mixture in the usual way.

For \(k \in \N\), suppose that \(Y_k\) has the exponential distribution on \((S_k, \cdot)\) with parameter \(\beta_k \in (0, 1)\). Next, suppose that \(K\) takes values in \(\N\) with probability density function given by \(\P(K = k) = p_k\) for \(k \in \N\), and that \(K\) is independent of \(\bs{Y} = (Y_0, Y_1, \ldots)\). Then \(X = Y_K\) has the distribution in the proposition above.

There is not much that we can say about the general distribution defined above. In the remainder of this section we will study a special case with particularly nice properties. For our first construction, we define a random variable \(X\) on \(S\) by first selecting a geometrically distributed population size \(N\), and then selecting a sample from \(\{1, 2, \ldots, N\}\) in an independent, identically distributed fashion. Of course, geometric distributions are the exponential distributions for the positive semigroup \((\N, +)\). Here is the precise construction:

Let \(N\) be a random variable taking values in \(\N\), and having the geometric distribution with success parameter \(1 - r\) where \(r \in (0, 1)\): \[\P(N = n) = (1 - r) r^n, \quad n \in \N\] Given \(N = n \in \N_+\), random variable \(X\) is distributed on the subsets of \(\{1, 2, \ldots, n\}\) so that \(i \in X\), independently, with probability \(p\) for each \(i \in \{1, 2, \ldots, n\}\). If \(N = 0\), then \(X = \emptyset\).
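
The construction above is easy to simulate. Here is a minimal Python sketch (the function name `sample_X` is ours): the geometric population size is generated by counting failures, and the Bernoulli inclusions follow.

```python
import random

def sample_X(r, p, rng=random):
    """Draw one subset from the almost exponential distribution on S.

    N is geometric: P(N = n) = (1 - r) r^n, the number of "failures"
    (probability r) before the first "success" (probability 1 - r).
    Given N = n, each i in {1, ..., n} is included independently with
    probability p.
    """
    n = 0
    while rng.random() < r:
        n += 1
    return {i for i in range(1, n + 1) if rng.random() < p}

# for example, ten draws with r = 0.6 and p = 0.4
print([sorted(sample_X(0.6, 0.4)) for _ in range(10)])
```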

Random variable \(X\) has probability density function \(f\) and reliability function \(F\) for \((S, \subseteq)\) given by

  1. \[ f(x) = \frac{1 - r}{1 - r(1 - p)} p^{\#(x)} (1 - p)^{\max(x) - \#(x)} r^{\max(x)}, \quad x \in S \]
  2. \[ F(x) = p^{\#(x)} r^{\max(x)}, \quad x \in S \]
Details:

For \(x \in S\), \[\P(X = x) = \sum_{n = 0}^\infty \P(N = n) \P(X = x \mid N = n)\] If \(n \lt \max(x)\) then \(x\) is not a subset of \(\{1, 2, \ldots, n\}\), so \(\P(X = x \mid N = n) = 0\). If \(n \ge \max(x)\) then \(x\) is a subset of \(\{1, 2, \ldots, n\}\) and by assumption, \(\P(X = x \mid N = n) = p^{\#(x)} (1 - p)^{n - \#(x)}\). Substituting gives \[\P(X = x) = \sum_{n = \max(x)}^\infty (1 - r) r^n p^{\#(x)}(1 - p)^{n - \#(x)}\] which simplifies to (a). By a similar argument, \[\P(X \supseteq x) = \sum_{n = \max(x)}^\infty (1 - r) r^n p^{\#(x)}\] which simplifies to (b).
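
Both series can be summed numerically as a sanity check on the closed forms; a short sketch (the truncation bound 2000 is an arbitrary cutoff of the geometric tail):

```python
r, p = 0.6, 0.4
x = {2, 5, 6}                        # an arbitrary finite subset
size, mx = len(x), max(x)

f_series = sum((1 - r) * r**n * p**size * (1 - p)**(n - size)
               for n in range(mx, 2000))
f_closed = (1 - r) / (1 - r * (1 - p)) * p**size * (1 - p)**(mx - size) * r**mx

F_series = sum((1 - r) * r**n * p**size for n in range(mx, 2000))
F_closed = p**size * r**mx

print(abs(f_series - f_closed) < 1e-12, abs(F_series - F_closed) < 1e-12)
```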

Not surprisingly, the distribution of \(X\) depends on \(x \in S\) only through \(\#(x)\) and \(\max(x) - \#(x)\), so let \(U = \#(X)\) and \(V = \max(X) - \#(X)\). The following corollaries will explore the relationships between the distributions of \(U\), \(V\), and \(X\), and provide another way of constructing the distribution of \(X\). Recall that \(S_{n, k} = \{x \in S_k: \#(x) = n\}\) for \(n = k = 0\) or \(n \in \N_+, \, k \in \N\).

The joint probability density function of \((U, V)\) is given by \begin{align*} \P(U = 0, V = 0) &= \frac{1 - r}{1 - r(1 - p)} \\ \P(U = n, V = k) &= \frac{1 - r}{1 - r(1 - p)} \binom{n + k - 1}{n - 1} (r p)^n [r (1 - p)]^k, \quad n \in \N_+, \, k \in \N \end{align*}

Details:

Note that \(\{U = n, \, V = k\} = \{X \in S_{n, k}\}\). Recall that \(S_{0, 0} = \{\emptyset\}\) and \(\#(S_{n, k}) = \binom{n + k - 1}{n - 1}\) if \(n \in \N_+, \, k \in \N\). So the result follows from the probability density function of \(X\) above.
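
Since \(f\) is constant on \(S_{n, k}\), the joint density can also be verified by brute force, enumerating \(S_{n, k}\) and summing \(f\); a sketch:

```python
from itertools import combinations
from math import comb, isclose

r, p = 0.5, 0.3
C = (1 - r) / (1 - r * (1 - p))      # normalizing constant of the density

def f(x):
    size, mx = len(x), (max(x) if x else 0)
    return C * p**size * (1 - p)**(mx - size) * r**mx

n, k = 3, 2
# every x in S_{n,k} contains its maximum n + k plus n - 1 smaller elements
total = sum(f(set(c) | {n + k}) for c in combinations(range(1, n + k), n - 1))
closed = C * comb(n + k - 1, n - 1) * (r * p)**n * (r * (1 - p))**k
print(isclose(total, closed))
```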

For \(n = k = 0\) or \(n \in \N_+, \, k \in \N\), the conditional distribution of \(X\) given \(U = n, \, V = k\) is uniform on \(S_{n, k}\).

Details:

First the trivial case: \(\P(X = \emptyset \mid U = 0, V = 0) = 1\). If \(n \in \N_+, \, k \in \N\) then from the density functions of \(X\) and \((U, V)\) given above, \[\P(X = x \mid U = n, V = k) = \frac{\P(X = x)}{\P(U = n, V = k)} = \frac{1}{\binom{n + k - 1}{n - 1}}, \quad x \in S_{n, k}\]

\(U\) has a geometric distribution on \(\N\) with success parameter \((1 - r) / [1 - r (1 - p)]\): \[\P(U = n) = \frac{1 - r}{1 - r (1 - p)} \left[\frac{r p}{1 - r (1 - p)}\right]^n, \quad n \in \N \] \[\E(U) = \frac{r p}{1 - r}, \quad \var(U) = \frac{r p [1 - r (1 - p)]}{(1 - r)^2}\]

Details:

First, \[\P(U = 0) = \P(U = 0, V = 0) = \frac{1 - r}{1 - r (1 - p)}\] Next, using the general binomial theorem and the joint density above, \begin{align*} \P(U = n) & = \sum_{k = 0}^\infty \P(U = n, V = k) = \frac{1 - r}{1 - r (1 - p)}(r p)^n \sum_{k = 0}^\infty \binom{n + k - 1}{k} [r (1 - p)]^k \\ & = \frac{1 - r}{1 - r (1 - p)}\left[\frac{r p}{1 - r (1 - p)}\right]^n, \quad n \in \N_+ \end{align*}

The formulas for \(\E(U)\) and \(\var(U)\) are standard results for the geometric distribution.

For \(n \in \N_+\), the conditional distribution of \(V\) given \(U = n\) is negative binomial on \(\N\) with stopping parameter \(n\) and success parameter \(1 - r (1 - p)\): \[\P(V = k \mid U = n) = \binom{n + k - 1}{n - 1} [r (1 - p)]^k [1 - r (1 - p)]^n, \quad k \in \N\] \[\E(V \mid U = n) = n \frac{r (1 - p)}{1 - r (1 - p)}, \quad \var(V \mid U = n) = n \frac{r (1 - p)}{[1 - r (1 - p)]^2}\]

Details:

This follows easily from the joint density of \((U, V)\) and the density of \(U\) given above. For \(n \in \N_+\), \[\P(V = k \mid U = n) = \frac{\P(U = n, V = k)}{\P(U = n)}, \quad k \in \N\] Of course also, \(\P(V = 0 \mid U = 0) = 1\).

\(V\) has a modified geometric distribution on \(\N\): \begin{align*} \P(V = 0) & = \frac{1 - r}{[1 - r (1 - p)](1 - r p)} \\ \P(V = k) & = \frac{(1 - r) r p}{[1 - r (1 - p)](1 - r p)} \left[\frac{r (1 - p)}{1 - r p}\right]^k, \quad k \in \N_+ \end{align*} \[\E(V) = \frac{r^2 p (1 - p)}{(1 - r)[1 - r (1 - p)]}, \quad \var(V) = \frac{p (1 - p) r^2\left[1 - r^2 - p^2 r^2 - p r (1 - 2 r) \right]}{(1 - r)^2 [1 - r (1 - p)]^2}\]

Details:

This follows from the joint density function of \((U, V)\) above: \[\P(V = 0) = \sum_{n=0}^\infty \P(U = n, V = 0)\] and \[\P(V = k) = \sum_{n=1}^\infty \P(U = n, V = k), \quad k \in \N_+\] The moment results can also be obtained by conditioning: \[\E(V) = \E[\E(V \mid U)], \quad \var(V) = \var[\E(V \mid U)] + \E[\var(V \mid U)]\]
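
The last three results can be checked numerically against the joint density of \((U, V)\), truncating the infinite sums; a minimal sketch (names and truncation bounds are ours):

```python
from math import comb, isclose

r, p = 0.5, 0.3
C = (1 - r) / (1 - r * (1 - p))
s = 1 - r * (1 - p)                  # negative binomial success parameter

def pmf_UV(n, k):
    if n == 0:
        return C if k == 0 else 0.0
    return C * comb(n + k - 1, n - 1) * (r * p)**n * (r * (1 - p))**k

K = 400                              # truncation bound for the tails
n, k = 3, 4

# marginal of U: geometric with success parameter (1 - r) / [1 - r(1 - p)]
pU = sum(pmf_UV(n, j) for j in range(K))
print(isclose(pU, C * (r * p / s)**n, rel_tol=1e-9))

# conditional of V given U = n: negative binomial with parameters n and s
print(isclose(pmf_UV(n, k) / pU,
              comb(n + k - 1, n - 1) * (r * (1 - p))**k * s**n, rel_tol=1e-9))

# marginal of V: the modified geometric distribution
pV = sum(pmf_UV(m, k) for m in range(K))
closed = (1 - r) * r * p / (s * (1 - r * p)) * (r * (1 - p) / (1 - r * p))**k
print(isclose(pV, closed, rel_tol=1e-9))
```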

Covariance and correlation of \((U, V)\)

  1. \[\cov(U, V) = \frac{r^2 p (1 - p)}{(1 - r)^2} \]
  2. \[\cor(U, V) = \sqrt{\frac{r (1 - p)[1 - r (1 - p)]}{1 - r^2 - p^2 r^2 - p r (1 - 2 r)}}\]
  3. \(\cor(U, V) \to 1\) as \(r \to 1\) for fixed \(p \in (0, 1)\)
  4. \(\cor(U, V) \to 0\) as \(p \to 1\) for fixed \(r \in (0, 1)\).
  5. \(\cor(U, V) \to 0\) as \(r \to 0\) for fixed \(p \in (0, 1)\).
  6. \(\cor(U, V) \to \sqrt{r / (1 + r)}\) as \(p \to 0\) for fixed \(r \in (0, 1)\).
Details:

Using the conditional mean of \(V\) given \(U\) we have \[\E(U V) = \E[\E(U V \mid U)] = \E[U \E(V \mid U)] = \frac{r (1 - p)}{1 - r (1 - p)} \E(U^2)\] But of course \(\E(U^2) = \var(U) + [\E(U)]^2\) and \(\cov(U, V) = \E(U V) - \E(U) \E(V)\) so the covariance result follows from substituting the moments of \(U\) and \(V\) and simplifying. The correlation in (b) then follows from \(\cor(U, V) = \cov(U, V) \big/ \sqrt{\var(U) \var(V)}\), and the limits in parts (c)–(f) follow from the correlation formula by elementary calculus.

Not surprisingly, \(U\) and \(V\) are positively correlated.
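
The moments are easy to get wrong algebraically, so a direct numerical check is reassuring; the sketch below sums the joint density over a large grid (the truncation bound 150 is arbitrary) and compares with the closed forms above:

```python
from math import comb, sqrt, isclose

r, p = 0.5, 0.5
C = (1 - r) / (1 - r * (1 - p))

def pmf_UV(n, k):
    if n == 0:
        return C if k == 0 else 0.0
    return C * comb(n + k - 1, n - 1) * (r * p)**n * (r * (1 - p))**k

EU = EV = EUV = EU2 = EV2 = 0.0
for n in range(150):
    for k in range(150):
        w = pmf_UV(n, k)
        EU += n * w; EV += k * w; EUV += n * k * w
        EU2 += n * n * w; EV2 += k * k * w

cov = EUV - EU * EV
cor = cov / sqrt((EU2 - EU**2) * (EV2 - EV**2))
B = 1 - r**2 - p**2 * r**2 - p * r * (1 - 2 * r)
print(isclose(cov, r**2 * p * (1 - p) / (1 - r)**2, rel_tol=1e-6))
print(isclose(cor, sqrt(r * (1 - p) * (1 - r * (1 - p)) / B), rel_tol=1e-6))
```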

Open the simulation of the almost exponential distribution on \((S, \cdot)\). The parameters \(r\) and \(p\) can be varied with scrollbars. The probability density functions of \(U\) and \(V\) are shown in the distribution plots and tables. Run the simulation for various values of the parameters and note the subsets generated. Compare the empirical density functions to the probability density functions.

Of course, the propositions above determine the distribution of \(X\). In fact these results give an alternate way of constructing the distribution of \(X\) in the first place:

Consider the random variable \(X\) on \(S\) constructed as follows:

  1. Random variable \(U\) has the geometric distribution on \(\N\) with success parameter \(a \in (0, 1)\).
  2. Given \(U = 0\), set \(V = 0\); otherwise, given \(U = n \in \N_+\), random variable \(V\) has the negative binomial distribution on \(\N\) with stopping parameter \(n\) and success parameter \(b \in (0, 1)\).
  3. Given \(U = n, V = k\) (so either \(n = k = 0\), or \(n \in \N_+\) and \(k \in \N\)), random variable \(X\) is uniformly distributed on \(S_{n, k}\).

Then \(X\) has the almost exponential distribution on \(S\) defined above, with parameters \[ r = 1 - a b, \quad p = \frac{b (1 - a)}{1 - a b} \]

Our original construction, although simple, is perhaps unsatisfactory because the population variable \(N\) is hidden (not directly observable from \(X\)). The alternate construction has no hidden variables, and moreover, the geometric distribution of \(U\) and the conditional uniform distribution for \(X\) given \(U = n, V = k\) are natural. On the other hand, the conditional negative binomial distribution of \(V\) given \(U = n\) is somewhat obscure.
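
The alternate construction is also straightforward to simulate; a sketch (the function name is ours). With \(a = (1 - r)/[1 - r(1 - p)]\) and \(b = 1 - r(1 - p)\), the draws should agree in distribution with the original construction.

```python
import random

def sample_X_alt(a, b, rng=random):
    """Sample X via the alternate construction: U geometric with success
    parameter a; given U = n > 0, V negative binomial with stopping
    parameter n and success parameter b; given (U, V) = (n, k), X uniform
    on S_{n,k}."""
    u = 0
    while rng.random() >= a:         # failures before the first success
        u += 1
    if u == 0:
        return set()
    v = successes = 0
    while successes < u:             # failures before the u-th success
        if rng.random() < b:
            successes += 1
        else:
            v += 1
    # uniform on S_{u,v}: the maximum u + v, plus u - 1 smaller elements
    return set(rng.sample(range(1, u + v), u - 1)) | {u + v}

# e.g. r = p = 1/2 corresponds to a = 2/3, b = 3/4
draws = [sample_X_alt(2 / 3, 3 / 4) for _ in range(100_000)]
print(sum(x == {1} for x in draws) / len(draws))  # should be near f({1}) = 1/6
```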

Our next goal is to study the distribution of the random subset \(X\) on the sub-semigroups \((S_k, \cdot)\).

Suppose that \(X\) has the almost exponential distribution with parameters \(r, \, p \in (0, 1)\). The rate function of \(X\) on \((S, \subseteq)\) is given by \[\frac{\P(X = x)}{\P(X \supseteq x)} = \frac{1 - r} {1 - r (1 - p)} (1 - p) ^ {\max(x) - \#(x)}, \quad x \in S\]

Thus, for \(k \in \N\), random variable \(X\) has constant rate \(\frac{1 - r}{1 - r (1 - p)} (1 - p)^k\) for \((S, \subseteq)\) on the subset \(S_k\). The rate decreases to 0 exponentially fast as \(\max(x) - \#(x)\) increases to \(\infty\). As a corollary, we have the following result:

For \(x \in S_0\), \begin{align*} \P(X = x) &= \frac{1 - r}{1 - r + r p}(r p)^{\#(x)} \\ \P(X \supseteq x) &= (r p)^{\#(x)} \end{align*} So \(X\) has the memoryless property on \(S_0\), in addition to the constant rate property.

Details:

That is, for \(x, y \in S_0\), \begin{align*} \P(X \supseteq x y) &= (r p)^{\#(x y)} = (r p)^{\#(x) + \#(y)} \\ &= (r p)^{\#(x)} (r p)^{\#(y)} = \P(X \supseteq x) \P(X \supseteq y) \end{align*}

Next we find the conditional distribution of \(X\) given \(X \in S_k\) for \(k \in \N\).

The conditional probability density functions of \(X\) given \(X \in S_k\) are as follows: \begin{align} \P(X = x \mid X \in S_0) &= (1 - r p) (r p)^{\#(x)}, \quad x \in S_0 \\ \P(X = x \mid X \in S_k) &= (1 - r p)^{k + 1}(r p)^{\#(x) - 1}, \quad x \in S_k, k \in \N_+ \end{align}

Details:

This follows from the probability density functions of \(X\) and \(V\) given in the propositions above. For \(k \in \N\), \[\P(X = x \mid X \in S_k) = \frac{\P(X = x)}{\P(X \in S_k)} = \frac{\P(X = x)}{\P(V = k)}, \quad x \in S_k\]

Thus, \(X\) has an almost exponential distribution on \((S, \cdot)\) in the sense of the original definition above, with parameters \(\beta_k = 1 - r p\) for \(k \in \N\), and with the mixing probabilities given by the probability density function of \(V\) above.
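
The identification \(\beta_k = 1 - r p\) can be checked numerically, computing \(\P(X \in S_k) = \P(V = k)\) by summing \(f\) over \(S_{n, k}\) for \(n = 1, 2, \ldots\); a sketch (names and the truncation bound are ours):

```python
from math import comb, isclose

r, p = 0.5, 0.4
C = (1 - r) / (1 - r * (1 - p))

def f(x):
    size, mx = len(x), (max(x) if x else 0)
    return C * p**size * (1 - p)**(mx - size) * r**mx

def P_Sk(k, nmax=400):
    """P(X in S_k): f is constant on S_{n,k}, which has
    C(n + k - 1, n - 1) elements, all with maximum n + k."""
    total = f(set()) if k == 0 else 0.0
    for n in range(1, nmax):
        total += comb(n + k - 1, n - 1) * f(set(range(1, n)) | {n + k})
    return total

k, x = 2, {1, 3, 5}                  # x in S_2: #(x) = 3, max(x) = 5
cond = f(x) / P_Sk(k)
print(isclose(cond, (1 - r * p)**(k + 1) * (r * p)**(len(x) - 1), rel_tol=1e-9))
```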

Recall that no exponential distribution on \(S\) exists, because otherwise the events \(\{\{i \in X\}: i \in \N_+\}\) would be independent with a common probability. The next result explores these events for a random variable with an almost exponential distribution.

Suppose that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(r, \, p \in (0, 1)\).

  1. \(\P(i \in X) = p r^i\) for \(i \in \N_+\).
  2. If \(i_1, i_2, \ldots, i_n \in \N_+\) with \(i_1 \lt i_2 \lt \cdots \lt i_n\) then \[\P(i_n \in X \mid i_1 \in X, \ldots, i_{n-1} \in X) = \P(i_n \in X \mid i_{n - 1} \in X) =\P(i_n - i_{n - 1} \in X) = p r^{i_n - i_{n - 1}}\]
  3. For \(j \in \N_+\), the events \(\{1 \in X\}, \{2 \in X\}, \ldots, \{j-1 \in X\}\) are conditionally independent given \(\{j \in X\}\) with \(\P(i \in X \mid j \in X) = p\) for \(i \in \{1, 2, \ldots, j - 1\}\).
Details:

These results follow from the reliability function of \(X\) for \((S, \subseteq)\) given above.

  1. \(\P(i \in X) = \P(\{i\} \subseteq X) = p r^i\) for \(i \in \N_+\).
  2. For \(i_1, \, i_2, \ldots, i_n \in \N_+\) with \(i_1 \lt i_2 \lt \cdots \lt i_n\), \[\P(i_n \in X \mid i_1 \in X, \ldots, i_{n - 1} \in X) = \frac{\P(\{i_1, \ldots, i_n\} \subseteq X)}{\P(\{i_1, \ldots, i_{n - 1}\} \subseteq X)} = \frac{p^n r^{i_n}}{p^{n - 1} r^{i_{n - 1}}} = p r^{i_n - i_{n - 1}}\] Similarly, \[\P(i_n \in X \mid i_{n - 1} \in X) = \frac{\P(\{i_{n - 1}, i_n\} \subseteq X)}{\P(\{i_{n - 1}\} \subseteq X)} = \frac{p^2 r^{i_n}}{p r^{i_{n - 1}}} = p r^{i_n - i_{n - 1}}\] and the common value is \(\P(i_n - i_{n-1} \in X)\) by part (a).
  3. First note that if \(i \in \{1, 2, \ldots, j - 1\}\) then \[\P(i \in X \mid j \in X) = \frac{\P(\{i, j\} \subseteq X)}{\P(\{j\} \subseteq X)} = \frac{p^2 r^j}{p r^j} = p\] Next, if \(A \subseteq \{1, 2, \ldots, j - 1\}\) with \(\#(A) = n\) then \[\P(A \subseteq X \mid j \in X) = \frac{\P(A \cup \{j\} \subseteq X)}{\P(\{j\} \subseteq X)} = \frac{p^{n + 1} r^j}{p r^j} = p^n\]

Property (a) is clearly a result of the original construction of \(X\). Property (b) is reminiscent of a Markov property. This property implies that the events \(\{\{i \in X\}: i \in \N_+\}\) are positively correlated, but asymptotically uncorrelated. In fact the correlation decays exponentially since \[\P(i + j \in X \mid i \in X) = \P(j \in X) = p r^{j} \to 0 \text{ as } j \to \infty\]

Characterize all random subsets of \(\N_+\) that satisfy the partial Markov property in part (b) of the result above.

Suppose that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(p, \, r \in (0, 1)\). Then \(X\) maximizes entropy among all random variables \(Y\) on \(S\) with \begin{align*} \E[\#(Y)] & = \E[\#(X)] = \frac{r p}{1 - r} \\ \E[\max(Y)] & = \E[\max(X)] = \frac{r p}{(1 - r)[1 - r (1 - p)]} \end{align*}

Suppose that \(X\) is a random variable in \(S\). The hitting probability function \(G\) is defined by \[G(x) = \P(X \cap x \ne \emptyset), \quad x \subseteq \N_+\]

The hitting probability function is of fundamental importance in the general theory of random sets (see the book by Matheron) since it completely determines the distribution of \(X\). Note that \(G\) is defined for all subsets of \(\N_+\), not just finite subsets.

Suppose that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(p, \, r \in (0, 1)\). Then \[G(x) = \sum_{i = 1}^{\#(x)} p (1 - p)^{i - 1} r^{x(i)}, \quad x \subseteq \N_+\] where as usual, \(x(i)\) is the \(i\)th smallest element of \(x\).

Details:

Suppose first that \(x\) is finite (so that \(x \in S\)). From the standard inclusion-exclusion formula (or from Matheron's book), \[G(x) = \sum_{k = 1}^{\#(x)} (-1)^{k - 1} \sum_{y \subseteq x, \#(y) = k} F(y)\] where \(F\) is the reliability function of \(X\) for \((S, \subseteq)\). Hence, using the reliability function above, we have \begin{align*} G(x) &= \sum_{k = 1}^{\#(x)} (-1)^{k - 1} \sum_{y \subseteq x, \#(y) = k} p^{\#(y)} r^{\max(y)} \\ &= \sum_{k = 1}^{\#(x)} (-1)^{k - 1} p^k \sum_{i = k}^{\#(x)} \sum \{r^{x(i)}: y \subseteq x, \#(y) = k, \max(y) = x(i)\} \\ &= \sum_{k = 1}^{\#(x)} (-1)^{k - 1} p^k \sum_{i = k}^{\#(x)} \binom{i - 1}{k - 1} r^{x(i)} = \sum_{i = 1}^{\#(x)} r^{x(i)} \sum_{k = 1}^i \binom{i - 1}{k - 1} (-1)^{k - 1} p^k \\ &= \sum_{i = 1}^{\#(x)} p (1 - p)^{i - 1} r^{x(i)} \end{align*} For infinite \(x\), the formula holds by the continuity theorem.
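
The formula can also be verified mechanically against the inclusion-exclusion expansion; a sketch:

```python
from itertools import combinations
from math import isclose

r, p = 0.6, 0.3

def F(y):                            # reliability function: P(X ⊇ y)
    return p**len(y) * r**max(y)

def G_formula(x):
    xs = sorted(x)
    return sum(p * (1 - p)**(i - 1) * r**xs[i - 1]
               for i in range(1, len(xs) + 1))

def G_incl_excl(x):
    return sum((-1)**(k - 1) * F(y)
               for k in range(1, len(x) + 1)
               for y in combinations(sorted(x), k))

x = {1, 4, 5, 9}
print(isclose(G_formula(x), G_incl_excl(x)))
```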

The following hitting probabilities provide a check on our work.

Suppose again that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(r, \, p \in (0, 1)\). Then \[ G(\{n, n + 1, \ldots, n + k - 1\}) = \frac{p r^n}{1 - r + r p}\left(1 - [r (1 - p)]^k \right), \quad n, \, k \in \N_+ \]

  1. Letting \(k = 1\) gives \(G(\{n\}) = \P(n \in X) = p r^n\) for \(n \in \N_+\), which agrees with the earlier result on \(\P(i \in X)\).
  2. Letting \(k \to \infty\) we have \[ G(\{n, n + 1, \ldots\}) = \P[\max(X) \ge n] = \frac{p r^n}{1 - r + rp}, \quad n \in \N_+ \] which gives the distribution of \(\max(X)\).
  3. Letting \(n = 1\) in (b) gives \[G(\N_+) = \P(X \ne \emptyset) = \frac{rp}{1 - r + r p}\] which agrees with the probability density function of \(X\) at \(\emptyset\).

Suppose again that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(r, \, p \in (0, 1)\). Find the probability that

  1. \(X\) contains an even integer.
  2. \(X\) contains an odd integer.
  3. \(X\) contains an even integer and an odd integer.
Details:

Let \(E\) denote the set of even positive integers and \(O\) the set of odd positive integers.

  1. \[G(E) = \sum_{i = 1}^\infty p (1 - p)^{i - 1} r^{2 i} = \frac{p r^2}{1 - r^2 + p r^2}\]
  2. \[G(O) = \sum_{i = 1}^\infty p (1 - p)^{i - 1} r^{2 i - 1} = \frac{r p}{1 - r^2 + p r^2}\]
  3. \[\P(X \cap E \ne \emptyset, X \cap O \ne \emptyset) = G(E) + G(O) - G(\N_+) = \frac{p r^2 + p r}{1 - r^2 + p r^2} - \frac{p r}{1 - r + p r}\]

Suppose that \(X\) has the almost exponential distribution with parameters \(r, \, p \in (0, 1)\) and that \(\bs X = (X_1, X_2, \ldots)\) is a sequence of independent copies of \(X\). Let \(Y_n = X_1 X_2 \cdots X_n\) for \(n \in \N_+\) so that \(\bs Y = (Y_1, Y_2, \ldots)\) is the random walk on \((S, \cdot)\) corresponding to \(X\). Find the density function \(f_n\) of \(Y_n\).

Our last discussion concerns the almost exponential distribution for the semigroup \((S, \cdot)\), but now relative to the covering graph \((S, \upa)\). As usual, let \(v(x) = \max(x) - \#(x)\) for \(x \in S\).

Suppose again that \(X\) has the almost exponential distribution on \((S, \cdot)\) with parameters \(r, \, p \in (0, 1)\). The reliability function \(H\) of \(X\) for \((S, \upa)\) is given by \[H(x) = \frac{1 - r}{1 - r (1 - p)}(r p)^{\#(x) + 1} [r (1 - p)]^{v(x) - 1} \left[v(x) + \frac{r (1 - p)}{1 - r (1 - p)} \right], \quad x \in S \]

Details:

As usual, let \(f\) denote the density function of the almost exponential distribution. Since \(\upa\) is the covering relation and since \(i \gt \max(x)\) implies \(i \notin x\), we have \[H(x) = \sum_{i \notin x} f(x \cup \{i\}) = \sum_{i \notin x, i \lt \max(x)} f(x \cup \{i\}) + \sum_{i = \max(x) + 1}^\infty f(x \cup \{i\})\] If \(i \notin x\) then \(\#(x \cup \{i\}) = \#(x) + 1\). If in addition \(i \lt \max(x)\) then \(\max(x \cup \{i\}) = \max(x)\) so \(v(x \cup \{i\}) = v(x) - 1\). Moreover, the number of terms in the first sum on the right is \(v(x)\) and so the first sum is \[v(x) \frac{1 - r}{1 - r(1 - p)}(r p)^{\#(x) + 1} [r (1 - p)]^{v(x) - 1}\] On the other hand, if \(i \gt \max(x)\) then \(\max(x \cup \{i\}) = i\) and so \(v(x \cup \{i\}) = i - \#(x) - 1\). The second sum on the right is \[\sum_{i = \max(x) + 1}^\infty \frac{1 - r}{1 - r (1 - p)} (r p)^{\#(x) + 1} [r (1 - p)]^{i - \#(x) - 1} = \frac{1 - r}{[1 - r (1 - p)]^2} (r p)^{\#(x) + 1} [r (1 - p)]^{v(x)}\] Adding the two sums and simplifying gives the result.

We can rewrite the result as \[H(x) = f(x) r p \left[\frac{v(x)}{r (1 - p)} + \frac{1}{1 - r(1 - p)}\right], \quad x \in S\] so in particular, on \(S_k\) the rate function for \((S, \upa)\) is the constant \[\frac{r(1 - p)[1 - r(1 - p)]}{r p [k - (k - 1) r (1 - p)]}\]
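
Finally, \(H\) can be computed directly from its definition, summing \(f\) over the covers \(x \cup \{i\}\) of \(x\) and truncating the tail; a sketch (names and the truncation bound are ours):

```python
from math import isclose

r, p = 0.55, 0.35
C = (1 - r) / (1 - r * (1 - p))

def f(x):
    size, mx = len(x), (max(x) if x else 0)
    return C * p**size * (1 - p)**(mx - size) * r**mx

def H_direct(x, tail=500):
    # the covers of a nonempty x are the sets x ∪ {i} with i not in x
    return sum(f(x | {i}) for i in range(1, max(x) + tail) if i not in x)

def H_formula(x):
    size, v = len(x), max(x) - len(x)
    return (C * (r * p)**(size + 1) * (r * (1 - p))**(v - 1)
            * (v + r * (1 - p) / (1 - r * (1 - p))))

x = {2, 3, 7}
print(isclose(H_direct(x), H_formula(x), rel_tol=1e-9))
```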