\(\newcommand{\P}{\mathbb{P}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\ms}{\mathscr}\) \(\newcommand{\rta}{\rightarrow}\) \(\newcommand{\bs}{\boldsymbol}\)

4. Probability on Semigroups

Basics

Our starting point in this section is a measurable space \((S, \ms S)\) and a measurable semigroup \((S, \cdot)\) as discussed in Section 1. Recall that the relation \(\rta_A\) associated with \((S, \cdot)\) and a set \(A \in \ms S\) is defined by \(x \rta_A y\) if and only if \(y \in x A\). We are mostly interested in the case where \(A = S\), in which case we drop the reference to \(A\), so that \((S, \rta)\) is the graph associated with the semigroup \((S, \cdot)\). We assume that we have a probability space \((\Omega, \ms F, \P)\) in the background, so that random variables in \(S\) are measurable functions from \(\Omega\) into \(S\). All of the definitions and results in Chapter 1 apply to the graph \((S, \rta)\).

Reliability Functions

If \(X\) is a random variable in \(S\) then the reliability function \(F\) of \(X\) for \((S, \cdot)\) is given by \[F(x) = \P(x \rta X) = \P(X \in x S), \quad x \in S\]

So the reliability function is the function that assigns probabilities to the (right) neighbor sets \(x S\) for \(x \in S\). The reliability function makes sense if \(X\) is only measurable with respect to \(\ms S_0 = \sigma\{x S: x \in S\}\), the \(\sigma\)-algebra associated with the semigroup, rather than the reference \(\sigma\)-algebra \(\ms S\).
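For a concrete illustration, here is a minimal Monte Carlo sketch in Python for the standard continuous semigroup \(([0, \infty), +)\), where \(x S = [x, \infty)\) and so \(F(x) = \P(X \ge x)\). The exponential choice for \(X\), the seed, and the sample size are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(17)

# In the standard continuous semigroup ([0, inf), +) the neighbor set x S is
# [x, inf), so the reliability function is F(x) = P(X >= x). X exponential
# with rate 1 is an assumed choice, for which F(x) = exp(-x) exactly.
sample = rng.exponential(scale=1.0, size=100_000)

def reliability(x):
    """Monte Carlo estimate of F(x) = P(X in x S) = P(X >= x)."""
    return np.mean(sample >= x)

for x in [0.0, 0.5, 1.0, 2.0]:
    print(f"F({x}) ~ {reliability(x):.4f}, exact {np.exp(-x):.4f}")
```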

The semigroup \((S, \cdot)\) is stochastic if the corresponding graph \((S, \rta)\) is stochastic. That is, if \(P\) and \(Q\) are probability measures on \(\ms S_0\) with the same reliability function \(F\), then \(P = Q\) on \(\ms S_0\).

Density Functions

Suppose now that \(\lambda\) is a fixed, \(\sigma\)-finite measure on \((S, \ms S)\) that is left invariant for \((S, \cdot)\). The following simple proposition and its corollaries concern the distribution of products. As usual, density functions are with respect to \(\lambda\).

Suppose that \(Y\) is a random variable in \(S\) with density function \(g\). For \(x \in S\), random variable \(x Y\) has density function \(z \mapsto g(x^{-1} z)\) on \(x S\).

Details:

Clearly \(x Y\) takes values in \(x S\). Let \(A \in \ms S\) with \(A \subseteq x S\). Then by the integral versions of left invariance in Section 3, \[\P(x Y \in A) = \P(Y \in x^{-1} A) = \int_{x^{-1} A} g(y) d\lambda(y) = \int_A g(x^{-1} z) d\lambda(z)\]

Suppose that \(X\) and \(Y\) are independent random variables in \(S\) with density functions \(f\) and \(g\) respectively. Then \((X, XY)\) has density function \(h\) given by \[h(x, z) = f(x) g(x^{-1} z), \quad x \rta z\]

Details:

Clearly \((X, XY)\) takes values in \(\{(x, x y): (x, y) \in S^2\} = \{(x, z) \in S^2: x \rta z\}\) (which once again is the relation \(\rta\) as a subset of \(S^2\)). Moreover \(h(x, z)\) can be expressed as \(f(x)\) times the conditional density of \(XY\) at \(z\) given \(X = x\). But by independence, this is just the density of \(x Y\) at \(z\). So the result follows from the proposition above.

Suppose again that \(X\) and \(Y\) are independent random variables in \(S\) with density functions \(f\) and \(g\) respectively. Then \(X Y\) has density function \(f * g\), the convolution of \(f\) with \(g\): \[(f * g)(z) = \int_{x \rta z} f(x) g(x^{-1} z) d\lambda(x), \quad z \in S\]

Details:

This follows immediately from the previous result, by integrating the joint density of \((X, XY)\) over \(x\).
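As a quick sketch, on the standard discrete semigroup \((\N, +)\) with counting measure we have \(x \rta z\) if and only if \(x \le z\), with \(x^{-1} z = z - x\), so the convolution formula reduces to an ordinary discrete convolution. The Poisson densities below are an assumed example, for which the convolution is again Poisson.

```python
from math import exp, factorial

# On the standard discrete semigroup (N, +) with counting measure, x -> z
# means x <= z and x^{-1} z = z - x, so the convolution formula reduces to
# (f * g)(z) = sum_{x=0}^{z} f(x) g(z - x).
def convolve(f, g, z):
    return sum(f(x) * g(z - x) for x in range(z + 1))

def poisson_pmf(a):
    return lambda k: exp(-a) * a**k / factorial(k)

# Sanity check (assumed densities): Poisson(2) * Poisson(3) = Poisson(5).
f, g, h = poisson_pmf(2.0), poisson_pmf(3.0), poisson_pmf(5.0)
for z in [0, 3, 10]:
    print(z, convolve(f, g, z), h(z))
```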

Suppose that \((X_1, X_2, \ldots)\) is a sequence of independent random variables in \(S\) and that \(X_i\) has density function \(f_i\) for \(i \in \N_+\). For \(n \in \N_+\) let \(Y_n = X_1 \cdots X_n\). Then \((Y_1, Y_2, \dots, Y_n)\) has density function \(h_n\) given by \[h_n(y_1, y_2, \ldots, y_n) = f_1(y_1) f_2(y_1^{-1} y_2) \cdots f_n(y_{n-1}^{-1} y_n), \quad y_1 \rta y_2 \rta \cdots \rta y_n\]

Details:

This follows by repeated application of the result above on the density of \((X, X Y)\).

Sub-Semigroups

Suppose now that \((T, \cdot)\) is a (measurable) sub-semigroup of \((S, \cdot)\). Suppose also that \(X\) is a random variable in \(S\). We collect some simple facts about the conditional distribution of \(X\) given \(X \in T\). Let \(F\) denote the reliability function of \(X\) for \((S, \cdot)\), as defined above.

Suppose that \(\P(X \in T) \gt 0\).

  1. Given \(X \in T\), the conditional reliability function \(F_T\) of \(X\) for \((T, \cdot)\) is defined by \[F_T(x) = \P(x \rta_T X \mid X \in T) = \P(X \in x T \mid X \in T) = \frac{\P(X \in x T)}{\P(X \in T)}, \quad x \in T\]
  2. If \(X\) has density function \(f\) then the conditional density \(f_T\) of \(X\) given \(X \in T\) is defined by \[f_T(x) = \frac{f(x)}{\P(X \in T)}, \quad x \in T\]

Consider the special case where \((S, \cdot)\) is a discrete positive semigroup, with identity element \(e\). As usual, let \(S_+ = \{x \in S: x \ne e\}\) so that \((S_+, \cdot)\) is a strict positive semigroup and a sub-semigroup of \((S, \cdot)\). As before, let \(F\) denote the reliability function of \(X\) for \((S, \cdot)\) and \(f\) the density function of \(X\).

Suppose that \(\P(X \in S_+) \gt 0\) or equivalently \(f(e) \lt 1\).

  1. The conditional reliability function of \(X\) given \(X \in S_+\) for \((S_+, \cdot)\) is given by \[F_+(x) = \frac{F(x) - f(x)}{1 - f(e)}, \quad x \in S_+\] since \(x S = x S_+ \cup \{x\}\), a disjoint union.
  2. The conditional density \(f_+\) of \(X\) given \(X \in S_+\) is defined by \[f_+(x) = \frac{f(x)}{1 - f(e)}, \quad x \in S_+\]

We are restricting our attention to discrete positive semigroups because in the continuous case typically \(\P(X = e) = 0\) so the conditional distributions are the same as the unconditional distributions.
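Here is a small numerical check of the two conditional formulas for the standard discrete positive semigroup \((\N, +)\), where \(e = 0\) and \(x S_+ = \{x + 1, x + 2, \ldots\}\); the geometric choice for \(X\) is an assumption.

```python
# Discrete positive semigroup (N, +) with identity e = 0, so S_+ = {1, 2, ...}
# and x S_+ = {x + 1, x + 2, ...}. X geometric on N is an assumed choice:
# f(x) = (1 - p) p^x and F(x) = P(X >= x) = p^x.
p = 0.4
f = lambda x: (1 - p) * p**x
F = lambda x: p**x

f_plus = lambda x: f(x) / (1 - f(0))            # conditional density on S_+
F_plus = lambda x: (F(x) - f(x)) / (1 - f(0))   # conditional reliability for (S_+, +)

for x in [1, 2, 5]:
    direct = F(x + 1) / (1 - f(0))   # P(X in x + S_+ | X >= 1) computed directly
    print(x, F_plus(x), direct)      # the two computations agree
```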

Infinitely Divisible and Compound Distributions

The concept of an infinitely divisible distribution makes sense in the semigroup setting.

Suppose that \(X\) is a random variable in \(S\). Then \(X\) has an infinitely divisible distribution on \((S, \cdot)\) if for every \(n \in \N_+\), there exists a random variable \(U_n\) in \(S\) such that \(X\) has the same distribution as the product of \(n\) independent copies of \(U_n\).

For the standard discrete semigroup \((\N, +)\) and the standard continuous semigroup \(([0, \infty), +)\), the term infinitely divisible has its classical meaning. That is, for every \(n \in \N_+\), random variable \(X\) can be written as a sum of \(n\) independent, identically distributed variables. Random products of random variables also make sense in the semigroup setting, producing a type of compound distribution. We want to include the possibility of an empty product, so we restrict our attention to positive semigroups, for which an empty product is interpreted as the identity element.

Suppose that \((S, \cdot)\) is a positive semigroup and that \(V\) is a random variable in \(S\) and \(N\) is a random variable in \(\N\). Random variable \(X\) has a compound distribution on \((S, \cdot)\) corresponding to \(V\) and \(N\) if \(X\) has the same distribution as the product of \(N\) independent copies of \(V\).

That is, \(X\) has a compound distribution in this sense if \(X\) can be factored as a random number of independent, identically distributed variables, with the number of factors independent of the factors themselves. Compound distributions are often named for the distribution of the random number of factors \(N\). The most famous and important example is the compound Poisson distribution where \(N\) has a Poisson distribution on \(\N\). Another important example is the compound geometric distribution where \(N\) has a geometric distribution on \(\N\). Once again, for the standard discrete semigroup \((\N, +)\) and the standard continuous semigroup \(([0, \infty), +)\), compound distributions in this sense have their usual meanings. That is, \(X\) can be written as a sum of a random number of independent, identically distributed variables, with the number of terms independent of the terms themselves.
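As a sketch, here is a simulation of a compound Poisson distribution on the standard semigroup \((\N, +)\); the rate and the geometric choice for \(V\) are assumptions, and an empty sum is interpreted as \(0\), the identity.

```python
import numpy as np

rng = np.random.default_rng(42)

# A minimal simulation sketch of a compound Poisson distribution on the
# standard semigroup (N, +): X is the sum of N iid copies of V, with N
# Poisson and independent of the factors, and an empty sum equal to 0 (the
# identity). The rate and the geometric choice for V are assumptions.
def compound_poisson_sample(rate, size):
    counts = rng.poisson(rate, size=size)                    # number of factors
    return np.array([rng.geometric(0.5, size=n).sum() for n in counts])

x = compound_poisson_sample(2.0, 10_000)
print(x.mean(), 2.0 * 2.0)    # Wald's identity: E(X) = E(N) E(V) = 2 * 2
```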

Suppose that \(X\) has a compound distribution on the positive semigroup \((S, \cdot)\) corresponding to random variables \(V\) in \(S\) and \(N\) in \(\N\), and that \(N\) has an infinitely divisible distribution on \((\N, +)\). Then \(X\) has an infinitely divisible distribution on \((S, \cdot)\).

Details:

Suppose that \(X = V_1 V_2 \cdots V_N\) where \(\bs{V} = (V_1, V_2, \ldots)\) is a sequence of independent copies of \(V\) and where \(N\) is independent of \(\bs V\). Since \(N\) is infinitely divisible on \((\N, +)\), for \(n \in \N_+\) we can assume \(N = N_1 + N_2 + \cdots + N_n\) where \((N_1, N_2, \ldots, N_n)\) is a sequence of independent copies of a random variable \(K\) in \(\N\) (and independent of \(\bs V\)). Substituting we have \(X = U_1 U_2 \cdots U_n\) where \[U_k = V_{M_{k-1}+1} \cdots V_{M_k} \text{ with } M_k = \sum_{i=1}^k N_i, \quad k \in \{1, 2, \ldots, n\}\] and where \(M_0 = 0\). Note that \(U_k\) has \(N_k\) factors. The sequence \((U_1, U_2, \ldots, U_n)\) is independent and each variable has the compound distribution corresponding to \(V\) and \(K\).

In particular, this proposition applies to the compound Poisson distribution on \(\N\) and the compound geometric distribution on \(\N\) since both are infinitely divisible on \((\N, +)\).

Random Walks

Once again, we have a semigroup \((S, \cdot)\) with associated graph \((S, \rta)\). Recall that a random walk \(\bs Y = (Y_1, Y_2, \ldots)\) on the graph \((S, \rta)\) is a discrete time, homogeneous Markov process with the property that with probability 1, \(Y_n \rta Y_{n + 1}\), or equivalently \(Y_{n+1} \in Y_n S\) for \(n \in \N_+\). Suppose now that \(X\) is a random variable supported by \((S, \cdot)\). The particular random walk \(\bs Y\) on \((S, \rta)\) associated with \(X\) has the properties that \(Y_1\) has the same distribution as \(X\), and given \(Y_n = x\) for \(n \in \N_+\) and \(x \in S\), the distribution of \(Y_{n+1}\) is the same as the conditional distribution of \(X\) given \(x \rta X\). But in the semigroup setting, there is another natural random walk associated with \(X\).

The random walk on \((S, \cdot)\) associated with \(X\) is the discrete time, homogeneous Markov process \(\bs Y = (Y_1, Y_2, \ldots)\) on \(S\) satisfying the following properties:

  1. \(Y_1\) has the same distribution as \(X\).
  2. For \(n \in \N_+\) and \(x \in S\), the conditional distribution of \(Y_{n+1}\) given \(Y_n = x\) is the same as the distribution of \(x X\).

Note that \(\bs Y\) is also a random walk on the graph \((S, \rta)\). It's trivial to construct the random walk on the semigroup \((S, \cdot)\) associated with \(X\).

Let \(\bs X = (X_1, X_2, \ldots)\) be a sequence of independent copies of \(X\). Define \(Y_n = X_1 \cdots X_n\) for \(n \in \N_+\), so that \(\bs Y\) is the partial product sequence associated with \(\bs X\). Then \(\bs Y\) is the random walk on \((S, \cdot)\) associated with \(X\).

Details:

For \(n \in \N_+\) note that \(Y_{n+1} = Y_n X_{n+1}\) and that \(X_{n+1}\) is independent of \((X_1, \ldots, X_n)\) and hence also of \((Y_1, \ldots, Y_n)\). So it's clear that \(\bs Y\) is a discrete-time Markov process. By definition, \(Y_1 = X_1\) has the same distribution as \(X\). Moreover, the conditional distribution of \(Y_{n+1}\) given \(Y_n = x \in S\) is the same as the distribution of \(x X_{n+1}\), which is the same as the distribution of \(x X\).
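For instance, on the standard continuous semigroup \(([0, \infty), +)\) the random walk associated with \(X\) is just the partial sum process of independent copies of \(X\). A minimal sketch, assuming an exponential distribution for \(X\):

```python
import numpy as np

rng = np.random.default_rng(3)

# On the standard semigroup ([0, inf), +) the random walk associated with X
# is the partial sum process of iid copies of X. X exponential is an
# assumed choice.
x = rng.exponential(scale=1.0, size=10)
y = np.cumsum(x)                   # Y_n = X_1 + ... + X_n
print(np.all(np.diff(y) >= 0))     # Y_n -> Y_{n+1} with probability 1
print(np.round(y, 3))
```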

Once again we assume that we have a fixed left-invariant measure \(\lambda\) for \((S, \cdot)\).

Suppose that \(X\) has density function \(f\). The random walk \(\bs Y = (Y_1, Y_2, \ldots)\) on \((S, \cdot)\) associated with \(X\) has transition density \(Q\) given by \[Q(x, y) = f(x^{-1} y), \quad x \in S, \, y \in x S\]

Details:

For \(x \in S\), the conditional density of \(Y_{n+1}\) given \(Y_n = x\) is the same as the density of \(x X\), which by the result above is \(y \mapsto f(x^{-1} y)\) on \(x S\).

Suppose again that \(X\) has density function \(f\) and has reliability function \(F\) for \((S, \rta)\). We now have two random walks associated with \(X\): one on the graph \((S, \rta)\) with transition density \(P\) given by \[P(x, y) = \frac{f(y)}{F(x)}, \quad x \in S, \, y \in x S\] and the other on the semigroup \((S, \cdot)\) itself with transition density \(Q\) given by \[Q(x, y) = f(x^{-1} y), \quad x \in S, \, y \in x S\] The first can be constructed from a sequence \(\bs X = (X_1, X_2, \ldots)\) of independent copies of \(X\) via record variables, while the second can be constructed from the sequence \(\bs X\) via partial products. When are these two random walks the same? The answer is given in the next section.

Consider the standard continuous semigroup \(([0, \infty), +)\) so that \(\le\) is the associated order. Suppose that \(X\) has density function \(f\) given by \[f(x) = \frac{a}{(x + 1)^{a+1}}, \quad x \in [0, \infty)\] where \(a \in (0, \infty)\) is a parameter. So \(X\) has a version of the Pareto distribution with parameter \(a\).

  1. Find the reliability function \(F\) of \(X\) for \(([0, \infty), \le)\).
  2. Find the transition density \(P\) of the random walk on \(([0, \infty), \le)\) associated with \(X\).
  3. Find the transition density \(Q\) of the random walk on \(([0, \infty), +)\) associated with \(X\).
Details:
  1. \(X\) has reliability function \(F\) for \(([0, \infty), \le)\) given by \[F(x) = \frac{1}{(x + 1)^a}, \quad 0 \le x \lt \infty\]
  2. The transition density \(P\) of the random walk on \(([0, \infty), \le)\) associated with \(X\) is \[P(x, y) = \frac{a (x + 1)^a}{(y + 1)^{a + 1}}, \quad 0 \le x \le y \lt \infty\]
  3. The transition density \(Q\) of the random walk on \(([0, \infty), +)\) associated with \(X\) is \[Q(x, y) = \frac{a}{(y - x + 1)^{a + 1}}, \quad 0 \le x \le y \lt \infty\]
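Here is a quick numerical check of part (a) by simulation, using the inverse transform method; the seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(8)
a = 2.0

# Simulate the Pareto variable by inverse transform (the CDF is
# 1 - 1/(x + 1)^a) and compare the empirical tail with F(x) = 1/(x + 1)^a.
u = rng.random(100_000)
x = (1 - u) ** (-1 / a) - 1
for t in [0.5, 1.0, 3.0]:
    print(t, np.mean(x >= t), (t + 1) ** (-a))
```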

For the last results of this section, suppose again that \(X\) has density function \(f\) and that \(\bs Y = (Y_1, Y_2, \ldots)\) is the random walk on \((S, \cdot)\) associated with \(X\).

For \(n \in \N_+\),

  1. \((Y_1, Y_2, \ldots, Y_n)\) has density function \(g_n\) given by \[g_n(x_1, x_2, \ldots, x_n) = f(x_1) f(x_1^{-1} x_2) \cdots f(x_{n-1}^{-1} x_n), \quad x_1 \rta x_2 \rta \cdots \rta x_n\]
  2. \(Y_n\) has density function \(f^{*n}\), the \(n\)-fold convolution power of \(f\).

Let \(\bs N = \{N_A: A \in \ms S\}\) denote the point process associated with \(\bs Y\). Then \[ \E(N_A) = \sum_{n=1}^\infty \P(Y_n \in A) = \sum_{n=1}^\infty \int_A f^{*n}(x) d\lambda(x) = \int_A\left[\sum_{n=1}^\infty f^{*n}(x)\right] d\lambda(x)\]
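As a numerical sketch, the convolution powers and the intensity sum can be approximated by truncating the support and the number of terms; the geometric choice of \(f\) on \((\N, +)\) and the truncation levels are assumptions.

```python
# Approximate f^{*n} and the intensity sum_n f^{*n}(x) on (N, +) with
# counting measure. The support is truncated at K and the sum at NMAX;
# f geometric is an assumed choice.
K, NMAX, p = 40, 30, 0.5
f = [(1 - p) * p**k for k in range(K)]

def convolve(f, g):
    # Discrete convolution on (N, +); exact for indices below K.
    return [sum(f[x] * g[z - x] for x in range(z + 1)) for z in range(K)]

power, intensity = f, [0.0] * K
for n in range(NMAX):
    intensity = [s + t for s, t in zip(intensity, power)]  # add f^{*(n+1)}
    power = convolve(power, f)

print([round(v, 4) for v in intensity[:5]])  # approximates E(N_{x}) for small x
```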

If \(X\) has a compound Poisson distribution then for \(n \in \N_+\), \(Y_n\) has a compound Poisson distribution, with the same compounded distribution.

Details:

For \(n \in \N_+\) we can write \(Y_n = \prod_{i=1}^n X_i\) where \(\bs X = (X_1, X_2, \ldots)\) is a sequence of independent copies of \(X\). Since the distribution of \(X\) is compound Poisson, we can write \(X_i = \prod_{j=1}^{N_i} V_{i,j}\) where \(\bs V_i = (V_{i,1}, V_{i,2}, \ldots)\) is a sequence of independent, identically distributed variables and where \(N_i\) is independent of \(\bs V_i\) and has the Poisson distribution with parameter \(a \in (0, \infty)\) (the same for each \(i\), since the \(X_i\) are identically distributed). Since the sequence \(\bs X\) is independent and identically distributed, we can take the collection of random variables \(\{V_{i,j}: i \in \N_+, j \in \N_+\}\) to be independent and identically distributed, and \((N_1, N_2, \ldots, N_n)\) independent of this collection and of each other. By re-indexing the variables we can write \[Y_n = \prod_{i=1}^n \prod_{j=1}^{N_i} V_{i,j} = \prod_{k=1}^N W_k\] where \(\bs W = (W_1, W_2, \ldots)\) is a sequence of independent variables, each with the common distribution of the \(V_{i,j}\), and where \(N = \sum_{i=1}^n N_i\) has the Poisson distribution with parameter \(n a\). So \(Y_n\) has the compound Poisson distribution with the same compounded distribution and Poisson parameter \(n a\).
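Here is a simulation sketch of this closure property on \((\N, +)\), comparing moments of a sum of \(n\) independent compound Poisson variables with a single compound Poisson variable whose parameter is \(n\) times as large; the rate and the geometric compounded distribution are assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)

# Closure check on (N, +): a product (here, sum) of n iid compound Poisson
# variables should again be compound Poisson, with the Poisson parameters
# added. Rate and geometric compounded distribution are assumed choices.
def cp_sample(rate, size):
    counts = rng.poisson(rate, size=size)
    return np.array([rng.geometric(0.5, size=k).sum() for k in counts])

n, rate, m = 3, 1.0, 50_000
y = sum(cp_sample(rate, m) for _ in range(n))   # Y_n as a product of n copies
z = cp_sample(n * rate, m)                      # direct compound Poisson(n * rate)
print(y.mean(), z.mean())    # moments should agree up to simulation error
print(y.var(), z.var())
```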