This section explores uniform distributions in an abstract setting. If you are a new student of probability, or are not familiar with measure theory, you may want to skip this section and read the sections on the uniform distribution on an interval and the discrete uniform distributions.
Suppose that \( (S, \ms S, \lambda) \) is a measure space. That is, \( S \) is a set, \( \ms S \) a \( \sigma \)-algebra of subsets of \( S \), and \( \lambda \) a positive measure on \( \ms S \). Suppose also that \( 0 \lt \lambda(S) \lt \infty \), so that \( \lambda \) is a finite, positive measure.
Random variable \( X \) with values in \( S \) has the uniform distribution on \( S \) (with respect to \( \lambda \)) if \[ \P(X \in A) = \frac{\lambda(A)}{\lambda(S)}, \quad A \in \ms S \]
Thus, the probability assigned to a set \( A \in \ms S\) depends only on the size of \( A \) (as measured by \( \lambda \)).
The most common special cases are as follows:
Discrete: \( S \) is a finite, nonempty set, \( \ms S \) is the \( \sigma \)-algebra of all subsets of \( S \), and \( \lambda = \# \) is counting measure.
Euclidean: for \( n \in \N_+ \), \( S \in \ms R_n \), the \( \sigma \)-algebra of Lebesgue measurable subsets of \( \R^n \), and \( \lambda = \lambda_n \) is \( n \)-dimensional Lebesgue measure.
In the Euclidean case, recall that \( \lambda_1 \) is length measure on \( \R \), \( \lambda_2 \) is area measure on \( \R^2 \), \( \lambda_3 \) is volume measure on \( \R^3 \), and in general \( \lambda_n \) is sometimes referred to as \( n \)-dimensional volume. Thus, \( S \in \ms R_n \) is a set with positive, finite volume.
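Both special cases are easy to compute with directly. The following Python sketch (the specific sets \( S \) and \( A \) are arbitrary choices for illustration) evaluates \( \P(X \in A) = \lambda(A) / \lambda(S) \) with counting measure in the discrete case and with area measure in the planar case, and checks the latter by simulation:

```python
import numpy as np

rng = np.random.default_rng(17)

# Discrete case: S finite, lambda = counting measure #, so
# P(X in A) = #(A) / #(S).
S = ["a", "b", "c", "d", "e", "f"]
A = {"a", "c", "f"}
prob_discrete = len(A) / len(S)           # 3/6 = 0.5

# Euclidean case (n = 2): S a rectangle, lambda_2 = area measure, so
# P(X in A) = area(A) / area(S) for a sub-rectangle A.
area_S = 4.0 * 3.0                        # S = [0, 4] x [0, 3]
area_A = 1.0 * 2.0                        # A = [1, 2] x [0, 2]
prob_euclidean = area_A / area_S          # 2/12 = 1/6

# Empirical check for the Euclidean case via simulation
x = rng.uniform(0, 4, 100_000)
y = rng.uniform(0, 3, 100_000)
empirical = np.mean((1 <= x) & (x <= 2) & (y <= 2))
print(prob_discrete, prob_euclidean, empirical)
```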
Suppose \((S, \ms S, \lambda)\) is a finite, positive measure space, as above, and that \( X \) is uniformly distributed on \( S \).
The probability density function \( f \) of \( X \) (with respect to \( \lambda \)) is \[ f(x) = \frac{1}{\lambda(S)}, \quad x \in S \]
This follows directly from the definition of probability density function: \[\int_A \frac 1 {\lambda(S)} \, d\lambda(x) = \frac{\lambda(A)}{\lambda(S)}, \quad A \in \ms S\]
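As a quick numerical check of the density in the one-dimensional Euclidean case (the interval endpoints below are arbitrary), integrating the constant density \( 1 / \lambda(S) \) over a subinterval recovers the ratio of lengths:

```python
from scipy.integrate import quad

# Uniform density on S = [a, b] with respect to length measure:
# f(x) = 1 / lambda(S) = 1 / (b - a).
a, b = 2.0, 7.0
f = lambda x: 1.0 / (b - a)

# Integrating f over a subinterval A = [c, d] should recover
# lambda(A) / lambda(S) = (d - c) / (b - a).
c, d = 3.0, 5.0
integral, _ = quad(f, c, d)
print(integral, (d - c) / (b - a))   # both 0.4
```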
Thus, the defining property of the uniform distribution on a set is constant density on that set. Another basic property is that uniform distributions are preserved under conditioning.
Suppose that \( R \in \ms S \) with \( \lambda(R) \gt 0 \). The conditional distribution of \( X \) given \( X \in R \) is uniform on \( R \).
For \(A \in \ms S\) with \( A \subseteq R \), \[ \P(X \in A \mid X \in R) = \frac{\P(X \in A)}{\P(X \in R)} = \frac{\lambda(A)/\lambda(S)}{\lambda(R)/\lambda(S)} = \frac{\lambda(A)}{\lambda(R)} \]
In the setting of the previous result, suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of independent variables, each uniformly distributed on \( S \). Let \( N = \min\{n \in \N_+: X_n \in R\} \). Then \( N \) has the geometric distribution on \( \N_+ \) with success parameter \( p = \P(X_1 \in R) = \lambda(R) / \lambda(S) \). More importantly, the distribution of \( X_N \) is the same as the conditional distribution of \( X_1 \) given \( X_1 \in R \), and hence is uniform on \( R \). This is the basis of the rejection method of simulation: if we can simulate a uniform distribution on \( S \), then we can simulate a uniform distribution on \( R \).
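Here is a minimal Python sketch of the rejection method, assuming for illustration that \( S \) is the square \( [-1, 1]^2 \) with area measure and \( R \) is the inscribed unit disk, so that the acceptance probability is \( p = \lambda(R) / \lambda(S) = \pi / 4 \):

```python
import numpy as np

rng = np.random.default_rng(42)

def rejection_sample(n):
    """Simulate n points uniformly on the unit disk R by sampling
    uniformly on the square S = [-1, 1]^2 and keeping each point
    only if it lands in R (the rejection method)."""
    samples = []
    while len(samples) < n:
        x, y = rng.uniform(-1, 1, size=2)
        if x**2 + y**2 <= 1:          # accept: the point fell in R
            samples.append((x, y))
    return np.array(samples)

pts = rejection_sample(10_000)
# The number of trials per accepted point is geometric on N_+ with
# success parameter p = pi/4, so about 4/pi ~ 1.27 trials on average.
print(pts.shape)
```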
If \( h \) is a real-valued function on \( S \), then \( \E[h(X)] \) is the average value of \( h \) on \( S \), as measured by \( \lambda \):
If \( h: S \to \R \) is integrable with respect to \( \lambda \), then \[ \E[h(X)] = \frac{1}{\lambda(S)} \int_S h(x) \, d\lambda(x) \]
This result follows from the change of variables theorem for expected value, since \[ \E[h(X)] = \int_S h(x) f(x) \, d\lambda(x) = \frac 1 {\lambda(S)} \int_S h(x) \, d\lambda(x)\]
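As a quick Monte Carlo illustration (the set \( S \) and the function \( h \) below are arbitrary choices), the sample mean of \( h \) over independent uniform draws from \( S \) approximates the average value above:

```python
import numpy as np

rng = np.random.default_rng(7)

# E[h(X)] for X uniform on S = [0, 2] (lambda = length measure) and
# h(x) = x^2. Exact value: (1/2) * integral from 0 to 2 of x^2 dx = 4/3.
n = 1_000_000
x = rng.uniform(0, 2, n)
estimate = np.mean(x**2)
print(estimate, 4 / 3)   # the estimate should be close to 1.3333...
```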
The entropy of the uniform distribution on \( S \) depends only on the size of \( S \), as measured by \( \lambda \):
The entropy of \( X \) is \( H(X) = \ln[\lambda(S)] \).
This follows from the previous result applied to \( h = -\ln f \): since \( f \) is constant on \( S \), \[ H(X) = \E\{-\ln[f(X)]\} = -\ln\left[\frac{1}{\lambda(S)}\right] = \ln[\lambda(S)] \] In particular, the entropy is negative if \( \lambda(S) \lt 1 \).
One last trivial note: every probability distribution is uniform with respect to itself.
Suppose that \((S, \ms S)\) is a measurable space and that \(X\) is a random variable defined on a probability space \((\Omega, \ms F, \P)\) with values in \(S\). Recall that the probability distribution of \(X\) is the probability measure \(P\) on \((S, \ms S)\) given by \(P(A) = \P(X \in A)\) for \(A \in \ms S\). Random variable \(X\) is uniformly distributed on \((S, \ms S, P)\) since \[ \P(X \in A) = \frac{P(A)}{P(S)}, \quad A \in \ms S\]
So uniform distributions are mainly of interest when there is a fixed reference measure space in the background, such as the discrete and Euclidean spaces described above.
Suppose now that \( (S, \ms S, \lambda) \) and \( (T, \ms T, \mu) \) are finite, positive measure spaces, so that \( 0 \lt \lambda(S) \lt \infty \) and \( 0 \lt \mu(T) \lt \infty \). Recall the product space is denoted \( (S \times T, \ms S \times \ms T, \lambda \times \mu) \). The product \( \sigma \)-algebra \( \ms S \times \ms T \) is the \( \sigma \)-algebra of subsets of \( S \times T \) generated by product sets \( A \times B \) where \( A \in \ms S \) and \( B \in \ms T \). The product measure \( \lambda \times \mu \) is the unique positive measure on \( (S \times T, \ms S \times \ms T) \) that satisfies \( (\lambda \times \mu)(A \times B) = \lambda(A) \mu(B) \) for \( A \in \ms S \) and \( B \in \ms T \).
Suppose that \( X \) and \( Y \) are random variables with values in \( S \) and \( T \), respectively. Then \( (X, Y) \) is uniformly distributed on \( S \times T \) if and only if \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent.
Suppose first that \( (X, Y) \) is uniformly distributed on \( S \times T\). If \( A \in \ms S \) and \( B \in \ms T \) then \[ \P(X \in A, Y \in B) = \P[(X, Y) \in A \times B] = \frac{(\lambda \times \mu)(A \times B)}{(\lambda \times \mu)(S \times T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} \] Taking \( B = T \) in the displayed equation gives \( \P(X \in A) = \lambda(A) \big/ \lambda(S) \) for \( A \in \ms S \), so \( X \) is uniformly distributed on \( S \). Taking \( A = S \) gives \( \P(Y \in B) = \mu(B) \big/ \mu(T) \) for \( B \in \ms T \), so \( Y \) is uniformly distributed on \( T \). The displayed equation with general \( A \) and \( B \) then gives \( \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) \), so \( X \) and \( Y \) are independent.
Conversely, suppose that \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent. Then for \( A \in \ms S \) and \( B \in \ms T \), \[ \P[(X, Y) \in A \times B] = \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{(\lambda \times \mu)(A \times B)}{(\lambda \times \mu)(S \times T)} \] It then follows (see the section on existence and uniqueness of measures) that \( \P[(X, Y) \in C] = (\lambda \times \mu)(C) / (\lambda \times \mu)(S \times T) \) for every \( C \in \ms S \times \ms T \), so \( (X, Y) \) is uniformly distributed on \( S \times T \).
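The equivalence is easy to see numerically. The following Python sketch (the rectangles and sub-rectangles are arbitrary choices for illustration) simulates \( X \) and \( Y \) independently and checks that the empirical probability of a product set matches \( (\lambda \times \mu)(A \times B) \big/ (\lambda \times \mu)(S \times T) \):

```python
import numpy as np

rng = np.random.default_rng(3)

# X uniform on S = [0, 2], Y uniform on T = [0, 3], independent, so
# (X, Y) should be uniform on the rectangle S x T.
n = 1_000_000
x = rng.uniform(0, 2, n)
y = rng.uniform(0, 3, n)

# For a product set A x B, compare the empirical frequency with
# (lambda x mu)(A x B) / (lambda x mu)(S x T) = len(A) * len(B) / 6.
A = (0.5, 1.5)   # subinterval of S, length 1
B = (1.0, 2.0)   # subinterval of T, length 1
empirical = np.mean((A[0] <= x) & (x <= A[1]) & (B[0] <= y) & (y <= B[1]))
print(empirical, (1 * 1) / (2 * 3))   # both near 1/6
```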