Absolute Continuity and Density Functions

Our starting point is a measurable space \( (S, \ms S) \) so that \( S \) is a set and \( \ms S \) is a \( \sigma \)-algebra of subsets of \( S \). In the last section, we discussed general measures on \( (S, \ms S) \) that can take positive and negative values. Special cases are positive measures, finite measures, and our favorite kind, probability measures. In particular, we studied properties of general measures, ways to construct them, special sets (positive, negative, and null), and the Hahn and Jordan decompositions.

In this section, we see how to construct a new measure from a given positive measure using a density function, and we answer the fundamental question of when a measure has a density function relative to the given positive measure.

Relations on Measures

The answer to the question involves two important relations on the collection of measures on \( (S, \ms S) \) that are defined in terms of null sets. Recall that \( A \in \ms S \) is null for a measure \( \mu \) on \( (S, \ms S) \) if \( \mu(B) = 0 \) for every \( B \in \ms S \) with \( B \subseteq A \). At the other extreme, \( A \in \ms S \) is a support set for \( \mu \) if \( A^c \) is a null set. Here are the basic definitions:

Suppose that \( \mu \) and \( \nu \) are measures on \( (S, \ms S) \).

\( \nu \) is absolutely continuous with respect to \( \mu \) if every null set of \( \mu \) is also a null set of \( \nu \). We write \( \nu \ll \mu \).
\( \mu \) and \( \nu \) are mutually singular if there exists \( A \in \ms S \) such that \( A \) is null for \( \mu \) and \( A^c \) is null for \( \nu \). We write \( \mu \perp \nu \).

So \( \nu \ll \mu \) if every support set of \( \mu \) is a support set of \( \nu \). At the opposite end, \( \mu \perp \nu \) if \( \mu \) and \( \nu \) have disjoint support sets.
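As an illustration of the two relations, here is a minimal sketch in Python for the special case of a finite set \( S \) with the \( \sigma \)-algebra of all subsets. The representation of a measure as a dictionary of point masses and the helper names `null_points`, `is_abs_continuous`, and `are_singular` are assumptions made for this example, not notation from the text. In this setting a set is null exactly when each of its points has mass zero, so both relations reduce to comparisons of sets of points.

```python
from fractions import Fraction

def null_points(m, space):
    """Points of `space` whose point mass under the measure m is zero."""
    return {x for x in space if m.get(x, 0) == 0}

def is_abs_continuous(nu, mu, space):
    """nu << mu: every null set of mu is also a null set of nu.

    On a finite space with all subsets measurable, a set is null exactly
    when every point in it has zero mass, so it suffices to compare the
    sets of zero-mass points.
    """
    return null_points(mu, space) <= null_points(nu, space)

def are_singular(mu, nu, space):
    """mu perp nu: some set A is null for mu while A^c is null for nu."""
    a = null_points(mu, space)        # the largest null set of mu
    complement = set(space) - a
    return complement <= null_points(nu, space)

space = {1, 2, 3, 4}
mu = {1: Fraction(1, 2), 2: Fraction(1, 2)}   # supported on {1, 2}
nu = {1: Fraction(3)}                         # supported on {1}
rho = {3: Fraction(1), 4: Fraction(-2)}       # signed, supported on {3, 4}

print(is_abs_continuous(nu, mu, space))   # nu << mu
print(is_abs_continuous(mu, nu, space))   # mu << nu fails
print(are_singular(mu, rho, space))       # mu and rho have disjoint supports
```

In this example \( \nu \ll \mu \) but not conversely, so the two measures are not equivalent, while \( \mu \) and \( \rho \) are mutually singular.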

Suppose that \( \mu \), \( \nu \), and \( \rho \) are measures on \( (S, \ms S)\). Then

\( \mu \ll \mu \), the reflexive property.
If \( \mu \ll \nu \) and \( \nu \ll \rho \) then \( \mu \ll \rho \), the transitive property.

Recall that every relation that is reflexive and transitive leads to an equivalence relation, and then in turn, the original relation can be extended to a partial order on the collection of equivalence classes. This general theorem on relations leads to the following two results.

Measures \( \mu \) and \( \nu \) on \( (S, \ms S) \) are equivalent if \( \mu \ll \nu \) and \( \nu \ll \mu \), and we write \( \mu \equiv \nu \). The relation \(\equiv\) is an equivalence relation on the collection of measures on \((S, \ms S)\). That is, if \( \mu \), \( \nu \), and \( \rho \) are measures on \( (S, \ms S) \) then

\( \mu \equiv \mu \), the reflexive property
If \( \mu \equiv \nu \) then \( \nu \equiv \mu \), the symmetric property
If \( \mu \equiv \nu \) and \( \nu \equiv \rho \) then \( \mu \equiv \rho \), the transitive property

So \( \mu \) and \( \nu \) are equivalent if they have the same null sets and thus the same support sets. This equivalence relation is rather weak: equivalent measures have the same support sets, but the values assigned to these sets can be very different. As usual, we will write \( [\mu] \) for the equivalence class of a measure \( \mu \) on \( (S, \ms S) \), under the equivalence relation \( \equiv \).

If \( \mu \) and \( \nu \) are measures on \( (S, \ms S) \), we write \( [\mu] \preceq [\nu] \) if \( \mu \ll \nu \). The definition is consistent, and defines a partial order on the collection of equivalence classes. That is, if \( \mu \), \( \nu \), and \( \rho \) are measures on \( (S, \ms S) \) then

\( [\mu] \preceq [\mu] \), the reflexive property.
If \( [\mu] \preceq [\nu] \) and \( [\nu] \preceq [\mu] \) then \( [\mu] = [\nu] \), the antisymmetric property.
If \( [\mu] \preceq [\nu] \) and \( [\nu] \preceq [\rho] \) then \( [\mu] \preceq [\rho] \), the transitive property

Suppose that \( \mu \) and \( \nu \) are measures on \( (S, \ms S) \). Then

If \( \mu \perp \nu \) then \( \nu \perp \mu \), the symmetric property.
\( \mu \perp \mu \) if and only if \( \mu = \bs 0 \), the zero measure.

Details:

Part (a) is trivial from the symmetry of the definition. For part (b), note that \( S \) is null for \( \bs 0 \) and \( \emptyset \) is null for \( \bs 0 \), so \( \bs 0 \perp \bs 0 \). Conversely, suppose that \( \mu \) is a measure and \( \mu \perp \mu \). Then there exists \( A \in \ms S \) such that \( A \) is null for \( \mu \) and \( A^c \) is null for \( \mu \). But then \( S = A \cup A^c \) is null for \( \mu \), so \( \mu(B) = 0 \) for every \( B \in \ms S \), that is, \( \mu = \bs 0 \).

Absolute continuity and singularity are preserved under multiplication by nonzero constants.

Suppose that \( \mu \) and \( \nu \) are measures on \( (S, \ms S) \) and that \( a, \, b \in \R \setminus \{0\} \). Then

\( \nu \ll \mu \) if and only if \( a \nu \ll b \mu \).
\( \nu \perp \mu \) if and only if \( a \nu \perp b \mu \).

Details:

Recall that if \( c \ne 0 \), then \( A \in \ms S \) is null for \( \mu \) if and only if \( A \) is null for \( c \mu \).

Suppose that \( \mu \) is a measure on \( (S, \ms S) \) and that \( \nu_i \) is a measure on \( (S, \ms S) \) for each \( i \) in a countable index set \( I \). Suppose also that \( \nu = \sum_{i \in I} \nu_i \) is a well-defined measure on \( (S, \ms S) \).

If \( \nu_i \ll \mu \) for every \( i \in I \) then \( \nu \ll \mu \).
If \( \nu_i \perp \mu \) for every \( i \in I \) then \( \nu \perp \mu \).

Details:

Recall that if \( A \in \ms S \) is null for \( \nu_i \) for each \(i \in I \), then \( A \) is null for \( \nu = \sum_{i \in I} \nu_i \), assuming that this is a well-defined measure.

As before, note that \( \nu = \sum_{i \in I} \nu_i \) is well defined if \( \nu_i \) is a positive measure for each \( i \in I \) or if \( I \) is finite and \( \nu_i \) is a finite measure for each \( i \in I \). We close this subsection with a couple of results that involve both the absolute continuity relation and the singularity relation.

Suppose that \( \mu \), \( \nu \), and \( \rho \) are measures on \( (S, \ms S) \). If \( \nu \ll \mu \) and \( \mu \perp \rho \) then \( \nu \perp \rho \).

Details:

Since \( \mu \perp \rho \), there exists \( A \in \ms S \) such that \( A \) is null for \( \mu \) and \( A^c \) is null for \( \rho \). But \( \nu \ll \mu \) so \( A \) is null for \( \nu \). Hence \( \nu \perp \rho \).

Suppose that \( \mu \) and \( \nu \) are measures on \( (S, \ms S) \). If \( \nu \ll \mu \) and \( \nu \perp \mu \) then \( \nu = \bs 0 \).

Details:

From the previous result (with \( \rho = \nu \)) we have \( \nu \perp \nu \), and hence \( \nu = \bs 0 \) by the earlier result on mutual singularity.

Density Functions

We are now ready for our study of density functions. Throughout this subsection, we assume that \( \mu \) is a positive, \( \sigma \)-finite measure on our measurable space \( (S, \ms S) \). Recall that if \(f: S \to \R\) is measurable, then the integral of \(f\) with respect to \(\mu\) may exist as a number in \(\R^* = \R \cup \{-\infty, \infty\}\) (in which case \(f\) is integrable) and in particular may exist as a number in \(\R\) (in which case \(f\) is absolutely integrable). Of course, the integral may also fail to exist.

Suppose that \( f: S \to \R \) is integrable with respect to \( \mu \). Then the function \( \nu \) defined by \[ \nu(A) = \int_A f \, d\mu, \quad A \in \ms S \] is a \( \sigma \)-finite measure on \( (S, \ms S) \) that is absolutely continuous with respect to \( \mu \). The function \( f \) is a density function of \( \nu \) relative to \( \mu \).

Details:

Recall that \(f\) is integrable if \( \int_S f^+ \, d \mu \lt \infty \) or \( \int_S f^- \, d\mu \lt \infty \), where as usual, \( f^+ \) and \( f^- \) are the positive and negative parts of \( f \). So \( \nu(A) = \nu_+(A) - \nu_-(A) \) for \( A \in \ms S \) where \( \nu_+(A) = \int_A f^+ \, d\mu \) and \( \nu_-(A) = \int_A f^- \, d\mu \). Both \( \nu_+ \) and \( \nu_- \) are positive measures by basic properties of the integral: Generically, suppose \( g: S \to [0, \infty) \) is measurable. The integral over the empty set is always 0, so \( \int_\emptyset g \, d\mu = 0 \). Next, if \( \{A_i: i \in I\} \) is a countable, disjoint collection of sets in \( \ms S \) and \( A = \bigcup_{i \in I} A_i \), then by the additivity property of the integral over disjoint domains, \[ \int_A g \, d\mu = \sum_{i \in I} \int_{A_i} g \, d\mu \] By the assumption that \(f\) is integrable, either \( \nu_+ \) or \( \nu_- \) is a finite positive measure, and hence \( \nu \) is a measure. As you might guess, \( \nu_+ \) and \( \nu_- \) form the Jordan decomposition of \( \nu \), a point that we will revisit below.

Again, either \( \nu_+ \) or \( \nu_- \) is a finite measure. By symmetry, let's suppose that \( \nu_- \) is finite. Then to show that \( \nu \) is \( \sigma \)-finite, we just need to show that \( \nu_+ \) is \( \sigma \)-finite. Since \( \mu \) has this property, there exists a collection \( \{A_n: n \in \N_+\} \) with \( A_n \in \ms S \), \( \mu(A_n) \lt \infty \), and \( \bigcup_{n=1}^\infty A_n = S \). Let \( B_n = \{x \in S: f^+(x) \le n\} \) for \( n \in \N_+ \). Then \( B_n \in \ms S \) for \( n \in \N_+ \) and \( \bigcup_{n=1}^\infty B_n = S \). Hence \( \{A_m \cap B_n: (m, n) \in \N_+^2\} \) is a countable collection of measurable sets whose union is also \( S \). Moreover, \[ \nu_+(A_m \cap B_n) = \int_{A_m \cap B_n} f^+ d\mu \le n \mu(A_m \cap B_n) \lt \infty \] Finally, suppose \( A \in \ms S \) is a null set of \( \mu \). If \( B \in \ms S \) and \( B \subseteq A \) then \( \mu(B) = 0 \) so \( \nu(B) = \int_B f \, d\mu = 0 \). Hence \( \nu \ll \mu \).
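The construction can be checked by hand in a small case. The following sketch assumes a finite set \( S = \{0, 1, 2, 3\} \) with all subsets measurable, so that the integral reduces to a weighted sum; the dictionary `mu` of point masses, the density `f`, and the helper `measure_from_density` are hypothetical choices for this example only.

```python
from fractions import Fraction

def measure_from_density(f, mu):
    """Return nu with nu(A) = sum over x in A of f(x) * mu{x}.

    This is the integral of f over A with respect to mu when the space
    is finite and mu is given by its point masses.
    """
    def nu(A):
        return sum((f(x) * mu[x] for x in A), Fraction(0))
    return nu

# A positive measure with one null point (the point 2).
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(0), 3: Fraction(1)}

def f(x):
    """A density that takes both signs."""
    return Fraction(x - 1)

nu = measure_from_density(f, mu)

print(nu({2}))                              # 0, since {2} is null for mu
print(nu({0, 3}), nu({0}) + nu({3}))        # additivity over disjoint sets
print(nu({0, 1, 2, 3}))                     # total mass
```

The null set \( \{2\} \) of \( \mu \) receives measure 0 under \( \nu \) even though \( f(2) \ne 0 \), which is the absolute continuity property in miniature.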

In the context of the previous theorem,

If \( f \) is nonnegative (so that the integral exists in \(\R \cup \{\infty\}\)) then \( \nu \) is a positive measure since \( \nu(A) \ge 0 \) for \( A \in \ms S \).
If \( f \) is absolutely integrable (so that the integral exists in \(\R\)), then \( \nu \) is a finite measure since \( \nu(A) \in \R \) for \( A \in \ms S \).
If \( f \) is nonnegative and \( \int_S f \, d\mu = 1 \) then \( \nu \) is a probability measure since \( \nu(A) \ge 0 \) for \( A \in \ms S \) and \( \nu(S) = 1 \).

In part (c) of the previous result, \( f \) is the probability density function of \( \nu \) relative to \( \mu \), our favorite kind of density function. When they exist, density functions are essentially unique.

Suppose that \( \nu \) is a \( \sigma \)-finite measure on \( (S, \ms S) \) and that \( \nu \) has density function \( f \) with respect to \( \mu \). Then \( g: S \to \R \) is a density function of \( \nu \) with respect to \( \mu \) if and only if \( f = g \) almost everywhere on \( S \) with respect to \( \mu \).

Details:

These results also follow from basic properties of the integral. Suppose that \( f, \, g: S \to \R \) are integrable with respect to \( \mu \). If \( g = f \) almost everywhere on \( S \) with respect to \( \mu \) then \( \int_A f \, d\mu = \int_A g \, d\mu \) for every \( A \in \ms S \). Hence if \( f \) is a density function for \( \nu \) with respect to \( \mu \) then so is \( g \). For the converse, if \( \int_A f \, d\mu = \int_A g \, d\mu \) for every \( A \in \ms S \), then since \( \mu \) is \( \sigma \)-finite, it follows that \( f = g \) almost everywhere on \( S \) with respect to \( \mu \).

The essential uniqueness of density functions can fail if the positive measure space \( (S, \ms S, \mu) \) is not \( \sigma \)-finite. A simple counterexample is given in the Counterexamples subsection below. Our next result answers the question of when a measure has a density function with respect to \( \mu \), and is the fundamental theorem of this section. The theorem is in two parts: Part (a) is the Lebesgue decomposition theorem, named for our old friend Henri Lebesgue. Part (b) is the Radon-Nikodym theorem, named for Johann Radon and Otto Nikodym. We combine the theorems because our proofs of the two results are inextricably linked.

Suppose that \( \nu \) is a \( \sigma \)-finite measure on \( (S, \ms S) \).

Lebesgue Decomposition Theorem. \( \nu \) can be uniquely decomposed as \( \nu = \nu_c + \nu_s \) where \( \nu_c \ll \mu \) and \( \nu_s \perp \mu \).
Radon-Nikodym Theorem. \( \nu_c \) has a density function with respect to \( \mu \).

Details:

The proof proceeds in stages. We first prove the result for finite, positive measures, then for \( \sigma \)-finite, positive measures, and finally for general \( \sigma \)-finite measures. The first stage is the most complicated.

Part 1. Suppose that \( \mu \) and \( \nu \) are positive, finite measures. Let \( \ms{F} \) denote the collection of measurable functions \( g: S \to [0, \infty) \) with \( \int_A g \, d\mu \le \nu(A) \) for all \( A \in \ms S \). Note that \( \ms{F} \ne \emptyset\) since the constant function \( 0 \) is in \( \ms{F} \). The proof works by finding a maximal element of \( \ms{F} \) and using this function as the density function of the absolutely continuous part of \( \nu \).

Our first step is to show that \( \ms{F} \) is closed under the max operator. Let \( g_1, \; g_2 \in \ms{F} \). For \( A \in \ms S \), let \( A_1 = \{x \in A: g_1(x) \ge g_2(x)\} \) and \( A_2 = \{x \in A: g_1(x) \lt g_2(x)\} \). Then \( A_1, \; A_2 \in \ms S \) partition \( A \) so \[ \int_A \max\{g_1, g_2\} \, d\mu = \int_{A_1} \max\{g_1, g_2\} \, d\mu + \int_{A_2} \max\{g_1, g_2\} d\mu = \int_{A_1} g_1 \, d\mu + \int_{A_2} g_2 \, d\mu \le \nu(A_1) + \nu(A_2) = \nu(A) \] Hence \( \max\{g_1, g_2\} \in \ms{F} \).

Our next step is to show that \( \ms{F} \) is closed with respect to increasing limits. Thus suppose that \( g_n \in \ms{F} \) for \( n \in \N_+ \) and that \( g_n \) is increasing in \( n \) on \( S \). Let \( g = \lim_{n \to \infty} g_n \). Then \( g: S \to [0, \infty] \) is measurable, and by the monotone convergence theorem, \( \int_A g \, d\mu = \lim_{n \to \infty} \int_A g_n \, d\mu \) for every \( A \in \ms S \). But \( \int_A g_n \, d\mu \le \nu(A) \) for every \( n \in \N_+ \) so \( \int_A g \, d\mu \le \nu(A) \). In particular, \( \int_S g \, d\mu \le \nu(S) \lt \infty \) so \( g \lt \infty \) almost everywhere on \( S \) with respect to \( \mu \). Thus, by redefining \( g \) on a \( \mu \)-null set if necessary, we can assume \( g \lt \infty \) on \( S \). Hence \( g \in \ms{F} \).

Now let \( \alpha = \sup\left\{\int_S g \, d\mu: g \in \ms{F}\right\} \). Note that \( \alpha \le \nu(S) \lt \infty\). By definition of the supremum, for each \( n \in \N_+ \) there exist \( g_n \in \ms{F} \) such that \( \int_S g_n \, d\mu \gt \alpha - \frac{1}{n} \). Now let \( f_n = \max\{g_1, g_2, \ldots, g_n\} \) for \( n \in \N_+ \). Then \( f_n \in \ms{F} \) and \( f_n \) is increasing in \( n \in \N_+ \) on \( S \). Hence \( f = \lim_{n \to \infty} f_n \in \ms{F} \) and \( \int_S f \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \). But \( \int_S f_n \, d\mu \ge \int_S g_n \, d\mu \gt \alpha - \frac{1}{n} \) for each \( n \in \N_+ \) and hence \( \int_S f \, d\mu \ge \alpha \).

Define \( \nu_c(A) = \int_A f \, d\mu \) and \( \nu_s(A) = \nu(A) - \nu_c(A) \) for \( A \in \ms S \). Then \( \nu_c \) and \( \nu_s \) are finite, positive measures and by our previous theorem, \( \nu_c \) is absolutely continuous with respect to \( \mu \) and has density function \( f \). Our next step is to show that \( \nu_s \) is singular with respect to \( \mu \). For \( n \in \N_+ \), let \( (P_n, P_n^c) \) denote a Hahn decomposition of the measure \( \nu_s - \frac{1}{n} \mu \). Then \[ \int_A \left(f + \frac{1}{n} \bs{1}_{P_n}\right) \, d\mu = \nu_c(A) + \frac{1}{n} \mu(P_n \cap A) = \nu(A) - \left[\nu_s(A) - \frac{1}{n} \mu(P_n \cap A)\right] \] But \( \nu_s(A) - \frac{1}{n} \mu(P_n \cap A) \ge \nu_s(A \cap P_n) - \frac{1}{n} \mu(A \cap P_n) \ge 0 \) since \( \nu_s \) is a positive measure and \( P_n \) is positive for \( \nu_s - \frac{1}{n} \mu \). Thus we have \( \int_A \left(f + \frac{1}{n} \bs{1}_{P_n} \right) \, d\mu \le \nu(A) \) for every \( A \in \ms S \), so \( f + \frac{1}{n} \bs{1}_{P_n} \in \ms{F} \) for every \( n \in \N_+ \). If \( \mu(P_n) \gt 0 \) then \( \int_S \left(f + \frac{1}{n} \bs{1}_{P_n}\right) \, d\mu = \alpha + \frac{1}{n} \mu(P_n) \gt \alpha \), which contradicts the definition of \( \alpha \). Hence we must have \( \mu(P_n) = 0 \) for every \( n \in \N_+ \). Now let \( P = \bigcup_{n=1}^\infty P_n \). Then \( \mu(P) = 0 \). If \( \nu_s(P^c) \gt 0 \) then \( \nu_s(P^c) - \frac{1}{n} \mu(P^c) \gt 0 \) for \( n \) sufficiently large. But this is a contradiction since \( P^c \subseteq P_n^c \) which is negative for \( \nu_s - \frac{1}{n} \mu \) for every \( n \in \N_+ \). Thus we must have \( \nu_s(P^c) = 0 \), so \( \mu \) and \( \nu_s \) are singular.

Part 2. Suppose that \( \mu \) and \( \nu \) are \( \sigma \)-finite, positive measures. Then there exists a countable partition \( \{S_i: i \in I\} \) of \( S \) where \( S_i \in \ms S \) for \( i \in I \), and \( \mu(S_i) \lt \infty \) and \( \nu(S_i) \lt \infty \) for \( i \in I \). Let \( \mu_i(A) = \mu(A \cap S_i) \) and \( \nu_i(A) = \nu(A \cap S_i) \) for \( i \in I \). Then \( \mu_i \) and \( \nu_i \) are finite, positive measures for \( i \in I \), and \( \mu = \sum_{i \in I} \mu_i \) and \( \nu = \sum_{i \in I} \nu_i \). By part 1, for each \( i \in I \), there exists a measurable function \( f_i: S \to [0, \infty) \) such that \( \nu_i = \nu_{i,c} + \nu_{i,s} \) where \( \nu_{i, c}(A) = \int_A f_i \, d\mu_i = \int_A \bs{1}_{S_i} f_i \, d\mu \) for \( A \in \ms S \) and \( \nu_{i,s} \perp \mu_i \). Since \( \nu_{i,s} \) is concentrated on \( S_i \), it follows that \( \nu_{i,s} \perp \mu \) as well. Let \( f = \sum_{i \in I} \bs{1}_{S_i} f_i \). Then \( f: S \to [0, \infty) \) is measurable. Define \( \nu_c(A) = \int_A f \, d\mu \) and \( \nu_s(A) = \nu(A) - \nu_c(A) \) for \( A \in \ms S \). Note that \( \nu_c = \sum_{i \in I} \nu_{i,c} \) and \( \nu_s = \sum_{i \in I} \nu_{i,s} \). Then \( \nu_c \ll \mu \) and has density function \( f \) and \( \nu_s \perp \mu \).

Part 3. Suppose that \( \nu \) is a \( \sigma \)-finite measure (not necessarily positive). By the Jordan decomposition theorem, \( \nu = \nu_+ - \nu_- \) where \( \nu_+ \) and \( \nu_- \) are \( \sigma \)-finite, positive measures, and at least one is finite. By part 2, there exist measurable functions \( f_+: S \to [0, \infty) \) and \( f_-: S \to [0, \infty) \) such that \( \nu_+ = \nu_{+,c} + \nu_{+,s} \) and \( \nu_- = \nu_{-,c} + \nu_{-,s} \) where \( \nu_{+,c}(A) = \int_A f_+ \, d\mu \), \( \nu_{-,c}(A) = \int_A f_- \, d\mu \) for \( A \in \ms S \), and \( \nu_{+,s} \perp \mu \), \( \nu_{-,s} \perp \mu \). Let \( f = f_+ - f_- \), \( \nu_c(A) = \int_A f \, d\mu \), \(\nu_s(A) = \nu(A) - \nu_c(A) \) for \( A \in \ms S \). Then \( \nu = \nu_c + \nu_s \) and \( \nu_s = \nu_{+,s} - \nu_{-,s} \perp \mu \).

Uniqueness. Suppose that \( \nu = \nu_{c,1} + \nu_{s,1} = \nu_{c,2} + \nu_{s,2} \) where \( \nu_{c,i} \ll \mu \) and \( \nu_{s,i} \perp \mu \) for \( i \in \{1, 2\} \). Then \( \nu_{c,1} - \nu_{c,2} = \nu_{s,2} - \nu_{s,1} \). But \( \nu_{c,1} - \nu_{c,2} \ll \mu \) and \( \nu_{s,2} - \nu_{s,1} \perp \mu \) so \( \nu_{c,1} - \nu_{c,2} = \nu_{s,2} - \nu_{s,1} = \bs 0 \) by the earlier result that a measure both absolutely continuous and singular with respect to \( \mu \) is the zero measure.
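On a finite set with all subsets measurable, the decomposition can be written down directly, without the maximality argument of the proof. The sketch below is an illustration under that assumption: the absolutely continuous part lives on the points where \( \mu \) has positive mass, with density \( \nu\{x\} / \mu\{x\} \), and the singular part lives on the null points of \( \mu \). The function name `lebesgue_decomposition` and the sample measures are hypothetical.

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu, space):
    """Split nu relative to a positive measure mu on a finite space.

    Returns (f, nu_c, nu_s) as dictionaries of point values, where f is a
    density of nu_c with respect to mu and nu_s is singular to mu.
    """
    zero = Fraction(0)
    support = {x for x in space if mu.get(x, 0) > 0}
    # Density: ratio of point masses where mu is positive, 0 elsewhere.
    f = {x: (nu.get(x, zero) / mu[x] if x in support else zero) for x in space}
    nu_c = {x: f[x] * mu.get(x, zero) for x in space}
    nu_s = {x: nu.get(x, zero) - nu_c[x] for x in space}
    return f, nu_c, nu_s

space = {"a", "b", "c"}
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4)}                   # "c" is null
nu = {"a": Fraction(1), "b": Fraction(-1, 2), "c": Fraction(3)}   # signed

f, nu_c, nu_s = lebesgue_decomposition(nu, mu, space)
print(f)      # density of the absolutely continuous part
print(nu_c)   # mass only where mu is positive
print(nu_s)   # mass only on the null set {"c"} of mu
```

The value chosen for \( f \) on the null set \( \{c\} \) is arbitrary, which reflects the fact that a density function is unique only up to a \( \mu \)-null set.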

In particular, a measure \( \nu \) on \( (S, \ms S) \) has a density function with respect to \( \mu \) if and only if \( \nu \ll \mu \). The density function in this case is also referred to as the Radon-Nikodym derivative of \( \nu \) with respect to \( \mu \) and is sometimes written in derivative notation as \( d\nu / d\mu \). This notation, however, can be a bit misleading because we need to remember that a density function is unique only up to a \( \mu \)-null set. Also, the Radon-Nikodym theorem can fail if the positive measure space \( (S, \ms S, \mu) \) is not \( \sigma \)-finite. Counterexamples are given in the Counterexamples subsection below. Next we characterize the Hahn decomposition of \(\nu\) and the Jordan decomposition of \(\nu\) in terms of the density function.

Suppose that \( \nu \) is a measure on \( (S, \ms S) \) with \( \nu \ll \mu \), and that \( \nu \) has density function \( f \) with respect to \( \mu \). Let \( P = \{x \in S: f(x) \ge 0\} \), and let \( f^+ \) and \( f^- \) denote the positive and negative parts of \( f \).

A Hahn decomposition of \( \nu \) is \( (P, P^c) \).
The Jordan decomposition is \( \nu = \nu_+ - \nu_- \) where \( \nu_+(A) = \int_A f^+ \, d\mu \) and \( \nu_-(A) = \int_A f^- \, d\mu\), for \( A \in \ms S \).

Details:

Of course \(P^c = \{x \in S: f(x) \lt 0\}\). The proofs are simple.

Suppose that \(A \in \ms S\). If \(A \subseteq P\) then \(f(x) \ge 0\) for \(x \in A\) and hence \(\nu(A) = \int_A f \, d\mu \ge 0\). If \(A \subseteq P^c\) then \(\nu(A) = \int_A f \, d\mu \le 0\).
This follows immediately from (a) and the Jordan decomposition theorem, since \(\nu_+(A) = \nu(A \cap P)\) and \(\nu_-(A) = -\nu(A \cap P^c)\) for \(A \in \ms S\). Note that \( f^+ = \bs 1_P f \) and \( f^- = -\bs 1_{P^c} f \).
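The two parts of this result are easy to see in a finite example. The following sketch assumes a three-point space with all subsets measurable; the helper `hahn_jordan` and the particular values of `mu` and `f` are hypothetical choices for illustration.

```python
from fractions import Fraction

def hahn_jordan(f, mu, space):
    """Hahn set P and Jordan parts of the measure with density f relative to mu."""
    zero = Fraction(0)
    P = {x for x in space if f[x] >= 0}
    nu_plus = {x: max(f[x], zero) * mu[x] for x in space}    # density f^+
    nu_minus = {x: max(-f[x], zero) * mu[x] for x in space}  # density f^-
    return P, nu_plus, nu_minus

space = {1, 2, 3}
mu = {1: Fraction(1, 2), 2: Fraction(2), 3: Fraction(5)}
f = {1: Fraction(2), 2: Fraction(-3), 3: Fraction(0)}

P, nu_plus, nu_minus = hahn_jordan(f, mu, space)
print(P)          # points where f >= 0
print(nu_plus)    # positive part, concentrated on P
print(nu_minus)   # negative part, concentrated on the complement of P
```

Here \( \nu_+ \) vanishes off \( P \) and \( \nu_- \) vanishes on \( P \), so the two parts are mutually singular, as the Jordan decomposition requires.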

Suppose that \( \nu \) is a positive measure on \( (S, \ms S) \) with \( \nu \ll \mu \) and that \( \nu \) has density function \( f \) with respect to \( \mu \). If \( g: S \to \R \) is integrable with respect to \( \nu \) then \[ \int_S g \, d\nu = \int_S g f \, d\mu \]

Details:

The proof is a classical bootstrapping argument. Suppose first that \( g = \sum_{i \in I} a_i \bs{1}_{A_i} \) is a nonnegative simple function. That is, \( I \) is a finite index set, \( a_i \in [0, \infty) \) for \( i \in I \), and \( \{A_i: i \in I\} \) is a disjoint collection of sets in \( \ms S \). Then \( \int_S g \, d\nu = \sum_{i \in I} a_i \nu(A_i) \). But \( \nu(A_i) = \int_{A_i} f \, d\mu = \int_S \bs{1}_{A_i} f \, d\mu \) for each \( i \in I \) so \[ \int_S g \, d\nu = \sum_{i \in I} a_i \int_S \bs{1}_{A_i} f \, d\mu = \int_S \left(\sum_{i \in I} a_i \bs{1}_{A_i}\right) f \, d\mu = \int_S g f \, d\mu \] Suppose next that \( g: S \to [0, \infty) \) is measurable. There exists a sequence of nonnegative simple functions \( (g_1, g_2, \ldots) \) such that \( g_n \) is increasing in \( n \in \N_+ \) on \( S \) and \( g_n \to g \) as \( n \to \infty \) on \( S \). Since \( f \) is nonnegative, \( g_n f \) is increasing in \( n \in \N_+ \) on \( S \) and \( g_n f \to g f \) as \( n \to \infty \) on \( S \). By the first step, \( \int_S g_n \, d\nu = \int_S g_n f \, d\mu \) for each \( n \in \N_+ \). But by the monotone convergence theorem, \( \int_S g_n \, d\nu \to \int_S g \, d\nu \) and \( \int_S g_n f \, d\mu \to \int_S g f \, d\mu \) as \( n \to \infty \). Hence \( \int_S g \, d\nu = \int_S g f \, d\mu \).

Finally, suppose that \( g: S \to \R \) is a measurable function whose integral with respect to \( \nu \) exists. By the previous step, \( \int_S g^+ \, d\nu = \int_S g^+ f \, d\mu \) and \( \int_S g^- \, d\nu = \int_S g^- f \, d\mu \), and at least one of these integrals is finite. Hence by the additive property \[ \int_S g \, d\nu = \int_S g^+ \, d\nu - \int_S g^- \, d\nu = \int_S g^+ f \, d\mu - \int_S g^- f \, d\mu = \int_S (g^+ - g^-) f \, d\mu = \int_S g f \, d\mu \]
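The change of variables formula can be verified exactly in a finite setting, where integrals are finite weighted sums. The sketch below assumes a three-point space with all subsets measurable; the helper `integral` and the particular `mu`, `f`, and `g` are hypothetical.

```python
from fractions import Fraction

def integral(g, m):
    """Integral of g with respect to the measure with point masses m."""
    return sum((g(x) * mass for x, mass in m.items()), Fraction(0))

mu = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
f = {1: Fraction(2), 2: Fraction(0), 3: Fraction(6)}   # nonnegative density
nu = {x: f[x] * mu[x] for x in mu}                     # nu has density f relative to mu

def g(x):
    """A test function taking both signs."""
    return Fraction(x * x - 4)

lhs = integral(g, nu)                          # integral of g with respect to nu
rhs = integral(lambda x: g(x) * f[x], mu)      # integral of g f with respect to mu
print(lhs, rhs)
```

Both sides evaluate to the same rational number, as the theorem predicts.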

In differential notation, the change of variables theorem has the familiar form \( d\nu = f \, d\mu \), and this is really the justification for the derivative notation \( f = d\nu / d\mu \) in the first place. The following result gives the scalar multiple rule for density functions.

Suppose that \( \nu \) is a measure on \( (S, \ms S) \) with \( \nu \ll \mu \) and that \( \nu \) has density function \( f \) with respect to \( \mu \). If \( c \in \R \), then \( c \nu \) has density function \( c f \) with respect to \( \mu \).

Details:

If \( A \in \ms S \) then \( \int_A c f \, d\mu = c \int_A f \, d\mu = c \nu(A) \).

Of course, we already knew that \( \nu \ll \mu \) implies \( c \nu \ll \mu \) for \( c \in \R \), so the new information is the relation between the density functions. In derivative notation, the scalar multiple rule has the familiar form \[ \frac{d(c \nu)}{d\mu} = c \frac{d\nu}{d\mu} \]

The following result gives the sum rule for density functions. Recall that two measures are of the same type if neither takes the value \( \infty \) or if neither takes the value \( -\infty \).

Suppose that \( \nu \) and \( \rho \) are measures on \( (S, \ms S) \) of the same type with \( \nu \ll \mu \) and \( \rho \ll \mu \), and that \( \nu \) and \( \rho \) have density functions \( f \) and \( g \) with respect to \( \mu \), respectively. Then \( \nu + \rho \) has density function \( f + g \) with respect to \( \mu \).

Details:

If \( A \in \ms S \) then \[ \int_A (f + g) \, d\mu = \int_A f \, d\mu + \int_A g \, d\mu = \nu(A) + \rho(A) \] The additive property holds because we know that the integrals in the middle of the displayed equation are not of the form \( \infty - \infty \).

Of course, we already knew that \( \nu \ll \mu \) and \( \rho \ll \mu \) imply \( \nu + \rho \ll \mu \), so the new information is the relation between the density functions. In derivative notation, the sum rule has the familiar form \[ \frac{d(\nu + \rho)}{d\mu} = \frac{d\nu}{d\mu} + \frac{d\rho}{d\mu} \] The following result is the chain rule for density functions.

Suppose that \( \nu \) is a positive measure on \( (S, \ms S) \) with \( \nu \ll \mu \) and that \( \nu \) has density function \( f \) with respect to \( \mu \). Suppose \( \rho \) is a measure on \( (S, \ms S) \) with \( \rho \ll \nu \) and that \( \rho \) has density function \( g \) with respect to \( \nu \). Then \( \rho \) has density function \( g f \) with respect to \( \mu \).

Details:

This is a simple consequence of the change of variables theorem. If \( A \in \ms S \) then \( \rho(A) = \int_A g \, d\nu = \int_A g f \, d\mu \).

Of course, we already knew that \( \nu \ll \mu \) and \( \rho \ll \nu \) imply \( \rho \ll \mu \), so once again the new information is the relation between the density functions. In derivative notation, the chain rule has the familiar form \[ \frac{d\rho}{d\mu} = \frac{d\rho}{d\nu} \frac{d\nu}{d\mu}\] The following related result is the inverse rule for density functions.

Suppose that \( \nu \) is a positive measure on \( (S, \ms S) \) with \( \nu \ll \mu \) and \( \mu \ll \nu \) (so that \( \nu \equiv \mu \)). If \( \nu \) has density function \( f \) with respect to \( \mu \) then \( \mu \) has density function \( 1 / f \) with respect to \( \nu \).

Details:

Let \( f \) be a density function of \( \nu \) with respect to \( \mu \) and let \( Z = \{x \in S: f(x) = 0\} \). Then \( \nu(Z) = \int_Z f \, d\mu = 0 \) so \( Z \) is a null set of \( \nu \) and hence is also a null set of \( \mu \). Thus, we can assume that \( f \ne 0 \) on \( S \). Let \( g \) be a density of \( \mu \) with respect to \( \nu \). Since \( \mu \ll \nu \ll \mu \), it follows from the chain rule that \( f g \) is a density of \( \mu \) with respect to \( \mu \). But of course the constant function \( 1 \) is also a density of \( \mu \) with respect to itself so we have \( f g = 1 \) almost everywhere on \( S \). Thus \( 1 / f \) is a density of \( \mu \) with respect to \( \nu \).

In derivative notation, the inverse rule has the familiar form \[ \frac{d\mu}{d\nu} = \frac{1}{d\nu / d\mu}\]
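The chain rule and the inverse rule can both be checked on a finite set, where a density is just a ratio of point masses. The sketch below assumes a three-point space with all subsets measurable and a strictly positive density `f`, so that \( \nu \equiv \mu \); all names and values are hypothetical.

```python
from fractions import Fraction

mu = {1: Fraction(1), 2: Fraction(2), 3: Fraction(4)}
f = {1: Fraction(3), 2: Fraction(1, 2), 3: Fraction(1, 4)}   # density of nu relative to mu
g = {1: Fraction(-1), 2: Fraction(5), 3: Fraction(2)}        # density of rho relative to nu

nu = {x: f[x] * mu[x] for x in mu}
rho = {x: g[x] * nu[x] for x in mu}

# Chain rule: rho should have density g * f relative to mu.
chain_density = {x: g[x] * f[x] for x in mu}
rho_via_mu = {x: chain_density[x] * mu[x] for x in mu}

# Inverse rule: mu should have density 1 / f relative to nu.
mu_via_nu = {x: (1 / f[x]) * nu[x] for x in mu}

print(rho_via_mu == rho)
print(mu_via_nu == mu)
```

Exact rational arithmetic is used so that the equalities can be tested without rounding error.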

Examples and Special Cases

Discrete Spaces

Recall that a discrete measure space \((S, \ms S, \#)\) consists of a countable set \( S \) with the \(\sigma\)-algebra \( \ms S = \ms P(S) \) of all subsets of \( S \), and with counting measure \( \# \). Of course \( \# \) is a positive measure and is trivially \( \sigma \)-finite since \( S \) is countable. Note also that \( \emptyset \) is the only set that is null for \( \# \). If \( \nu \) is a measure on \( S \), then by definition, \( \nu(\emptyset) = 0 \), so \( \nu \) is absolutely continuous relative to \( \# \). Thus, by the Radon-Nikodym theorem, \( \nu \) can be written in the form \[ \nu(A) = \sum_{x \in A} f(x), \quad A \subseteq S \] for a unique \( f: S \to \R \). Of course, this is obvious by a direct argument. If we define \( f(x) = \nu\{x\} \) for \( x \in S \) then the displayed equation follows by the countable additivity of \( \nu \).

Spaces Generated by Countable Partitions

We can generalize the last discussion to spaces generated by countable partitions. Suppose that \( S \) is a set and that \( \ms A = \{A_i: i \in I\} \) is a countable partition of \( S \) into nonempty sets. Let \( \ms S = \sigma(\ms A) \) and recall that every \( A \in \ms S \) has a unique representation of the form \( A = \bigcup_{j \in J} A_j \) where \( J \subseteq I \). Suppose now that \( \mu \) is a positive measure on \( \ms S \) with \( 0 \lt \mu(A_i) \lt \infty \) for every \( i \in I \). Then once again, the measure space \( (S, \ms S, \mu) \) is \( \sigma \)-finite and \( \emptyset \) is the only null set. Hence if \( \nu \) is a measure on \( (S, \ms S) \) then \( \nu \) is absolutely continuous with respect to \( \mu \) and hence has unique density function \( f \) with respect to \( \mu \): \[ \nu(A) = \int_A f \, d\mu, \quad A \in \ms S \] Once again, we can construct the density function explicitly.

In the setting above, define \( f: S \to \R \) by \( f(x) = \nu(A_i) / \mu(A_i) \) for \( x \in A_i \) and \( i \in I \). Then \( f \) is the density of \( \nu \) with respect to \( \mu \).

Details:

Suppose that \( A \in \ms S \) so that \( A = \bigcup_{j \in J} A_j \) for some \( J \subseteq I \). Then \[ \int_A f \, d\mu = \sum_{j \in J} \int_{A_j} f \, d\mu = \sum_{j \in J} \frac{\nu(A_j)}{\mu(A_j)} \mu(A_j) = \sum_{j \in J} \nu(A_j) = \nu(A) \]
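The computation in the proof can be mirrored in a small sketch. It assumes a hypothetical partition of \( S = \{0, 1, \ldots, 5\} \) into three blocks, with `mu_blocks` and `nu_blocks` giving the values of the two measures on the blocks; none of these names or values come from the text.

```python
from fractions import Fraction

# Hypothetical partition of S = {0, ..., 5} into three nonempty blocks.
blocks = {"A1": {0, 1}, "A2": {2}, "A3": {3, 4, 5}}
mu_blocks = {"A1": Fraction(2), "A2": Fraction(1), "A3": Fraction(3)}   # 0 < mu(A_i) < infinity
nu_blocks = {"A1": Fraction(1), "A2": Fraction(-4), "A3": Fraction(0)}

def density(x):
    """f(x) = nu(A_i) / mu(A_i) for x in the block A_i."""
    for name, block in blocks.items():
        if x in block:
            return nu_blocks[name] / mu_blocks[name]
    raise ValueError("x is not in S")

def integral_of_density(J):
    """Integral of f over the union of the blocks named in J.

    Since f is constant on each block, the integral over a block is the
    value of f at any point of the block times the mu-measure of the block.
    """
    total = Fraction(0)
    for j in J:
        x = next(iter(blocks[j]))     # any representative point of the block
        total += density(x) * mu_blocks[j]
    return total

print(density(0), density(2), density(4))
print(integral_of_density({"A1", "A2"}), nu_blocks["A1"] + nu_blocks["A2"])
```

The density is constant on each block, which is exactly what measurability with respect to \( \sigma(\ms A) \) requires.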

Often positive measure spaces that occur in applications can be decomposed into spaces generated by countable partitions. In the section on martingale convergence, we show that more general density functions can be obtained as limits of density functions of the type in the previous result.

Counterexamples

The essential uniqueness of density functions can fail if the underlying positive measure \( \mu \) is not \( \sigma \)-finite. Here is a trivial counterexample:

Suppose that \( S \) is a nonempty set and that \( \ms S = \{S, \emptyset\} \) is the trivial \( \sigma \)-algebra. Define the positive measure \( \mu \) on \( (S, \ms S) \) by \( \mu(\emptyset) = 0 \), \( \mu(S) = \infty \). Let \( \nu_c \) denote the measure on \( (S, \ms S) \) with constant density function \( c \in \R \) with respect to \( \mu \).

\( (S, \ms S, \mu) \) is not \( \sigma \)-finite.
\( \nu_c = \mu \) for every \( c \in (0, \infty) \).

The Radon-Nikodym theorem can fail if the measure \( \mu \) is not \( \sigma \)-finite, even if \( \nu \) is finite. Here is a standard counterexample:

Suppose that \( S \) is an uncountable set and \( \ms S \) is the \( \sigma \)-algebra of countable and co-countable sets: \[\ms S = \{A \subseteq S: A \text{ is countable or } A^c \text{ is countable} \} \] As usual, let \( \# \) denote counting measure on \( \ms S \), and define \( \nu \) on \( \ms S \) by \( \nu(A) = 0 \) if \( A \) is countable and \( \nu(A) = 1 \) if \( A^c \) is countable. Then

\( (S, \ms S, \#) \) is not \( \sigma \)-finite.
\( \nu \) is a finite, positive measure on \( (S, \ms S) \).
\( \nu \) is absolutely continuous with respect to \( \# \).
\( \nu \) does not have a density function with respect to \( \# \).

Details:

Recall that a countable union of countable sets is countable, and so \( S \) cannot be written as such a union.
Note that \( \nu(\emptyset) = 0 \). Suppose that \( \{A_i: i \in I\} \) is a countable, disjoint collection of sets in \( \ms S \). If \( A_i \) is countable for every \( i \in I \) then \( \bigcup_{i \in I} A_i \) is countable. Hence \( \nu\left(\bigcup_{i \in I} A_i\right) = 0 \) and \( \nu(A_i) = 0 \) for every \( i \in I \). Next suppose that \( A_j^c \) and \( A_k^c \) are countable for distinct \( j, \; k \in I \). Since \( A_j \cap A_k = \emptyset \), we have \( A_j^c \cup A_k^c = S \). But then \( S \) would be countable, which is a contradiction. Hence it is only possible to have \( A_j^c \) countable for a single \( j \in I \). In this case, \( \nu(A_j) = 1 \) and \( \nu(A_i) = 0 \) for \( i \ne j \). But also \( \left(\bigcup_{i \in I} A_i\right)^c = \bigcap_{i \in I} A_i^c \) is countable, so \( \nu\left(\bigcup_{i \in I} A_i\right) = 1 \). Hence in all cases, \( \nu\left(\bigcup_{i \in I} A_i \right) = \sum_{i \in I} \nu(A_i) \) so \( \nu \) is a measure on \( (S, \ms S) \). It is clearly positive and finite.
Recall that any measure is absolutely continuous with respect to counting measure, since \( \#(A) = 0 \) if and only if \( A = \emptyset \).
Suppose that \( \nu \) has density function \( f \) with respect to \( \# \). Then \(0 = \nu\{x\} = \int_{\{x\}} f \, d\# = f(x) \) for every \( x \in S \). But then \( \nu(S) = \int_S f \, d\# = 0 \), which is a contradiction.
