\( \renewcommand{\P}{\mathbb{P}} \) \( \newcommand{\E}{\mathbb{E}} \) \( \newcommand{\R}{\mathbb{R}} \) \( \newcommand{\Q}{\mathbb{Q}} \) \( \newcommand{\N}{\mathbb{N}} \) \( \newcommand{\bs}{\boldsymbol} \) \( \newcommand{\ms}{\mathscr} \) \( \newcommand{\range}{\text{range}} \)

16. Properties of the Integral

Basic Theory

Again our starting point is a measure space \( (S, \ms S , \mu) \), so that \( S \) is a set, \( \ms S \) is a \( \sigma \)-algebra of subsets of \( S \), and \( \mu \) is a positive measure on \( \ms S \).

Definition

In the last section we defined the integral of certain measurable functions \( f: S \to \R \) with respect to the measure \( \mu \). Recall that the integral, denoted \( \int_S f \, d\mu \), may exist as a number in \( \R \cup \{-\infty, \infty\} \) (in which case \( f \) is integrable), and in particular may exist as a number in \(\R\) (in which case \( f \) is absolutely integrable). The integral may also fail to exist. Here is a review of how the definition is built up in stages:

Definition of the integral

  1. If \( f \) is a nonnegative simple function, so that \( f = \sum_{i \in I} a_i \bs{1}_{A_i} \) where \( I \) is a finite index set, \( a_i \in [0, \infty) \) for \( i \in I \), and \( \{A_i: i \in I\} \) is a measurable partition of \( S \), then \[ \int_S f \, d\mu = \sum_{i \in I} a_i \mu(A_i) \]
  2. If \( f: S \to [0, \infty) \) is measurable, then \[ \int_S f \, d\mu = \sup\left\{\int_S g \, d\mu: g \text{ is simple and } 0 \le g \le f\right\} \]
  3. If \( f: S \to \R \) is measurable, then \[ \int_S f \, d\mu = \int_S f^+ \, d\mu - \int_S f^- \, d\mu \] as long as the right side is not of the form \( \infty - \infty \), and where \( f^+ \) and \( f^- \) denote the positive and negative parts of \( f \).
  4. If \( f:S \to \R \) is measurable and \( A \in \ms S \), then the integral of \( f \) over \( A \) is defined by \[ \int_A f \, d\mu = \int_S \bs{1}_A f \, d\mu \] assuming that the integral on the right exists.

So \(f: S \to \R\) is integrable if it is measurable and either \(\int_S f^+ \, d\mu \lt \infty\) or \(\int_S f^- \, d \mu \lt \infty\). If both are finite, equivalently \(\int_S |f| \, d\mu \lt \infty\), then \(f\) is absolutely integrable.
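The first and third stages of the definition can be sketched concretely for a simple function. The following is a minimal illustration, assuming a hypothetical partition \( \{A_1, A_2, A_3\} \) of \( S \) with made-up measures; the function names `integrate_simple` and `integrate` are chosen here for illustration only.

```python
from fractions import Fraction

def integrate_simple(values, measure):
    """Integral of a nonnegative simple function f = sum_i a_i 1_{A_i}.

    values  : dict mapping a label for the block A_i to the value a_i >= 0
    measure : dict mapping the same label to mu(A_i)
    """
    return sum(values[label] * measure[label] for label in values)

def integrate(values, measure):
    """Integral of a real-valued simple function as int f+ minus int f-."""
    positive_part = {k: max(v, 0) for k, v in values.items()}
    negative_part = {k: max(-v, 0) for k, v in values.items()}
    return integrate_simple(positive_part, measure) - integrate_simple(negative_part, measure)

# Hypothetical measurable partition of S with measures 1/2, 1/3, 1/6 (assumed values)
mu = {"A1": Fraction(1, 2), "A2": Fraction(1, 3), "A3": Fraction(1, 6)}
# f takes the value 2 on A1, -3 on A2, and 6 on A3
f = {"A1": Fraction(2), "A2": Fraction(-3), "A3": Fraction(6)}

total = integrate(f, mu)  # (2 * 1/2 + 6 * 1/6) - (3 * 1/3)
print(total)
```

Since both the positive and negative parts have finite integrals here, this \( f \) is absolutely integrable in the sense just described.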

Consider a statement on the elements of \(S \), for example an equation or an inequality with \( x \in S \) as a free variable. (Technically such a statement is a predicate on \( S \).) For \( A \in \ms S \), we say that the statement holds on \( A \) if it is true for every \( x \in A \). We say that the statement holds almost everywhere on \( A \) (with respect to \( \mu \)) if there exists \( B \in \ms S \) with \( B \subseteq A \) such that the statement holds on \( B \) and \( \mu(A \setminus B) = 0 \).

Basic Properties

A few properties of the integral that were essential to the motivation of the definition were given in the last section. In this section, we extend some of those properties and we study a number of new ones. As a review, here is what we know so far.

Properties of the integral

  1. If \( f, \, g: S \to \R \) are integrable, then \( \int_S (f + g) \, d\mu = \int_S f \, d\mu + \int_S g \, d\mu \) as long as the right side is not of the form \( \infty - \infty \).
  2. If \( f: S \to \R \) is integrable and \( c \in \R \), then \( \int_S c f \, d\mu = c \int_S f \, d\mu \).
  3. If \( f: S \to [0, \infty) \) is measurable then \(f\) is integrable and \( \int_S f \, d\mu \ge 0 \).
  4. If \( f, \, g: S \to \R \) are integrable and \( f \le g \) on \( S \) then \( \int_S f \, d\mu \le \int_S g \, d\mu \)
  5. If \( f_n: S \to [0, \infty) \) is measurable for \( n \in \N_+ \) and \( f_n \) is increasing in \( n \) on \( S \) then \(\int_S \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \).
  6. If \( f: S \to \R \) is integrable on \( A \cup B \), where \( A, \, B \in \ms S \) are disjoint, then \( \int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu \).

Parts (a) and (b) are the linearity properties; part (a) is the additivity property and part (b) is the scaling property. Parts (c) and (d) are the order properties; part (c) is the positive property and part (d) is the increasing property. Part (e) is a continuity property known as the monotone convergence theorem. Part (f) is the additive property for disjoint domains. Properties (a)–(e) hold with \( S \) replaced by \( A \in \ms S \).

Equality and Order

Our first new results are extensions dealing with equality and order. The integral of a function over a null set is 0:

Suppose that \( f: S \to \R \) is measurable and \( A \in \ms S \) with \( \mu(A) = 0 \). Then \( \int_A f \, d\mu = 0 \).

Details:

The proof proceeds in stages via the definition of the integral.

  1. Suppose that \( g \) is a nonnegative simple function with \( g = 0 \) on \( A^c \). Then \( g \) has the representation \( g = \sum_{i \in I} a_i \bs{1}_{A_i} \) where \( a_i \in (0, \infty) \) and \( A_i \subseteq A \) for \( i \in I \). But \( \mu(A_i) = 0 \) for each \( i \in I\) and so \( \int_S g \, d\mu = \sum_{i \in I} a_i \mu(A_i) = 0 \).
  2. Suppose that \( f: S \to [0, \infty) \) is measurable. If \( g \) is a nonnegative simple function with \( g \le \bs{1}_A f \), then \( g = 0 \) on \( A^c \) so by (a), \( \int_S g \, d\mu = 0 \). Hence by part (b) of the definition, \( \int_A f \, d\mu = \int_S \bs{1}_A f \, d\mu = 0 \).
  3. Finally, suppose that \( f: S \to \R \) is measurable. Then \( \int_A f \, d\mu = \int_A f^+ \, d\mu - \int_A f^- \, d\mu \). But both integrals on the right are 0 by part (b).

Two functions that are indistinguishable from the point of view of \( \mu \) must have the same integral.

Suppose that \( f: S \to \R \) is integrable. If \( g: S \to \R \) is measurable and \( g = f \) almost everywhere on \( S \), then \( \int_S g \, d\mu = \int_S f \, d\mu \).

Details:

Note that \( g = f \) if and only if \( g^+ = f^+ \) and \( g^- = f^- \). Let \( A = \{x \in S: g^+(x) = f^+(x)\} \). Then \( A \in \ms S \) and \( \mu(A^c) = 0 \). Hence by the additivity property and the null property above, \[ \int_S g^+ \, d\mu = \int_A g^+ \, d\mu + \int_{A^c} g^+ \, d\mu = \int_A f^+ \, d\mu + 0 = \int_A f^+ \, d\mu + \int_{A^c} f^+ \, d\mu = \int_S f^+ \, d\mu \] Similarly \( \int_S g^- \, d\mu = \int_S f^- \, d\mu\). Hence the integral of \( g \) exists and \( \int_S g \, d\mu = \int_S f \, d\mu \).
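To see the null property and almost-everywhere equality at work, here is a small sketch on a finite set, where the integral reduces to a weighted sum. The set \( S = \{0, 1, 2, 3\} \), the point masses, and the functions `f` and `g` are all assumptions made up for the illustration.

```python
from fractions import Fraction

def integral(func, mu):
    """Integral of func over a finite set S, where mu[x] is the mass of the point x."""
    return sum(func(x) * mass for x, mass in mu.items())

# Assumed measure on S = {0, 1, 2, 3}; the point 3 is a null set
mu = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(0)}

def f(x):
    return x * x

def g(x):
    # g differs from f only on the null set {3}
    return -100 if x == 3 else x * x

int_f = integral(f, mu)
int_g = integral(g, mu)
print(int_f, int_g)
```

The two integrals agree even though \( g(3) \ne f(3) \), because the point where the functions differ carries no mass.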

Next we have a simple extension of the positive property.

Suppose that \( f: S \to \R \) is measurable and \( f \ge 0 \) almost everywhere on \( S \). Then

  1. \( \int_S f \, d\mu \ge 0 \)
  2. \( \int_S f \, d\mu = 0 \) if and only if \( f = 0 \) almost everywhere on \( S \).
Details:
  1. Let \( A = \{x \in S: f(x) \ge 0\} \). Then \( A \in \ms S \) and \( \mu(A^c) = 0 \). By the additivity of the integral over disjoint sets we have \[ \int_S f \, d\mu = \int_A f \, d\mu + \int_{A^c} f \, d\mu \] But \( \int_A f \, d\mu \ge 0 \) by the positive property and \( \int_{A^c} f \, d\mu = 0 \) by the null property, so \( \int_S f \, d\mu \ge 0 \).
  2. Note first that if \( f = 0 \) almost everywhere on \( S \), then \( \int_S f \, d\mu = \int_S 0 \, d\mu = 0 \) by the previous result on functions that agree almost everywhere. For the converse, let \( B_n = \left\{x \in S: f(x) \ge \frac{1}{n}\right\} \) for \( n \in \N_+ \) and \( B = \{x \in S: f(x) \gt 0\} \). Then \( B_n \) is increasing in \( n \) and \( \bigcup_{n=1}^\infty B_n = B \). If \( \mu(B) \gt 0 \) then \( \mu(B_n) \gt 0 \) for some \( n \in \N_+ \). But \( f \ge \frac{1}{n} \bs{1}_{B_n} \) on \( A \), so by the increasing property, \( \int_S f \, d\mu = \int_A f \, d\mu \ge \int_A \frac{1}{n} \bs{1}_{B_n} \, d\mu = \frac{1}{n} \mu(B_n) \gt 0 \). Equivalently, if \( \int_S f \, d\mu = 0 \) then \( \mu(B) = 0 \), and since \( f \ge 0 \) almost everywhere on \( S \), it follows that \( f = 0 \) almost everywhere on \( S \).

So, if \( f \ge 0 \) almost everywhere on \( S \) then \( \int_S f \, d\mu \gt 0 \) if and only if \( \mu\{x \in S: f(x) \gt 0\} \gt 0 \). The simple extension of the positive property in turn leads to a simple extension of the increasing property.

Suppose that \( f, \, g: S \to \R \) are integrable, and that \( f \le g \) almost everywhere on \( S \). Then

  1. \( \int_S f \, d\mu \le \int_S g \, d\mu \)
  2. Except in the case that both integrals are \( \infty \) or both \( -\infty \), \( \int_S f \, d\mu = \int_S g \, d\mu \) if and only if \( f = g \) almost everywhere on \( S \).
Details:
  1. Note that \( g = f + (g - f) \) and \( g - f \ge 0 \) almost everywhere on \( S \). If \( \int_S f \, d\mu = -\infty \) then trivially \( \int_S f \, d\mu \le \int_S g \, d\mu \). Otherwise, by the additive property, \[ \int_S g \, d\mu = \int_S f \, d\mu + \int_S (g - f) \, d\mu \] By the positive property, \(\int_S (g - f) \, d\mu \ge 0 \) so \( \int_S g \, d\mu \ge \int_S f \, d\mu \).
  2. Except in the case that both integrals are \( \infty \) or both are \( -\infty \) we have \[ \int_S g \, d\mu - \int_S f \, d\mu = \int_S (g - f) \, d\mu \] By assumption \( g - f \ge 0 \) almost everywhere on \( S \), and hence by the positive property, the integral on the right is 0 if and only if \( g - f = 0 \) almost everywhere on \( S \).

So if \( f \le g \) almost everywhere on \( S \) then, except in the two cases mentioned, \( \int_S f \, d\mu \lt \int_S g \, d\mu \) if and only if \( \mu\{x \in S: f(x) \lt g(x)\} \gt 0 \). The exclusion when both integrals are \( \infty \) or \( -\infty \) is important. A counterexample when this condition does not hold is given in . The next result is the absolute value inequality.

Suppose that \( f: S \to \R \) is integrable. Then \[ \left| \int_S f \, d\mu \right| \le \int_S \left|f \right| \, d\mu \] If \( f \) is absolutely integrable, then equality holds if and only if \( f \ge 0 \) almost everywhere on \( S \) or \( f \le 0 \) almost everywhere on \( S \).

Details:

First note that \( -\left|f\right| \le f \le \left|f\right| \) on \( S \). The integrals of all three functions exist, so the increasing and scaling properties give \[ -\int_S \left|f\right| \, d\mu \le \int_S f \, d\mu \le \int_S \left|f \right| \, d\mu \] which is equivalent to the inequality above. If \( f \) is absolutely integrable, then by the increasing property, equality holds if and only if \( f = -\left|f\right| \) almost everywhere on \( S \) or \( f = \left|f\right| \) almost everywhere on \( S \). In the first case, \( f \le 0 \) almost everywhere on \( S \) and in the second case, \( f \ge 0 \) almost everywhere on \( S \).

Change of Variables

Suppose that \( (T, \ms T) \) is another measurable space and that \( u: S \to T \) is measurable. By the change of variables theorem for positive measures, \( \nu \) defined by \[ \nu(B) = \mu\left[u^{-1}(B)\right], \quad B \in \ms T \] is a positive measure on \( (T, \ms T) \). The following result is known as the change of variables theorem for integrals:

Suppose that \( f: T \to \R \) is measurable. Then \( f \) is integrable with respect to \(\nu\) if and only if \(f \circ u\) is integrable with respect to \(\mu\), in which case \[ \int_T f \, d\nu = \int_S (f \circ u) \, d\mu \]

Details:

We will show that if either of the integrals exists then they both do, and are equal. The proof is a classical bootstrapping argument that parallels the definition of the integral.

  1. Suppose first that \( f \) is a nonnegative simple function on \( T \) with the representation \( f = \sum_{i \in I} b_i \bs{1}_{B_i} \) where \( I \) is a finite index set, \( \{B_i: i \in I\} \) is a measurable partition of \( T \), and \( b_i \in [0, \infty) \) for \( i \in I \). Recall that \( f \circ u \) is a nonnegative simple function on \( S \), with representation \( f \circ u = \sum_{i \in I} b_i \bs{1}_{u^{-1}(B_i)} \). Hence \[ \int_T f \, d\nu = \sum_{i \in I} b_i \nu(B_i) = \sum_{i \in I} b_i \mu\left[u^{-1}(B_i)\right] = \int_S (f \circ u) \, d\mu \]
  2. Next suppose that \( f: T \to [0, \infty) \) is measurable, so that \( f \circ u: S \to [0, \infty) \) is also measurable. There exists an increasing sequence \( (f_1, f_2, \ldots) \) of nonnegative simple functions on \( T \) with \( f_n \to f \) as \( n \to \infty \). Then \((f_1 \circ u, f_2 \circ u, \ldots)\) is an increasing sequence of simple functions on \( S \) with \( f_n \circ u \to f \circ u\) as \( n \to \infty \). By step (a), \( \int_T f_n \, d\nu = \int_S (f_n \circ u) \, d\mu \) for each \( n \in \N_+ \). But by the monotone convergence theorem, \( \int_T f_n \, d\nu \to \int_T f \, d\nu \) as \( n \to \infty \) and \( \int_S (f_n \circ u) \, d\mu \to \int_S (f \circ u) \, d\mu \) so we conclude that \( \int_T f \, d\nu = \int_S (f \circ u) \, d\mu \)
  3. Finally, suppose that \( f: T \to \R \) is measurable, so that \( f \circ u: S \to \R \) is also measurable. Note that \( (f \circ u)^+ = f^+ \circ u \) and \( (f \circ u)^- = f^- \circ u \). By part (b), \begin{align} \int_T f^+ \, d\nu & = \int_S (f^+ \circ u) \, d\mu = \int_S (f \circ u)^+ \, d\mu \\ \int_T f^- \, d\nu & = \int_S (f^- \circ u) \, d\mu = \int_S (f \circ u)^- \, d\mu \end{align} Assuming that at least one of the integrals in the displayed equations is finite, we have \[ \int_T f \, d\nu = \int_T f^+ \, d\nu - \int_T f^- \, d\nu = \int_S (f \circ u)^+ \, d\mu - \int_S (f \circ u)^- \, d\mu = \int_S (f \circ u) \, d\mu\]

The change of variables theorem will look more familiar if we give the variables explicitly. Thus, suppose that we want to evaluate \[ \int_S f\left[u(x)\right] \, d\mu(x) \] where again, \( u: S \to T \) and \( f: T \to \R \). One way is to use the substitution \( u = u(x) \), find the new measure \( \nu \), and then evaluate \[ \int_T f(u) \, d\nu(u) \]
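The substitution recipe can be checked by hand on a finite space, where \( \nu \) is obtained by adding up the mass of each preimage. The sketch below assumes a uniform measure on \( S = \{-2, -1, 0, 1, 2\} \), the map \( u(x) = x^2 \), and the integrand \( f(t) = t + 1 \); these choices and the helper names are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(mu, u):
    """Return nu with nu({t}) = mu(u^{-1}{t}) for a measure mu given by point masses."""
    nu = defaultdict(Fraction)  # Fraction() is 0, so missing points start with mass 0
    for x, mass in mu.items():
        nu[u(x)] += mass
    return dict(nu)

def integral(func, measure):
    """Integral of func with respect to a measure given by point masses."""
    return sum(func(point) * mass for point, mass in measure.items())

# Assumed setup: uniform measure on S = {-2, ..., 2}, u(x) = x^2, f(t) = t + 1
mu = {x: Fraction(1, 5) for x in range(-2, 3)}

def u(x):
    return x * x

def f(t):
    return t + 1

nu = pushforward(mu, u)                      # measure on T = {0, 1, 4}
lhs = integral(f, nu)                        # integral of f over T with respect to nu
rhs = integral(lambda x: f(u(x)), mu)        # integral of f composed with u over S
print(nu, lhs, rhs)
```

Both sides evaluate to the same number, as the theorem predicts; the left side sums over three points of \( T \) while the right side sums over five points of \( S \).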

Convergence Properties

We start with a simple but important corollary of the monotone convergence theorem that extends the additivity property to a countably infinite sum of nonnegative functions.

Suppose that \( f_n: S \to [0, \infty) \) is measurable for \( n \in \N_+ \). Then \[ \int_S \sum_{n=1}^\infty f_n \, d\mu = \sum_{n=1}^\infty \int_S f_n \, d\mu \]

Details:

Let \( g_n = \sum_{i=1}^n f_i \) for \( n \in \N_+ \). Then \( g_n: S \to [0, \infty) \) is measurable and \( g_n \) is increasing in \( n \). Moreover, by definition, \( g_n \to \sum_{i=1}^\infty f_i \) as \( n \to \infty \). Hence by the monotone convergence theorem, \( \int_S g_n \, d\mu \to \int_S \sum_{i=1}^\infty f_i \, d\mu \) as \( n \to \infty \). But we know the additivity property holds for finite sums, so \(\int_S g_n \, d\mu = \sum_{i=1}^n \int_S f_i \, d\mu\) and again, by definition, this sum converges to \(\sum_{i=1}^\infty \int_S f_i \, d\mu\) as \( n \to \infty \).

A corollary of the dominated convergence theorem below gives a related result that relaxes the assumption that the functions \( f_n \) be nonnegative, but imposes a stricter integrability requirement. Our next result is the additivity of the integral over a countably infinite collection of disjoint domains.

Suppose that \( f: S \to \R \) is integrable, and that \( \{A_n: n \in \N_+\} \) is a disjoint collection of sets in \( \ms S \). Let \( A = \bigcup_{n=1}^\infty A_n \). Then \[ \int_A f \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu \]

Details:

Suppose first that \( f \) is nonnegative. Note that \( \bs{1}_A = \sum_{n=1}^\infty \bs{1}_{A_n} \) and hence \( \bs{1}_A f = \sum_{n=1}^\infty \bs{1}_{A_n} f \). Thus from the previous result, \[ \int_A f \, d\mu = \int_S \bs{1}_A f \, d\mu = \int_S \sum_{n=1}^\infty \bs{1}_{A_n} f \, d\mu = \sum_{n=1}^\infty \int_S \bs{1}_{A_n} f \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu \] Suppose now that \( f: S \to \R \) is measurable and \( \int_S f \, d\mu \) exists. Note that for \( B \in \ms S \), \( \left(\bs{1}_B f\right)^+ = \bs{1}_B f^+ \) and \( \left(\bs{1}_B f\right)^- = \bs{1}_B f^- \). Hence from the previous argument, \[ \int_A f^+ \, d\mu = \sum_{n=1}^\infty \int_{A_n} f^+ \, d\mu, \quad \int_A f^- \, d\mu = \sum_{n=1}^\infty \int_{A_n} f^- \, d\mu \] Both of these are sums of nonnegative terms, and one of the sums, at least, is finite. Hence we can group the terms to get \[ \int_A f \, d\mu = \int_A f^+ \, d\mu - \int_A f^- \, d\mu = \sum_{n=1}^\infty \int_{A_n} (f^+ - f^-) \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu \]

Of course, the previous theorem applies if \( f \) is nonnegative or if \( f \) is absolutely integrable. Next we give a minor extension of the monotone convergence theorem that relaxes the assumption that the functions be nonnegative.

Monotone Convergence Theorem. Suppose that \( f_n: S \to \R \) is integrable for each \( n \in \N_+ \) and that \( f_n \) is increasing in \( n \) on \( S \). If \( \int_S f_1 \, d\mu \gt -\infty \) then \[ \int_S \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \]

Details:

Let \( f(x) = \lim_{n \to \infty} f_n(x) \) for \( x \in S \) which exists in \( \R \cup \{\infty\} \) since \( f_n(x) \) is increasing in \( n \in \N_+ \). If \( \int_S f_1 \, d\mu = \infty \), then by the increasing property, \( \int_S f_n \, d\mu = \infty \) for all \( n \in \N_+ \) and \( \int_S f \, d\mu = \infty \), so the conclusion of the monotone convergence theorem trivially holds. Thus suppose that \( f_1 \) is absolutely integrable. Let \( g_n = f_n - f_1 \) for \( n \in \N_+ \) and let \( g = f - f_1 \). Then \( g_n \) is nonnegative and increasing in \( n \) on \( S \), and \( g_n \to g \) as \( n \to \infty \) on \( S \). By the ordinary monotone convergence theorem, \( \int_S g_n \, d\mu \to \int_S g \, d\mu \) as \( n \to \infty \). But since \( \int_S f_1 \, d\mu \) is finite, \( \int_S g_n \, d\mu = \int_S f_n \, d\mu - \int_S f_1 \, d\mu \) and \( \int_S g \, d\mu = \int_S f \, d\mu - \int_S f_1 \, d\mu \). Again since \( \int_S f_1 \, d\mu \) is finite, it follows that \( \int_S f_n \, d\mu \to \int_S f \, d\mu \) as \( n \to \infty \).

Here is the complementary result for decreasing functions.

Suppose that \( f_n: S \to \R \) is integrable for each \( n \in \N_+ \) and that \( f_n \) is decreasing in \( n \) on \( S \). If \( \int_S f_1 \, d\mu \lt \infty \) then \[ \int_S \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \]

Details:

The functions \( -f_n \) for \( n \in \N_+ \) satisfy the hypotheses of the monotone convergence theorem for increasing functions and hence \(\int_S \lim_{n \to \infty} -f_n \, d\mu = \lim_{n \to \infty} -\int_S f_n \, d\mu \). By the scaling property, \( \int_S \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \).

The additional assumptions on the integral of \( f_1 \) in the last two extensions of the monotone convergence theorem are necessary. A counterexample is given in .

Our next result is also a consequence of the monotone convergence theorem, and is called Fatou's lemma in honor of Pierre Fatou. Its usefulness stems from the fact that no assumptions are placed on the integrand functions, except that they be nonnegative and measurable.

Fatou's Lemma. Suppose that \( f_n: S \to [0, \infty) \) is measurable for \( n \in \N_+ \). Then \[ \int_S \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int_S f_n \, d\mu \]

Details:

Let \( g_n = \inf\left\{f_k: k \in \{n, n + 1, \ldots \}\right\} \) for \( n \in \N_+ \). Then \( g_n: S \to [0, \infty) \) is measurable for \( n \in \N_+ \), \( g_n \) is increasing in \( n \), and by definition, \( \lim_{n \to \infty} g_n = \liminf_{n \to \infty} f_n \). By the monotone convergence theorem, \[ \int_S \liminf_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S g_n \, d\mu \] But \( g_n \le f_k \) on \( S \) for \( n \in \N_+ \) and \( k \in \{n, n + 1, \ldots\} \) so by the increasing property, \( \int_S g_n \, d\mu \le \int_S f_k \, d\mu\) for \( n \in \N_+ \) and \( k \in \{n, n + 1, \ldots\} \). Hence \( \int_S g_n \, d\mu \le \inf\left\{\int_S f_k \, d\mu: k \in \{n, n+1, \ldots\}\right\} \) for \( n \in \N_+ \) and therefore \[ \lim_{n \to \infty} \int_S g_n \, d\mu \le \liminf_{n \to \infty} \int_S f_n \, d\mu \]

Given the weakness of the hypotheses, it's hardly surprising that strict inequality can easily occur in Fatou's lemma. An example is given in .
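As an illustration of strict inequality, consider a standard example that is not necessarily the one the text refers to: on \( (0, 1) \) with Lebesgue measure, let \( f_n = n \bs{1}_{(0, 1/n)} \). Each \( f_n \) is a simple function with integral \( n \cdot \frac{1}{n} = 1 \), yet \( f_n(x) \to 0 \) for every \( x \), so the left side of Fatou's lemma is 0 while the right side is 1. The sketch below checks this with exact arithmetic; the pointwise check is only a spot check at a few assumed sample points.

```python
from fractions import Fraction

def f(n, x):
    """f_n(x) = n on the open interval (0, 1/n), and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Lebesgue integral of the simple function f_n: value n times the length 1/n."""
    return n * Fraction(1, n)

# For fixed x in (0, 1), f_n(x) = 0 as soon as n >= 1/x, so liminf f_n(x) = 0.
sample_points = [Fraction(1, k) for k in (2, 10, 1000)]
tails_vanish = all(
    f(n, x) == 0
    for x in sample_points
    for n in range(int(1 / x), int(1 / x) + 50)
)

# The integrals do not shrink: every one of them equals 1.
integrals = [integral_f(n) for n in range(1, 101)]
print(tails_vanish, set(integrals))
```

The same sequence has no absolutely integrable dominating function, which is consistent with the limit and the integral failing to commute here.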

Our next convergence result is one of the most important and is known as the dominated convergence theorem. It's sometimes also known as Lebesgue's dominated convergence theorem in honor of Henri Lebesgue, who first developed all of this stuff in the context of the Euclidean measure space \( (\R^n, \ms R^n, \lambda^n) \). The dominated convergence theorem gives a basic condition under which we may interchange the limit and integration operators.

Dominated Convergence Theorem. Suppose that \( f_n: S \to \R \) is measurable for \( n \in \N_+ \) and that \( \lim_{n \to \infty} f_n \) exists on \( S \). Suppose also that \( \left|f_n\right| \le g \) for \( n \in \N_+ \) where \( g: S \to [0, \infty) \) is absolutely integrable. Then \[ \int_S \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_S f_n \, d\mu \]

Details:

First note that by the increasing property, \( \int_S \left|f_n\right| \, d\mu \le \int_S g \, d\mu \lt \infty \) and hence \( f_n \) is absolutely integrable for \( n \in \N_+ \). Let \( f = \lim_{n \to \infty} f_n \). Then \( f \) is measurable and \( \left| f \right| \le g \), so by the increasing property again, \( \int_S \left| f \right| \, d\mu \le \int_S g \, d\mu \lt \infty \), and \( f \) is absolutely integrable.

Now for \( n \in \N_+ \), let \( u_n = \inf\left\{f_k: k \in \{n, n + 1, \ldots\}\right\} \) and let \( v_n = \sup\left\{f_k: k \in \{n, n + 1, \ldots\}\right\} \). Then \( u_n \le f_n \le v_n \) for \( n \in \N_+ \), \( u_n \) is increasing in \( n \), \( v_n \) is decreasing in \( n \), and \( u_n \to f \) and \( v_n \to f \) as \( n \to \infty \). Moreover, \( \int_S u_1 \, d\mu \ge - \int_S g \, d\mu \gt -\infty \) so by the version of the monotone convergence theorem above, \( \int_S u_n \, d\mu \to \int_S f \, d\mu \) as \( n \to \infty \). Similarly, \( \int_S v_1 \, d\mu \le \int_S g \, d\mu \lt \infty \), so by the monotone convergence theorem for decreasing functions, \( \int_S v_n \, d\mu \to \int_S f \, d\mu \) as \( n \to \infty \). But by the increasing property, \( \int_S u_n \, d\mu \le \int_S f_n \, d\mu \le \int_S v_n \, d\mu \) for \( n \in \N_+ \) so by the squeeze theorem for limits, \( \int_S f_n \, d\mu \to \int_S f \, d\mu \) as \( n \to \infty \).

As you might guess, the assumption that \( \left| f_n \right| \) is uniformly bounded in \( n \) by an absolutely integrable function is critical. A counterexample when this assumption is missing is given in . The dominated convergence theorem remains true if \( \lim_{n \to \infty} f_n \) exists almost everywhere on \( S \). The following corollary of the dominated convergence theorem gives a condition for the interchange of infinite sum and integral.

Suppose that \( f_i: S \to \R \) is measurable for \( i \in \N_+ \) and that \( \sum_{i=1}^\infty \left| f_i \right| \) is absolutely integrable. Then \[ \int_S \sum_{i=1}^\infty f_i \, d\mu = \sum_{i=1}^\infty \int_S f_i \, d\mu \]

Details:

The assumption that \( g = \sum_{i = 1}^\infty \left| f_i \right| \) is absolutely integrable implies that \( g \lt \infty \) almost everywhere on \( S \). In turn, this means that \( \sum_{i=1}^\infty f_i \) is absolutely convergent almost everywhere on \( S \). Let \( f(x) = \sum_{i=1}^\infty f_i(x) \) if \( g(x) \lt \infty \), and for completeness, let \( f(x) = 0 \) if \( g(x) = \infty \). Since only the integral of \( f \) appears in the theorem, it doesn't matter how we define \( f \) on the null set where \( g = \infty \). Now let \( g_n = \sum_{i=1}^n f_i \). Then \( g_n \to f \) as \( n \to \infty \) almost everywhere on \( S \) and \( \left| g_n \right| \le g \) on \( S \). Hence by the dominated convergence theorem, \( \int_S g_n \, d\mu \to \int_S f \, d\mu \) as \( n \to \infty \). But we know the additivity property holds for finite sums, so \( \int_S g_n \, d\mu = \sum_{i=1}^n \int_S f_i \, d\mu \), and in turn this converges to \( \sum_{i=1}^\infty \int_S f_i \, d\mu \) as \( n \to \infty \). Thus we have \( \sum_{i=1}^\infty \int_S f_i \, d\mu = \int_S f \, d\mu \).

The following corollary of the dominated convergence theorem is known as the bounded convergence theorem.

Bounded Convergence Theorem. Suppose that \( f_n: S \to \R \) is measurable for \( n \in \N_+ \) and there exists \( A \in \ms S \) such that \( \mu(A) \lt \infty \), \( \lim_{n \to \infty} f_n \) exists on \( A \), and \( \left| f_n \right| \) is bounded in \( n \in \N_+ \) on \( A \). Then \[ \int_A \lim_{n \to \infty} f_n \, d\mu = \lim_{n \to \infty} \int_A f_n \, d\mu \]

Details:

Suppose that \( \left|f_n\right| \) is bounded in \( n \) on \( A \) by \( c \in (0, \infty) \). The constant \( c \) is integrable on \( A \) since \( \int_A c \, d\mu = c \mu(A) \lt \infty \), and \( \left|f_n\right| \le c \) on \( A \) for \( n \in \N_+ \). Thus the result follows from the dominated convergence theorem.

Again, the bounded convergence theorem remains true if \( \lim_{n \to \infty} f_n \) exists almost everywhere on \( A \). For a finite measure space (and in particular for a probability space), the condition that \( \mu(A) \lt \infty \) automatically holds.

Product Spaces

Suppose now that \( (S, \ms S, \mu) \) and \( (T, \ms T, \nu) \) are \( \sigma \)-finite measure spaces. Recall the basic facts about the product \( \sigma \)-algebra \( \ms S \times \ms T \) of subsets of \( S \times T \), and the product measure \( \mu \times \nu \) on \( \ms S \times \ms T \). The product measure space \( (S \times T, \ms S \times \ms T, \mu \times \nu) \) is the standard one that we use for product spaces. If \( f: S \times T \to \R \) is measurable, there are three integrals we might consider. First, of course, is the integral of \( f \) with respect to the product measure \( \mu \times \nu \) \[ \int_{S \times T} f(x, y) \, d(\mu \times \nu)(x, y) \] sometimes called a double integral in this context. But also we have the nested or iterated integrals where we integrate with respect to one variable at a time: \[ \int_S \left(\int_T f(x, y) \, d\nu(y)\right) \, d\mu(x), \quad \int_T \left(\int_S f(x, y) d\mu(x)\right) \, d\nu(y)\] How are these integrals related? Well, just as in calculus with ordinary Riemann integrals, under mild conditions the three integrals are the same. The resulting important theorem is known as Fubini's Theorem in honor of the Italian mathematician Guido Fubini.

Fubini's Theorem. If \( f: S \times T \to \R \) is integrable with respect to \(\mu \times \nu\) then \[ \int_{S \times T} f(x, y) \, d(\mu \times \nu)(x, y) = \int_S \int_T f(x, y) \, d\nu(y) \, d\mu(x) = \int_T \int_S f(x, y) \, d\mu(x) \, d\nu(y) \]

Details:

We will show that \[ \int_{S \times T} f(x, y) \, d(\mu \times \nu)(x, y) = \int_S \int_T f(x, y) \, d\nu(y) \, d\mu(x) \] The proof with the other iterated integral is symmetric. The proof is a bootstrapping argument that proceeds in stages, paralleling the definition of the integral.

  1. Suppose that \( f = \bs{1}_{A \times B} \) where \( A \in \ms S \) and \( B \in \ms T \). The equation holds by definition of the product measure, since the double integral is \( (\mu \times \nu)(A \times B) \) and the iterated integral is \[ \int_S \int_T \bs{1}_{A \times B} (x, y) \, d\nu(y) \, d\mu(x) = \int_S \int_T \bs{1}_A(x) \bs{1}_B(y) \, d\nu(y) \, d\mu(x) = \int_S \bs{1}_A(x) \nu(B) \, d\mu(x) = \mu(A) \nu(B) \]
  2. Consider \( f = \bs{1}_C \) where \( C \in \ms S \times \ms T \). The double integral is \( (\mu \times \nu)(C) \), and so as a function of \( C \in \ms S \times \ms T \) defines the measure \( \mu \times \nu \). On the other hand, the iterated integral is \[ \int_S \int_T \bs{1}_C(x, y) \, d\nu(y) \, d\mu(x) = \int_S \int_T \bs{1}_{C_x}(y) \, d\nu(y) \, d\mu(x) = \int_S \nu(C_x) \, d\mu(x) \] where \( C_x = \{y \in T: (x, y) \in C\} \) is the cross-section of \( C \) at \( x \in S \). Recall from the section on existence and uniqueness that \( x \mapsto \nu(C_x) \) is a nonnegative, measurable function of \( x \), so \( C \mapsto \int_S \nu(C_x) \, d\mu(x) \) makes sense. Moreover, as a function of \( C \in \ms S \times \ms T \), this integral also forms a measure: If \( \{C^i: i \in I\} \) is a countable, disjoint collection of sets in \( \ms S \times \ms T \), then \( \{C_x^i: i \in I\} \) is a countable, disjoint collection of sets in \( \ms T \). Cross-sections preserve set operations, so if \( C = \bigcup_{i \in I} C^i \) then \( C_x = \bigcup_{i \in I} C_x^i \). By the additivity of the measure \( \nu \) and the integral we have \[ \int_S \nu(C_x) \, d\mu(x) = \int_S \nu\left(\bigcup_{i \in I} C_x^i \right) \, d\mu(x) = \int_S \sum_{i \in I} \nu\left(C_x^i\right) \, d\mu(x) = \sum_{i \in I} \int_S \nu\left(C_x^i\right) \, d\mu(x)\] To summarize, the double integral and the iterated integral define positive measures on \( \ms S \times \ms T \). By (a), these measures agree on the measurable rectangles. By the uniqueness theorem, they must be the same measure. Thus the double integral and the iterated integral agree with integrand \( f = \bs{1}_C \) for every \( C \in \ms S \times \ms T \).
  3. Suppose \( f = \sum_{i \in I} c_i \bs{1}_{C_i} \) is a nonnegative simple function on \( S \times T \). Thus, \( I \) is a finite index set, \( c_i \in [0, \infty) \) for \( i \in I \), and \( \{C_i: i \in I\} \) is a disjoint collection of sets in \( \ms S \times \ms T \). The double integral and the iterated integral satisfy the linearity properties, and hence by (b), agree with integrand \( f \).
  4. Suppose that \( f: S \times T \to [0, \infty) \) is measurable. Then there exists a sequence of nonnegative simple functions \( g_n, \; n \in \N_+ \) such that \( g_n \) is increasing in \( n \in \N_+ \) on \( S \times T \), and \( g_n \to f \) as \( n \to \infty \) on \( S \times T \). By the monotone convergence theorem, \( \int_{S \times T} g_n \, d(\mu \times \nu) \to \int_{S \times T} f \, d(\mu \times \nu) \). But for fixed \( x \in S \), \( y \mapsto g_n(x, y) \) is increasing in \( n \) on \( T \) and has limit \( f(x, y) \) as \( n \to \infty \). By another application of the monotone convergence theorem, \( \int_T g_n(x, y) \, d\nu(y) \to \int_T f(x, y) \, d\nu(y) \) as \( n \to \infty \). But \(x \mapsto \int_T g_n(x, y) \, d\nu(y) \) is measurable and is increasing in \( n \in \N_+ \) on \( S \), so by yet another application of the monotone convergence theorem, \( \int_S \int_T g_n(x, y) \, d\nu(y) \, d\mu(x) \to \int_S \int_T f(x, y) \, d\nu(y) \, d\mu(x) \) as \( n \to \infty \). But the double integral and the iterated integral agree with integrand \( g_n \) by (c) for each \( n \in \N_+ \), so it follows that the double integral and the iterated integral agree with integrand \( f \).
  5. Suppose that \( f: S \times T \to \R \) is measurable. By (d), the double integral and the iterated integral agree with integrand functions \( f^+ \) and \( f^- \). Assuming that at least one of these is finite, then by the additivity property, they agree with integrand function \( f = f^+ - f^- \).

Of course, the double integral exists, and so Fubini's theorem applies, if either \( f \) is nonnegative or absolutely integrable with respect to \( \mu \times \nu \). When \( f \) is nonnegative, the result is sometimes called Tonelli's theorem in honor of another Italian mathematician, Leonida Tonelli. On the other hand, the iterated integrals may exist, and may be different, when \(f\) is not integrable with respect to \(\mu \times \nu\). A counterexample is given in and a second counterexample in .

A special case of Fubini's theorem (and indeed part of the proof) is that we can compute the measure of a set in the product space by integrating the cross-sectional measures.

If \( C \in \ms S \times \ms T \) then \[ (\mu \times \nu)(C) = \int_S \nu\left(C_x\right) \, d\mu(x) = \int_T \mu\left(C^y\right) \, d\nu(y) \] where \( C_x = \{y \in T: (x, y) \in C\} \) for \( x \in S \), and \( C^y = \{x \in S: (x, y) \in C\} \) for \( y \in T \).

In particular, if \( C, \; D \in \ms S \times \ms T \) have the property that \( \nu(C_x) = \nu(D_x) \) for all \( x \in S \), or \( \mu\left(C^y\right) = \mu\left(D^y\right) \) for all \( y \in T \) (that is, \( C \) and \( D \) have the same cross-sectional measures with respect to one of the variables), then \( (\mu \times \nu)(C) = (\mu \times \nu)(D) \). In \( \R^2 \) with area, and in \( \R^3 \) with volume (Lebesgue measure in both cases), this is known as Cavalieri's principle, named for Bonaventura Cavalieri, yet a third Italian mathematician. Clearly, Italian mathematicians cornered the market on theorems of this sort.

A simple corollary of Fubini's theorem is that the double integral of a product function over a product set is the product of the integrals. This result has important applications in probability to random variables that are independent, as we will see in the chapter on probability spaces.

Suppose that \( g: S \to \R \) and \( h: T \to \R \) are measurable, and either both are nonnegative or both are absolutely integrable. Then \[ \int_{S \times T} g(x) h(y) \, d(\mu \times \nu)(x, y) = \left(\int_S g(x) \, d\mu(x)\right) \left(\int_T h(y) \, d\nu(y)\right) \]

Details:

Define \(f: S \times T \to \R\) by \(f(x, y) = g(x) h(y)\) for \((x, y) \in S \times T\). Then \(f\) is measurable. If \(g\) and \(h\) are nonnegative, then so is \(f\) and hence \(f\) is integrable. If \(g\) and \(h\) are absolutely integrable then by Fubini's theorem applied to \(|f|\) we have \begin{align*} \int_{S \times T} |f(x, y)| \, d(\mu \times \nu)(x, y) & = \int_T\left(\int_S |g(x)| |h(y)| \, d\mu(x)\right) \, d\nu(y) = \int_T |h(y)| \left(\int_S |g(x)| \, d\mu(x)\right) \, d\nu(y) \\ & = \left(\int_S |g(x)| \, d\mu(x)\right) \left(\int_T |h(y)| \, d\nu(y)\right) \lt \infty \end{align*} Hence \(f\) is absolutely integrable. So in either case, applying Fubini's theorem to \(f\), with the same computations as above we have \begin{align*} \int_{S \times T} f(x, y) \, d(\mu \times \nu)(x, y) & = \int_T\left(\int_S g(x) h(y) \, d\mu(x)\right) \, d\nu(y) = \int_T h(y) \left(\int_S g(x) \, d\mu(x)\right) \, d\nu(y) \\ & = \left(\int_S g(x) \, d\mu(x)\right) \left(\int_T h(y) \, d\nu(y)\right) \end{align*}
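The product formula can be checked exactly on small finite measure spaces, where integrals reduce to weighted sums. The sketch below is an illustration under assumed data: the sets, the point masses `mu` and `nu`, and the functions `g` and `h` are all hypothetical values chosen here, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical finite measure spaces: each dict maps a point to its mass.
mu = {"a": Fr(1, 2), "b": Fr(3), "c": Fr(1, 4)}
nu = {0: Fr(2), 1: Fr(1, 3)}

# Hypothetical real-valued functions g on S and h on T.
g = {"a": Fr(-1), "b": Fr(2), "c": Fr(5)}
h = {0: Fr(1, 2), 1: Fr(-4)}

# Integral of g(x) h(y) with respect to the product measure on S x T.
double_integral = sum(g[x] * h[y] * mu[x] * nu[y] for x, y in product(mu, nu))

# The two one-dimensional integrals.
integral_g = sum(g[x] * mu[x] for x in mu)
integral_h = sum(h[y] * nu[y] for y in nu)

print(double_integral, integral_g * integral_h)
```

Since every sum here is finite, the hypotheses of the result hold automatically, and the two printed fractions are equal.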

Recall that a discrete measure space consists of a countable set with the \( \sigma \)-algebra of all subsets and with counting measure. In such a space, integrals are simply sums and so Fubini's theorem allows us to rearrange the order of summation in a double sum.

Suppose that \( I \) and \( J \) are countable and that \( a_{i j} \in \R \) for \( i \in I \) and \( j \in J \). If the sum of the positive terms or the sum of the negative terms is finite, then \[ \sum_{(i, j) \in I \times J} a_{i j} = \sum_{i \in I} \sum_{j \in J} a_{i j} = \sum_{j \in J} \sum_{i \in I} a_{i j} \]

Often \( I = J = \N_+ \), and in this case, \( a_{i j} \) can be viewed as an infinite array, with \( i \in \N_+ \) the row number and \( j \in \N_+ \) the column number:

\( a_{11} \) \( a_{12} \) \( a_{13} \) \( \ldots \)
\( a_{21} \) \( a_{22} \) \( a_{23} \) \( \ldots \)
\( a_{31} \) \( a_{32} \) \( a_{33} \) \( \ldots \)
\( \vdots \) \( \vdots \) \( \vdots \) \( \vdots \)

The significant point is that \( \N_+ \) is totally ordered. While there is no implied order of summation in the double sum \( \sum_{(i, j) \in \N_+^2} a_{i j} \), the iterated sum \( \sum_{i=1}^\infty \sum_{j=1}^\infty a_{i j} \) is obtained by summing the entries of each row in column order and then summing the row totals in row order, while the iterated sum \( \sum_{j=1}^\infty \sum_{i=1}^\infty a_{i j} \) is obtained by summing the entries of each column in row order and then summing the column totals in column order.
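To see the rearrangement of a double sum in action, the following sketch uses the hypothetical array \( a_{i j} = (-1)^{i + j} / (2^i 3^j) \), which is absolutely summable, so the hypotheses above are satisfied. It compares truncated sums taken by rows, by columns, and along anti-diagonals \( i + j = k \). The array, the truncation level `N`, and the variable names are assumptions made for illustration.

```python
def a(i, j):
    """Hypothetical absolutely summable array: (-1)^(i+j) / (2^i 3^j)."""
    return (-1) ** (i + j) / (2 ** i * 3 ** j)

N = 60  # truncation level; the neglected tail is of order 2^(-N)

# Sum each row first, then add the row totals.
rows_first = sum(sum(a(i, j) for j in range(1, N + 1)) for i in range(1, N + 1))

# Sum each column first, then add the column totals.
cols_first = sum(sum(a(i, j) for i in range(1, N + 1)) for j in range(1, N + 1))

# Sum along anti-diagonals i + j = k, a third ordering of the same terms.
diagonals = sum(sum(a(i, k - i) for i in range(1, k)) for k in range(2, 2 * N + 1))

print(rows_first, cols_first, diagonals)
```

For this array the sum factors as a product of two geometric series, \( \left(-\frac{1}{3}\right)\left(-\frac{1}{4}\right) = \frac{1}{12} \), and all three orderings should approximate that value.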

Of course, only one of the factor spaces might be discrete. The earlier results on the interchange of sum and integral can be viewed as applications of Fubini's theorem, where one of the measure spaces is \( (S, \ms S, \mu) \) and the other is \( \N_+ \) with counting measure.

Examples and Applications

Counterexamples

In the first three exercises below, \( (\R, \ms R, \lambda) \) is the standard one-dimensional Euclidean space, so \( \ms R \) is the \( \sigma \)-algebra of Borel measurable sets and \( \lambda \) is Lebesgue measure.

Let \( f = \bs{1}_{[1, \infty)} \) and \( g = \bs{1}_{[0, \infty)} \). Show that

  1. \( f \le g \) on \( \R \)
  2. \( \lambda\{x \in \R: f(x) \lt g(x)\} = 1 \)
  3. \( \int_\R f \, d\lambda = \int_\R g \, d\lambda = \infty \)

This example shows that the strict increasing property of the integral can fail when the integrals are infinite.

Let \( f_n = \bs{1}_{[n, \infty)} \) for \( n \in \N_+ \). Show that

  1. \( f_n \) is decreasing in \( n \in \N_+ \) on \( \R \).
  2. \( f_n \to 0 \) as \( n \to \infty \) on \( \R \).
  3. \( \int_\R f_n \, d\lambda = \infty \) for each \( n \in \N_+ \).

This example shows that the monotone convergence theorem can fail if the first integral is infinite. It also illustrates strict inequality in Fatou's lemma.

Let \( f_n = \bs{1}_{[n, n + 1]} \) for \( n \in \N_+ \). Show that

  1. \(\lim_{n \to \infty} f_n = 0 \) on \( \R \) so \( \int_\R \lim_{n \to \infty} f_n \, d\lambda = 0 \)
  2. \( \int_\R f_n \, d\lambda = 1 \) for \( n \in \N_+ \) so \( \lim_{n \to \infty} \int_\R f_n \, d\lambda = 1\)
  3. \( \sup\{f_n: n \in \N_+\} = \bs{1}_{[1, \infty)} \) on \( \R \)

This example shows that the dominated convergence theorem can fail if \( \left|f_n\right| \) is not bounded by an integrable function. It also shows that strict inequality can hold in Fatou's lemma.

Consider the product space \( [0, 1]^2 \) with the usual Borel measurable subsets and Lebesgue measure. Let \( f: [0, 1]^2 \to \R \) be defined by \[ f(x, y) = \frac{x^2 - y^2}{(x^2 + y^2)^2} \] Show that

  1. \( \int_{[0, 1]^2} f(x, y) \, d(x, y) \) does not exist.
  2. \( \int_0^1 \int_0^1 f(x, y) \, dx \, dy = -\frac{\pi}{4} \)
  3. \( \int_0^1 \int_0^1 f(x, y) \, dy \, dx = \frac{\pi}{4} \)

This example shows that Fubini's theorem can fail if the integrand is not integrable with respect to the product measure.
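The two iterated integrals in this exercise can be explored numerically. The sketch below relies on the antiderivatives \( y / (x^2 + y^2) \) (in \( y \)) and \( -x / (x^2 + y^2) \) (in \( x \)), which are derived here and not quoted from the text, so the code first spot-checks them against \( f \) with a central difference. The inner integrals are then evaluated in closed form and the outer integrals are approximated by a midpoint rule. The function names and the sample point are hypothetical.

```python
import math

def f(x, y):
    """Integrand (x^2 - y^2) / (x^2 + y^2)^2, for (x, y) != (0, 0)."""
    return (x * x - y * y) / (x * x + y * y) ** 2

def F_y(x, y):
    """Assumed antiderivative of f in y for fixed x > 0."""
    return y / (x * x + y * y)

def F_x(x, y):
    """Assumed antiderivative of f in x for fixed y > 0."""
    return -x / (x * x + y * y)

def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] using n cells."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Spot-check the antiderivatives at an arbitrary interior point.
eps, x0, y0 = 1e-6, 0.3, 0.7
d_dy = (F_y(x0, y0 + eps) - F_y(x0, y0 - eps)) / (2 * eps)
d_dx = (F_x(x0 + eps, y0) - F_x(x0 - eps, y0)) / (2 * eps)

# Inner integral over y in [0, 1], then outer integral over x.
dy_then_dx = midpoint_integral(lambda x: F_y(x, 1.0) - F_y(x, 0.0), 0.0, 1.0, 10_000)

# Inner integral over x in [0, 1], then outer integral over y.
dx_then_dy = midpoint_integral(lambda y: F_x(1.0, y) - F_x(0.0, y), 0.0, 1.0, 10_000)

print(dy_then_dx, dx_then_dy, math.pi / 4)
```

The two orders of integration give values of opposite sign, consistent with parts (b) and (c). The code does not address part (a); the non-existence of the double integral still has to be argued directly.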

For \( i, \, j \in \N_+ \) define the sequence \( a_{i j} \) as follows: \( a_{i i} = 1 \) and \( a_{i + 1, i} = -1 \) for \( i \in \N_+ \), \( a_{i j} = 0 \) otherwise.

  1. Give \( a_{i j} \) in array form with \( i \in \N_+ \) as the row number and \( j \in \N_+ \) as the column number
  2. Show that \( \sum_{(i, j) \in \N_+^2} a_{i j} \) does not exist
  3. Show that \( \sum_{i = 1}^\infty \sum_{j = 1}^\infty a_{i j} = 1 \)
  4. Show that \( \sum_{j=1}^\infty \sum_{i=1}^\infty a_{i j} = 0 \)

This example shows that the iterated sums can exist and be different when the double sum does not exist, so the conclusion of the corollary of Fubini's theorem for sums can fail when its hypotheses are not satisfied.
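A short sketch can make the two iterated sums concrete. Each row and each column of the array has at most two nonzero entries, so every inner sum is a finite sum that can be computed exactly; only the outer sums are truncated, at a hypothetical level `N`. The helper names `row_sum` and `col_sum` are assumptions for illustration.

```python
def a(i, j):
    """Array from the exercise: 1 on the diagonal, -1 just below it, 0 elsewhere."""
    if i == j:
        return 1
    if i == j + 1:
        return -1
    return 0

def row_sum(i):
    """Exact sum of row i; its nonzero entries lie in columns i - 1 and i."""
    return sum(a(i, j) for j in range(1, i + 1))

def col_sum(j):
    """Exact sum of column j; its nonzero entries lie in rows j and j + 1."""
    return sum(a(i, j) for i in range(1, j + 2))

N = 200  # truncation level for the outer sums only

rows_then_total = sum(row_sum(i) for i in range(1, N + 1))
cols_then_total = sum(col_sum(j) for j in range(1, N + 1))

# Sums of the positive and negative terms over the N x N block both grow with N.
positive_part = sum(1 for i in range(1, N + 1) for j in range(1, N + 1) if a(i, j) > 0)
negative_part = sum(1 for i in range(1, N + 1) for j in range(1, N + 1) if a(i, j) < 0)

print(rows_then_total, cols_then_total, positive_part, negative_part)
```

Only the first row has a nonzero total, while every column total vanishes, which is why the two iterated sums differ. The counts of positive and negative terms grow without bound as `N` increases, which matches the double sum having the form \( \infty - \infty \).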

Computational Exercises

Compute \( \int_D f(x, y) \, d(x,y) \) in each case below for the given \( D \subseteq \R^2 \) and \( f: D \to \R \).

  1. \( f(x, y) = e^{-2 x} e^{-3 y} \), \( D = [0, \infty) \times [0, \infty) \)
  2. \(f(x, y) = e^{-2 x} e^{-3 y} \), \( D = \{(x, y) \in \R^2: 0 \le x \le y \lt \infty\} \)

Integrals of this type are useful in the study of exponential distributions.