8. Stochastic Processes

Introduction

This section requires measure theory, so you may need to review the chapter on foundations, particularly the sections on topology, measurable spaces, and positive measures. First, recall that a set $E$ almost always comes with a σ-algebra $\mathscr{E}$ of admissible subsets, so that $(E, \mathscr{E})$ is a measurable space. Usually in fact, $E$ has a topology and $\mathscr{E}$ is the corresponding Borel σ-algebra, that is, the σ-algebra generated by the topology. If $E$ is countable, we almost always take $\mathscr{E}$ to be the collection of all subsets of $E$, and in this case $(E, \mathscr{E})$ is a discrete space. The other common case is when $E$ is a measurable subset of $\mathbb{R}^n$ for some $n \in \mathbb{N}_+$, in which case $\mathscr{E}$ is the collection of measurable subsets of $E$. If $(E_1, \mathscr{E}_1), (E_2, \mathscr{E}_2), \ldots, (E_n, \mathscr{E}_n)$ are measurable spaces for some $n \in \mathbb{N}_+$, then the Cartesian product $E_1 \times E_2 \times \cdots \times E_n$ is given the product σ-algebra $\mathscr{E}_1 \times \mathscr{E}_2 \times \cdots \times \mathscr{E}_n$. As a special case, the Cartesian power $E^n$ is given the corresponding power σ-algebra $\mathscr{E}^n$.

With these preliminary remarks out of the way, suppose that $(\Omega, \mathscr{F}, \mathbb{P})$ is a probability space, so that $\Omega$ is the set of outcomes, $\mathscr{F}$ is the σ-algebra of events, and $\mathbb{P}$ is the probability measure on the sample space $(\Omega, \mathscr{F})$. Suppose also that $(S, \mathscr{S})$ and $(T, \mathscr{T})$ are measurable spaces. Here is our main definition:

A random process or stochastic process on $(\Omega, \mathscr{F}, \mathbb{P})$ with state space $(S, \mathscr{S})$ and index set $T$ is a collection of random variables $X = \{X_t: t \in T\}$ such that $X_t$ takes values in $S$ for each $t \in T$.

Sometimes it's notationally convenient to write $X(t)$ instead of $X_t$ for $t \in T$. Often $T = \mathbb{N}$ or $T = [0, \infty)$ and the elements of $T$ are interpreted as points in time (discrete time in the first case and continuous time in the second). So then $X_t \in S$ is the state of the random process at time $t \in T$, and the index space $(T, \mathscr{T})$ becomes the time space.

Since $X_t$ is itself a function from $\Omega$ into $S$, it follows that ultimately, a stochastic process is a function from $\Omega \times T$ into $S$. Stated another way, $t \mapsto X_t$ is a random function on the probability space $(\Omega, \mathscr{F}, \mathbb{P})$. To make this precise, recall that $S^T$ is the notation sometimes used for the collection of functions from $T$ into $S$. Recall also that a natural σ-algebra used for $S^T$ is the one generated by sets of the form $\{f \in S^T: f(t) \in A_t \text{ for all } t \in T\}$, where $A_t \in \mathscr{S}$ for every $t \in T$ and $A_t = S$ for all but finitely many $t \in T$. This σ-algebra, denoted $\mathscr{S}^T$, generalizes the ordinary power σ-algebra $\mathscr{S}^n$ mentioned in the opening paragraph and will be important in the discussion of existence below.

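Suppose that $X = \{X_t: t \in T\}$ is a stochastic process on the probability space $(\Omega, \mathscr{F}, \mathbb{P})$ with state space $(S, \mathscr{S})$ and index set $T$. Then the mapping that takes $\omega$ into the function $t \mapsto X_t(\omega)$ is measurable with respect to $(\Omega, \mathscr{F})$ and $(S^T, \mathscr{S}^T)$.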

Details:

Recall that a mapping with values in $S^T$ is measurable if and only if each of its coordinate functions is measurable. In the present context that means that we must show that the function $X_t$ is measurable with respect to $(\Omega, \mathscr{F})$ and $(S, \mathscr{S})$ for each $t \in T$. But of course, that follows from the very meaning of the term random variable.

For $\omega \in \Omega$, the function $t \mapsto X_t(\omega)$ is known as a sample path of the process. So $S^T$, the set of functions from $T$ into $S$, can be thought of as a set of outcomes of the stochastic process $X$, a point we will return to in our discussion of existence below.
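As a rough illustration of the two views of a process (a function on $\Omega \times T$, or a random function $t \mapsto X_t(\omega)$), here is a minimal Python sketch. The choice of a finite-horizon simple random walk, and the names `make_outcome`, `X`, and `sample_path`, are assumptions made only for this example.

```python
import random

def make_outcome(n_steps, rng):
    """Draw one outcome omega: a tuple of n_steps fair steps, each -1 or +1."""
    return tuple(rng.choice((-1, 1)) for _ in range(n_steps))

def X(omega, t):
    """The process viewed as a function on Omega x T: position at time t."""
    return sum(omega[:t])

def sample_path(omega):
    """Fix omega and return the function t -> X_t(omega) as a list over T."""
    return [X(omega, t) for t in range(len(omega) + 1)]

rng = random.Random(0)          # fixed seed, so the run is reproducible
omega = make_outcome(10, rng)   # one point of the sample space
path = sample_path(omega)       # one sample path, indexed by T = {0, 1, ..., 10}
```

Here the index set is the finite set $T = \{0, 1, \ldots, 10\}$ and the state space is the set of integers, so measurability is automatic. The sketch only shows how fixing $\omega$ turns the process into an ordinary function of $t$.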

As noted in the proof of [2], $X_t$ is a measurable function from $\Omega$ into $S$ for each $t \in T$, by the very meaning of the term random variable. But it does not follow in general that $(\omega, t) \mapsto X_t(\omega)$ is measurable as a function from $\Omega \times T$ into $S$. In fact, the σ-algebra on $T$ has played no role in our discussion so far. Informally, a statement about $X_t$ for a fixed $t \in T$ or even a statement about $X_t$ for countably many $t \in T$ defines an event. But it does not follow that a statement about $X_t$ for uncountably many $t \in T$ defines an event. We often want to make such statements, so the following definition is inevitable:

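A stochastic process $X = \{X_t: t \in T\}$ defined on the probability space $(\Omega, \mathscr{F}, \mathbb{P})$ and with index space $(T, \mathscr{T})$ and state space $(S, \mathscr{S})$ is measurable if $(\omega, t) \mapsto X_t(\omega)$ is a measurable function from $\Omega \times T$ into $S$.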

Every stochastic process indexed by a countable set $T$ is measurable, so the definition is only important when $T$ is uncountable, and in particular for $T = [0, \infty)$.

Equivalent Processes

Our next goal is to study different ways that two stochastic processes, with the same state and index spaces, can be equivalent, so you may need to review equivalence relations. We will assume that the diagonal $D = \{(x, x): x \in S\} \in \mathscr{S}^2$, an assumption that almost always holds in applications, and in particular for the discrete and Euclidean spaces that are most important to us. Sufficient conditions are that $\mathscr{S}$ have a sub σ-algebra that is countably generated and contains all of the singleton sets, properties that hold for the Borel σ-algebra when the topology on $S$ is locally compact, Hausdorff, and has a countable base.

First, we often feel that we understand a random process $X = \{X_t: t \in T\}$ well if we know the finite dimensional distributions, that is, if we know the distribution of $(X_{t_1}, X_{t_2}, \ldots, X_{t_n})$ for every choice of $n \in \mathbb{N}_+$ and $(t_1, t_2, \ldots, t_n) \in T^n$. Thus, we can compute $\mathbb{P}[(X_{t_1}, X_{t_2}, \ldots, X_{t_n}) \in A]$ for every $n \in \mathbb{N}_+$, $(t_1, t_2, \ldots, t_n) \in T^n$, and $A \in \mathscr{S}^n$. Using various rules of probability, we can compute the probabilities of many events involving infinitely many values of the index parameter $t$ as well. With this idea in mind, we have the following definition:

Random processes $X = \{X_t: t \in T\}$ and $Y = \{Y_t: t \in T\}$ with state space $(S, \mathscr{S})$ and index set $T$ are equivalent in distribution if they have the same finite dimensional distributions. This defines an equivalence relation on the collection of stochastic processes with this state space and index set. That is, if $X$, $Y$, and $Z$ are such processes then

  1. X is equivalent in distribution to X (the reflexive property)
  2. If X is equivalent in distribution to Y then Y is equivalent in distribution to X (the symmetric property)
  3. If X is equivalent in distribution to Y and Y is equivalent in distribution to Z then X is equivalent in distribution to Z (the transitive property)

Note that since only the finite-dimensional distributions of the processes X and Y are involved in the definition, the processes need not be defined on the same probability space. Thus, equivalence in distribution partitions the collection of all random processes with a given state space and index set into mutually disjoint equivalence classes. But of course, we already know that two random variables can have the same distribution but be very different as variables (that is, as functions on the sample space). Clearly, the same statement applies to random processes.

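Suppose that $X = (X_1, X_2, \ldots)$ is a sequence of independent indicator random variables with $\mathbb{P}(X_n = 1) = \frac{1}{2}$ for each $n \in \mathbb{N}_+$. Let $Y_n = 1 - X_n$ for $n \in \mathbb{N}_+$. Then $Y = (Y_1, Y_2, \ldots)$ is equivalent in distribution to $X$ but $\mathbb{P}(X_n \ne Y_n \text{ for every } n \in \mathbb{N}_+) = 1$.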

Details:

The state set is $\{0, 1\}$ and $Y_n = 1$ if and only if $X_n = 0$. Hence $Y$ is also a sequence of independent indicator variables with $\mathbb{P}(Y_n = 1) = \frac{1}{2}$ for each $n \in \mathbb{N}_+$, and so $X$ and $Y$ are equivalent in distribution.

In example [5], $X$ and $Y$ are sequences of Bernoulli trials with success parameter $p = \frac{1}{2}$.
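The example can be explored numerically. The following Python sketch, under the assumption that three trials and 20000 replications are enough to make the point, compares the empirical distributions of $(X_1, X_2, X_3)$ and $(Y_1, Y_2, Y_3)$ while also recording whether $X_n \ne Y_n$ on every trial. The seed and sample sizes are arbitrary choices for illustration.

```python
import random
from collections import Counter

rng = random.Random(1)            # fixed seed for reproducibility
n_trials, n_reps = 3, 20000       # hypothetical sizes for the experiment

count_x, count_y = Counter(), Counter()
always_differ = True
for _ in range(n_reps):
    x = tuple(rng.randint(0, 1) for _ in range(n_trials))  # fair Bernoulli trials
    y = tuple(1 - xi for xi in x)                          # Y_n = 1 - X_n
    count_x[x] += 1
    count_y[y] += 1
    # X and Y disagree in every coordinate, for every outcome
    always_differ = always_differ and all(xi != yi for xi, yi in zip(x, y))

# Largest gap between the two empirical finite dimensional distributions
patterns = set(count_x) | set(count_y)
max_gap = max(abs(count_x[k] - count_y[k]) / n_reps for k in patterns)
```

The quantity `max_gap` should be small (sampling noise only), while `always_differ` stays true. That combination is what equivalence in distribution without pathwise agreement looks like.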

Motivated by this example, let's look at another, stronger way that random processes can be equivalent. First recall that random variables $X$ and $Y$ on $(\Omega, \mathscr{F}, \mathbb{P})$, with values in $S$, are equivalent if $\mathbb{P}(X = Y) = 1$.

Suppose that $X = \{X_t: t \in T\}$ and $Y = \{Y_t: t \in T\}$ are stochastic processes defined on the same probability space $(\Omega, \mathscr{F}, \mathbb{P})$ and both with state space $(S, \mathscr{S})$ and index set $T$. Then $Y$ is a version of $X$ if $Y_t$ is equivalent to $X_t$ (so that $\mathbb{P}(X_t = Y_t) = 1$) for every $t \in T$. This defines an equivalence relation on the collection of stochastic processes on the same probability space and with the same state space and index set. That is, if $X$, $Y$, and $Z$ are such processes then

  1. X is a version of X (the reflexive property)
  2. If X is a version of Y then Y is a version of X (the symmetric property)
  3. If X is a version of Y and Y is a version of Z then X is a version of Z (the transitive property)
Details:

Note that $(X_t, Y_t)$ is a random variable with values in $S^2$ (and so the function $\omega \mapsto (X_t(\omega), Y_t(\omega))$ is measurable). The event $\{X_t = Y_t\}$ is the inverse image of the diagonal $D \in \mathscr{S}^2$ under this mapping, and so the definition makes sense.

So the version of relation partitions the collection of stochastic processes on a given probability space and with a given state space and index set into mutually disjoint equivalence classes.

Suppose again that $X = \{X_t: t \in T\}$ and $Y = \{Y_t: t \in T\}$ are random processes on $(\Omega, \mathscr{F}, \mathbb{P})$ with state space $(S, \mathscr{S})$ and index set $T$. If $Y$ is a version of $X$ then $Y$ and $X$ are equivalent in distribution.

Details:

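Suppose that $(t_1, t_2, \ldots, t_n) \in T^n$ and that $A \in \mathscr{S}^n$. Recall that the intersection of a finite (or even countably infinite) collection of events with probability 1 still has probability 1. Hence
\[
\begin{aligned}
\mathbb{P}[(X_{t_1}, X_{t_2}, \ldots, X_{t_n}) \in A] &= \mathbb{P}[(X_{t_1}, X_{t_2}, \ldots, X_{t_n}) \in A, X_{t_1} = Y_{t_1}, X_{t_2} = Y_{t_2}, \ldots, X_{t_n} = Y_{t_n}] \\
&= \mathbb{P}[(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}) \in A, X_{t_1} = Y_{t_1}, X_{t_2} = Y_{t_2}, \ldots, X_{t_n} = Y_{t_n}] \\
&= \mathbb{P}[(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}) \in A]
\end{aligned}
\]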

As noted in the proof, a countable intersection of events with probability 1 still has probability 1. Hence if $T$ is countable and the random process $X$ is a version of $Y$ then
\[ \mathbb{P}(X_t = Y_t \text{ for all } t \in T) = 1 \]
so $X$ and $Y$ really are essentially the same random process. But when $T$ is uncountable the result in the displayed equation may not be true, and $X$ and $Y$ may be very different as random functions on $T$. Here is a simple example:

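Suppose that $\Omega = T = [0, \infty)$, $\mathscr{F} = \mathscr{T}$ is the σ-algebra of Borel measurable subsets of $[0, \infty)$, and $\mathbb{P}$ is any continuous probability measure on $(\Omega, \mathscr{F})$. Let $S = \{0, 1\}$ (with all subsets measurable, of course). For $t \in T$ and $\omega \in \Omega$, define $X_t(\omega) = \mathbf{1}_t(\omega)$ (that is, 1 if $\omega = t$ and 0 otherwise) and $Y_t(\omega) = 0$. Then $X = \{X_t: t \in T\}$ is a version of $Y = \{Y_t: t \in T\}$, but $\mathbb{P}(X_t = Y_t \text{ for all } t \in T) = 0$.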

Details:

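For $t \in [0, \infty)$, $\mathbb{P}(X_t \ne Y_t) = \mathbb{P}\{t\} = 0$ since $\mathbb{P}$ is a continuous measure. But $\{\omega \in \Omega: X_t(\omega) = Y_t(\omega) \text{ for all } t \in T\} = \emptyset$.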
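The following Python sketch mimics the example, assuming (purely for illustration) that $\mathbb{P}$ is the exponential distribution with rate 1 and that floating-point samples are an adequate stand-in for a continuous distribution. For a fixed time, none of the sampled outcomes makes $X_t$ and $Y_t$ disagree, yet every outcome $\omega$ produces a disagreement at the time $t = \omega$.

```python
import random

def X(t, omega):
    """X_t(omega) = 1 if omega == t, else 0 (indicator of the point t)."""
    return 1 if omega == t else 0

def Y(t, omega):
    """Y_t(omega) = 0 for every t and omega."""
    return 0

rng = random.Random(2)                                  # fixed seed
omegas = [rng.expovariate(1.0) for _ in range(10000)]   # hypothetical choice of P

t_fixed = 1.5  # an arbitrary fixed time
# Number of sampled outcomes where X and Y differ at the fixed time
disagree_at_fixed_t = sum(X(t_fixed, w) != Y(t_fixed, w) for w in omegas)
# Number of sampled outcomes where X and Y differ at SOME time (namely t = omega)
disagree_somewhere = sum(X(w, w) != Y(w, w) for w in omegas)
```

So `disagree_at_fixed_t` is 0 for any fixed time one cares to try, while `disagree_somewhere` equals the number of samples. This is the gap between being a version and agreeing along entire sample paths.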

Motivated by example [8], we have our strongest form of equivalence:

Suppose that $X = \{X_t: t \in T\}$ and $Y = \{Y_t: t \in T\}$ are measurable random processes on the probability space $(\Omega, \mathscr{F}, \mathbb{P})$ and with state space $(S, \mathscr{S})$ and index space $(T, \mathscr{T})$. Then $X$ is indistinguishable from $Y$ if $\mathbb{P}(X_t = Y_t \text{ for all } t \in T) = 1$. This defines an equivalence relation on the collection of measurable stochastic processes defined on the same probability space and with the same state and index spaces. That is, if $X$, $Y$, and $Z$ are such processes then

  1. X is indistinguishable from X (the reflexive property)
  2. If X is indistinguishable from Y then Y is indistinguishable from X (the symmetric property)
  3. If X is indistinguishable from Y and Y is indistinguishable from Z then X is indistinguishable from Z (the transitive property)
Details:

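The measurability requirement for the stochastic processes is needed to ensure that $\{X_t = Y_t \text{ for all } t \in T\}$ is a valid event. To see this, note that $(\omega, t) \mapsto (X_t(\omega), Y_t(\omega))$ is measurable, as a function from $\Omega \times T$ into $S^2$. As before, let $D = \{(x, x): x \in S\}$ denote the diagonal. Then $D^c \in \mathscr{S}^2$ and the inverse image of $D^c$ under our mapping is
\[ \{(\omega, t) \in \Omega \times T: X_t(\omega) \ne Y_t(\omega)\} \in \mathscr{F} \times \mathscr{T} \]
The projection of this set onto $\Omega$ is
\[ \{\omega \in \Omega: X_t(\omega) \ne Y_t(\omega) \text{ for some } t \in T\} \in \mathscr{F} \]
since the projection of a measurable set in the product space is also measurable (a step that, in general, relies on the measurable projection theorem and so on regularity assumptions such as completeness of the probability space). Hence the complementary event satisfies
\[ \{\omega \in \Omega: X_t(\omega) = Y_t(\omega) \text{ for all } t \in T\} \in \mathscr{F} \]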

So the indistinguishable from relation partitions the collection of measurable stochastic processes on a given probability space and with given state space and index space into mutually disjoint equivalence classes. Trivially, if X is indistinguishable from Y, then X is a version of Y. As noted above, when T is countable, the converse is also true, but not, as example [8] shows, when T is uncountable. So to summarize, indistinguishable from implies version of implies equivalent in distribution, but none of the converse implications hold in general.

The Kolmogorov Construction

In applications, a stochastic process is often modeled by giving various distributional properties that the process should satisfy. So the basic existence problem is to construct a process that has these properties. More specifically, how can we construct random processes with specified finite dimensional distributions? Let's start with the simplest case, one that we have seen several times before, and build up from there. Our simplest case is to construct a single random variable with a specified distribution.

Suppose that $(S, \mathscr{S}, P)$ is a probability space. Then there exists a random variable $X$ on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ such that $X$ takes values in $S$ and has distribution $P$.

Details:

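The proof is utterly trivial. Let $(\Omega, \mathscr{F}, \mathbb{P}) = (S, \mathscr{S}, P)$ and define $X: \Omega \to S$ by $X(\omega) = \omega$, so that $X$ is the identity function. Then $\{X \in A\} = A$ and so $\mathbb{P}(X \in A) = P(A)$ for $A \in \mathscr{S}$.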

In spite of its triviality the last result contains the seeds of everything else we will do in this discussion. Next, let's see how to construct a sequence of independent random variables with specified distributions.

Suppose that $P_i$ is a probability measure on the measurable space $(S, \mathscr{S})$ for $i \in \mathbb{N}_+$. Then there exists an independent sequence of random variables $(X_1, X_2, \ldots)$ on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ such that $X_i$ takes values in $S$ and has distribution $P_i$ for $i \in \mathbb{N}_+$.

Details:

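Let $\Omega = S^\infty = S \times S \times \cdots$. Next let $\mathscr{F} = \mathscr{S}^\infty$, the corresponding product σ-algebra. Recall that this is the σ-algebra generated by sets of the form
\[ A_1 \times A_2 \times \cdots \text{ where } A_i \in \mathscr{S} \text{ for each } i \in \mathbb{N}_+ \text{ and } A_i = S \text{ for all but finitely many } i \in \mathbb{N}_+ \]
Finally, let $\mathbb{P} = P_1 \times P_2 \times \cdots$, the corresponding product measure on $(\Omega, \mathscr{F})$. Recall that this is the unique probability measure that satisfies
\[ \mathbb{P}(A_1 \times A_2 \times \cdots) = P_1(A_1) P_2(A_2) \cdots \]
where $A_1 \times A_2 \times \cdots$ is a set of the type in the first displayed equation. Now define $X_i$ on $\Omega$ by $X_i(\omega_1, \omega_2, \ldots) = \omega_i$, for $i \in \mathbb{N}_+$, so that $X_i$ is simply the coordinate function for index $i$. If $A_1 \times A_2 \times \cdots$ is a set of the type in the first displayed equation then $\{X_1 \in A_1, X_2 \in A_2, \ldots\} = A_1 \times A_2 \times \cdots$ and so by the definition of the product measure,
\[ \mathbb{P}(X_1 \in A_1, X_2 \in A_2, \ldots) = P_1(A_1) P_2(A_2) \cdots \]
It follows that $(X_1, X_2, \ldots)$ is a sequence of independent variables and that $X_i$ has distribution $P_i$ for $i \in \mathbb{N}_+$.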

If you looked at the proof of the last two results you might notice that the last result can be viewed as a special case of the one before, since $X = (X_1, X_2, \ldots)$ is simply the identity function on $\Omega = S^\infty$. The important step is the existence of the product measure $\mathbb{P}$ on $(\Omega, \mathscr{F})$.
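A finite analogue of this construction can be checked exactly. The Python sketch below assumes a three-point state space and three hypothetical marginal distributions. It builds the product measure on $S^3$, defines the coordinate functions, and verifies the product formula on one rectangle using exact rational arithmetic. The names `marginals`, `prob`, and the particular sets `A` are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import product

S = (0, 1, 2)  # hypothetical finite state space
# Hypothetical marginals P_1, P_2, P_3 on S
marginals = [
    {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)},
    {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)},
    {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(7, 10)},
]

# Omega = S x S x S with the product measure, stored as point masses
P = {}
for omega in product(S, repeat=len(marginals)):
    mass = Fraction(1)
    for i, w in enumerate(omega):
        mass *= marginals[i][w]
    P[omega] = mass

def X(i):
    """Coordinate function X_i(omega) = omega_i (0-based index here)."""
    return lambda omega: omega[i]

def prob(event):
    """Probability of the event {omega : event(omega) is true}."""
    return sum(mass for omega, mass in P.items() if event(omega))

# A rectangle A_1 x A_2 x A_3 and the two sides of the product formula
A = [{0, 1}, {2}, {1, 2}]
joint = prob(lambda om: all(X(i)(om) in A[i] for i in range(3)))
factored = Fraction(1)
for i in range(3):
    factored *= sum(marginals[i][s] for s in A[i])
```

Both `joint` and `factored` come out to the same rational number, which is the finite-dimensional content of independence. The infinite product measure itself requires the existence theorem and cannot be enumerated this way.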

The full generalization of these results is known as the Kolmogorov existence theorem (named for Andrei Kolmogorov). We start with the state space $(S, \mathscr{S})$ and the index set $T$. The theorem states that if we specify the finite dimensional distributions in a consistent way, then there exists a stochastic process defined on a suitable probability space that has the given finite dimensional distributions. The consistency condition is a bit clunky to state in full generality, but the basic idea is very easy to understand. Suppose that $s$ and $t$ are distinct elements in $T$ and that we specify the distribution (probability measure) $P_s$ of $X_s$, $P_t$ of $X_t$, $P_{s,t}$ of $(X_s, X_t)$, and $P_{t,s}$ of $(X_t, X_s)$. Then clearly we must specify these so that
\[ P_s(A) = P_{s,t}(A \times S), \quad P_t(B) = P_{s,t}(S \times B) \]
for all $A, B \in \mathscr{S}$. Clearly we also must have $P_{s,t}(C) = P_{t,s}(C')$ for all measurable $C \in \mathscr{S}^2$, where $C' = \{(y, x): (x, y) \in C\}$.

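To state the consistency conditions in general, we need some notation. For $n \in \mathbb{N}_+$, let $T^{(n)} \subset T^n$ denote the set of $n$-tuples of distinct elements of $T$, and let $\boldsymbol{T} = \bigcup_{n=1}^\infty T^{(n)}$ denote the set of all finite sequences of distinct elements of $T$. If $n \in \mathbb{N}_+$, $\boldsymbol{t} = (t_1, t_2, \ldots, t_n) \in T^{(n)}$ and $\pi$ is a permutation of $\{1, 2, \ldots, n\}$, let $\boldsymbol{t} \pi$ denote the element of $T^{(n)}$ with coordinates $(\boldsymbol{t} \pi)_i = t_{\pi(i)}$. That is, we permute the coordinates of $\boldsymbol{t}$ according to $\pi$. If $C \in \mathscr{S}^n$, let
\[ \pi C = \{(x_1, x_2, \ldots, x_n) \in S^n: (x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(n)}) \in C\} \in \mathscr{S}^n \]
Finally, if $n \gt 1$, let $\boldsymbol{t}_-$ denote the vector $(t_1, t_2, \ldots, t_{n-1}) \in T^{(n-1)}$.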

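Now suppose that $P_{\boldsymbol{t}}$ is a probability measure on $(S^n, \mathscr{S}^n)$ for each $n \in \mathbb{N}_+$ and $\boldsymbol{t} \in T^{(n)}$. The idea, of course, is that we want the collection $\mathscr{P} = \{P_{\boldsymbol{t}}: \boldsymbol{t} \in \boldsymbol{T}\}$ to be the finite dimensional distributions of a random process with index set $T$ and state space $(S, \mathscr{S})$. Here is the critical definition: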

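The collection of probability distributions $\mathscr{P}$ relative to $T$ and $(S, \mathscr{S})$ is consistent if

  1. $P_{\boldsymbol{t} \pi}(C) = P_{\boldsymbol{t}}(\pi C)$ for every $n \in \mathbb{N}_+$, $\boldsymbol{t} \in T^{(n)}$, permutation $\pi$ of $\{1, 2, \ldots, n\}$, and measurable $C \subseteq S^n$.
  2. $P_{\boldsymbol{t}_-}(C) = P_{\boldsymbol{t}}(C \times S)$ for every $n \gt 1$, $\boldsymbol{t} \in T^{(n)}$, and measurable $C \subseteq S^{n-1}$.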
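For a finite state space and a finite index set, the two conditions can be checked mechanically on point masses, which is enough because both sides of each condition are additive in $C$. The Python sketch below is one way to do this, under stated assumptions: the index set `T`, the state space `S`, the joint weights, and the helper names `fdd` and `is_consistent` are all hypothetical. The family is built by marginalizing a joint law (so it should pass), and a deliberately altered copy should fail.

```python
from fractions import Fraction
from itertools import permutations, product

S = (0, 1)             # hypothetical state space
T = ("a", "b", "c")    # hypothetical (finite) index set

# A hypothetical joint law on S^T, given by unnormalized weights
weights = {om: 1 + om[0] + 2 * om[1] + 4 * om[0] * om[2]
           for om in product(S, repeat=len(T))}
total = sum(weights.values())
joint = {om: Fraction(w, total) for om, w in weights.items()}

def fdd(t):
    """Distribution of (X_{t_1}, ..., X_{t_n}) as a dict on S^n."""
    idx = [T.index(ti) for ti in t]
    out = {x: Fraction(0) for x in product(S, repeat=len(t))}
    for omega, mass in joint.items():
        out[tuple(omega[i] for i in idx)] += mass
    return out

# One distribution P_t for every finite sequence t of distinct indices
family = {t: fdd(t) for n in range(1, len(T) + 1) for t in permutations(T, n)}

def is_consistent(family, S):
    """Check conditions 1 and 2 on singletons C = {x}."""
    for t, P_t in family.items():
        n = len(t)
        # Condition 1: P_{t pi}({x}) = P_t(pi {x}), where pi{x} = {y}, y[pi(i)] = x[i]
        for pi in permutations(range(n)):
            t_pi = tuple(t[pi[i]] for i in range(n))
            for x in product(S, repeat=n):
                y = [None] * n
                for i in range(n):
                    y[pi[i]] = x[i]
                if family[t_pi][x] != P_t[tuple(y)]:
                    return False
        # Condition 2: P_{t-}({x}) = P_t({x} x S)
        if n > 1:
            for x in product(S, repeat=n - 1):
                if family[t[:-1]][x] != sum(P_t[x + (s,)] for s in S):
                    return False
    return True

# An altered family: replace P_a by a point mass at 0, leaving the rest alone
broken = dict(family)
broken[("a",)] = {(0,): Fraction(1), (1,): Fraction(0)}
```

Calling `is_consistent(family, S)` returns `True`, while `is_consistent(broken, S)` returns `False` because the altered one-dimensional distribution no longer matches the marginal of the two-dimensional ones.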

With the proper definition of consistency, we can state the fundamental theorem.

Kolmogorov Existence Theorem. If $\mathscr{P}$ is a consistent collection of probability distributions relative to the index set $T$ and the state space $(S, \mathscr{S})$, then there exists a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ and a stochastic process $X = \{X_t: t \in T\}$ on this probability space such that $\mathscr{P}$ is the collection of finite dimensional distributions of $X$.

Details:

Let $\Omega = S^T$, the set of functions from $T$ to $S$. Such functions are the outcomes of the stochastic process. Let $\mathscr{F} = \mathscr{S}^T$, the product σ-algebra, generated by sets of the form
\[ B = \{\omega \in \Omega: \omega(t) \in A_t \text{ for all } t \in T\} \]
where $A_t \in \mathscr{S}$ for all $t \in T$ and $A_t = S$ for all but finitely many $t \in T$. We know how our desired probability measure $\mathbb{P}$ should work on the sets that generate $\mathscr{F}$. Specifically, suppose that $B$ is a set of the type in the displayed equation, and $A_t = S$ except at the coordinates of $\boldsymbol{t} = (t_1, t_2, \ldots, t_n) \in T^{(n)}$. Then we want
\[ \mathbb{P}(B) = P_{\boldsymbol{t}}(A_{t_1} \times A_{t_2} \times \cdots \times A_{t_n}) \]
Basic existence and uniqueness theorems, and the consistency of $\mathscr{P}$, guarantee that $\mathbb{P}$ can be extended to a probability measure on all of $\mathscr{F}$. Finally, for $t \in T$ we define $X_t: \Omega \to S$ by $X_t(\omega) = \omega(t)$ for $\omega \in \Omega$, so that $X_t$ is simply the coordinate function of index $t$. Thus, we have a stochastic process $X = \{X_t: t \in T\}$ with state space $(S, \mathscr{S})$, defined on the probability space $(\Omega, \mathscr{F}, \mathbb{P})$, with $\mathscr{P}$ as the collection of finite dimensional distributions.

Note that except for the more complicated notation, the construction is very similar to the one for a sequence of independent variables in [11]. Again, $X$ is essentially the identity function on $\Omega = S^T$. The important and more difficult part is the construction of the probability measure $\mathbb{P}$ on $(\Omega, \mathscr{F})$.

Applications

Our last discussion is a summary of the stochastic processes that are studied in this text. All are classics and are immensely important in applications.

Random processes associated with Bernoulli trials include

  1. The sequence of Bernoulli variables
  2. The sequence of binomial variables
  3. The sequence of geometric variables
  4. The sequence of negative binomial variables
  5. The simple random walk
Details:

The Bernoulli trials sequence in (1) is a sequence of independent, identically distributed indicator random variables, and so can be constructed as in [11]. The random processes in (2) through (5) are constructed from the Bernoulli trials sequence.
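To make "constructed from the Bernoulli trials sequence" concrete, here is a minimal Python sketch that derives each of the other processes from one simulated sequence of trials. The success parameter, the horizon of 20 trials, the seed, and the variable names are assumptions chosen only for illustration.

```python
import random
from itertools import accumulate

rng = random.Random(3)   # fixed seed for reproducibility
p = 0.5                  # hypothetical success parameter
n = 20                   # hypothetical number of trials

# Bernoulli trials X_1, ..., X_n
bernoulli = [1 if rng.random() < p else 0 for _ in range(n)]

# Binomial process: number of successes in the first k trials
binomial = list(accumulate(bernoulli))

# Negative binomial process: trial number of the k-th success
success_times = [i + 1 for i, x in enumerate(bernoulli) if x == 1]

# Geometric process: number of trials between successive successes
geometric = [b - a for a, b in zip([0] + success_times, success_times)]

# Simple random walk: partial sums of the steps 2 X_i - 1 (each -1 or +1)
walk = list(accumulate(2 * x - 1 for x in bernoulli))
```

Every derived list is a deterministic function of `bernoulli`, which mirrors the remark that only the trials themselves need the existence construction.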

Random processes associated with the Poisson model include

  1. The sequence of inter-arrival times
  2. The sequence of arrival times
  3. The counting process on $[0, \infty)$, both in the homogeneous case and the non-homogeneous case.
  4. The compound Poisson process.
  5. The counting process on a general measure space
Details:

The random process in (1) is a sequence of independent random variables with a common exponential distribution, and so can be constructed as in [11]. The processes in (2) and (3) can be constructed from the sequence in (1).
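The construction of the arrival times and the counting process from the inter-arrival times can be sketched in a few lines of Python. The rate, the number of simulated arrivals, the seed, and the function name `N` are assumptions for this example. Because only finitely many arrivals are simulated, the counting function is only meaningful up to the last simulated arrival time.

```python
import random
from bisect import bisect_right
from itertools import accumulate

rng = random.Random(4)   # fixed seed for reproducibility
rate = 2.0               # hypothetical rate of the homogeneous process
n_arrivals = 50          # hypothetical truncation of the infinite sequence

# Inter-arrival times: independent, exponential with the given rate
inter_arrival = [rng.expovariate(rate) for _ in range(n_arrivals)]

# Arrival times: partial sums of the inter-arrival times
arrival = list(accumulate(inter_arrival))

def N(t):
    """Counting process: number of arrivals in [0, t] (valid for t <= arrival[-1])."""
    return bisect_right(arrival, t)
```

For instance, `N(arrival[9])` counts the first ten arrivals, since the arrival times are increasing and `bisect_right` includes an arrival occurring exactly at time `t`.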

Random processes associated with renewal theory include

  1. The sequence of inter-arrival times
  2. The sequence of arrival times
  3. The counting process on $[0, \infty)$

Markov processes are a very important family of random processes as are processes associated with Brownian motion.