This section studies how the distribution of a random variable changes when the variable is transformed in a deterministic way. If you are a new student of probability, you may want to skip the technical details.
Basic Theory
The Problem
As usual, our starting point is a random experiment, modeled by a probability space $(\Omega, \mathscr{F}, \mathbb{P})$. So to review, $\Omega$ is the set of outcomes, $\mathscr{F}$ is the collection of events, and $\mathbb{P}$ is the probability measure on the sample space $(\Omega, \mathscr{F})$. Suppose now that $(S, \mathscr{S})$ and $(T, \mathscr{T})$ are measurable spaces and that $X$ is a random variable for the experiment, with values in $S$. If $r: S \to T$ is a measurable function, then $Y = r(X)$ is a new random variable with values in $T$. If the distribution of $X$ is known, how do we find the distribution of $Y$? This is a very basic and important question. In applications, random variables are variables or measurements of interest in the experiment, and transformations of such variables are ubiquitous. In a superficial sense, the solution is easy, but first recall that the inverse image of $B \subseteq T$ under $r$ is $r^{-1}(B) = \{x \in S: r(x) \in B\}$.
$\mathbb{P}(Y \in B) = \mathbb{P}\left[X \in r^{-1}(B)\right]$ for $B \in \mathscr{T}$.
Details:
$\mathbb{P}(Y \in B) = \mathbb{P}[r(X) \in B] = \mathbb{P}\left[X \in r^{-1}(B)\right]$ for $B \in \mathscr{T}$. This is just the probability version of the general theorem that a measurable function on a measure space produces a new measure on the range space.
A function $r: S \to T$. How is a probability distribution on $S$ transformed by $r$ to a distribution on $T$?
Suppose now that $\mu$ and $\nu$ are $\sigma$-finite positive measures on $(S, \mathscr{S})$ and $(T, \mathscr{T})$, respectively. Frequently the distribution of $X$ is known through its probability density function $f$, and we would similarly like to find the probability density function of $Y = r(X)$. This is a difficult problem in general, because as we will see, even simple transformations of variables with simple distributions can lead to variables with complex distributions. We will solve the problem in various special cases. As always, density functions are with respect to the associated reference measures, and the two most important special cases are discrete measure spaces and Euclidean measure spaces as described in the introduction.
Transformed Variables with Discrete Distributions
When the transformed variable $Y = r(X)$ has a discrete distribution, the probability density function of $Y$ can be computed using basic rules of probability.
Suppose that $Y = r(X)$ has a discrete distribution, and that $X$ has probability density function $f$ (with respect to the reference measure $\mu$). Then $Y$ has probability density function $g$ given by $g(y) = \int_{r^{-1}\{y\}} f \, d\mu$ for $y \in T$.
Details:
The hypothesis means that is countable, is the collection of all subsets of , and the reference measure is counting measure . So by definition of the densities of and ,
Here are two special cases for the distribution of .
As a special case, if $X$ also has a discrete distribution, then the integral in [2] is a sum and $g(y) = \sum_{x \in r^{-1}\{y\}} f(x)$ for $y \in T$.
A transformation of a discrete probability distribution.
As another special case, suppose that $X$ has a continuous distribution on an $n$-dimensional Euclidean space for some $n \in \mathbb{N}_+$. Then the integral in [2] is the Lebesgue integral and $g(y) = \int_{r^{-1}\{y\}} f(x) \, dx$ for $y \in T$.
Details:
The assumption that $X$ has a density function actually means that the distribution of $X$ is absolutely continuous with respect to Lebesgue measure. Recall also that the Lebesgue integral agrees with the ordinary Riemann integral of calculus for the density functions that usually occur in applications.
A continuous distribution on transformed by a discrete function
So the main problem is often computing the inverse images $r^{-1}\{y\}$ for $y \in T$. The formulas in the discrete case [3] and the Euclidean case [4] are not worth memorizing explicitly; it's usually better to just work each problem from scratch. The main step is to write the event $\{Y = y\}$ in terms of $X$, and then find the probability of this event using the probability density function of $X$.
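The discrete recipe can be coded directly: sum $f(x)$ over the inverse image $r^{-1}\{y\}$, which is just a pass over the support of $X$. This is a minimal sketch, not from the text; the function name `transform_pdf` and the dictionary representation of a discrete pdf are my own conventions.

```python
from collections import defaultdict

def transform_pdf(pdf, r):
    """pdf of Y = r(X) for discrete X: accumulate f(x) over the inverse image r^{-1}{y}."""
    out = defaultdict(float)
    for x, p in pdf.items():
        out[r(x)] += p
    return dict(out)

# Example: X uniform on {1,...,6} (a fair die), Y = X mod 3
die = {x: 1 / 6 for x in range(1, 7)}
g = transform_pdf(die, lambda x: x % 3)
# each residue class {0, 1, 2} contains two faces, so g is uniform on {0, 1, 2}
```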
Transformed Variables in $\mathbb{R}^n$
Suppose now that $Y = r(X)$ where $X$ takes values in $\mathbb{R}^n$ for some $n \in \mathbb{N}_+$. In some cases, the distribution function of $Y$ can be found by using basic rules of probability. Then the probability density function of $Y$, if it exists, can be found from the distribution function. This general method is referred to, appropriately enough, as the distribution function method. Here is the case with $n = 1$.
Suppose that $X$ is real valued. The distribution function $G$ of $Y = r(X)$ is given by $G(y) = \mathbb{P}(Y \le y) = \mathbb{P}[r(X) \le y]$ for $y \in \mathbb{R}$.
Details:
Again, this follows from the definition of as a density of . For ,
As in the discrete case, the formula above is not much help, and it's usually better to work each problem from scratch. The main step is to write the event $\{Y \le y\}$ in terms of $X$, and then find the probability of this event using the probability density function of $X$.
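Here is a minimal numerical sketch of the distribution function method; the example and the function names are my own, not from the text. Take $X$ uniform on $(0, 1)$ and $Y = X^2$. Writing $\{Y \le y\}$ in terms of $X$ gives $G(y) = \mathbb{P}(X \le \sqrt{y}) = \sqrt{y}$, and differentiating gives the density $g(y) = 1 / (2\sqrt{y})$. The code checks the derivative step by finite differences.

```python
import math

# Distribution function method: X ~ Uniform(0,1), Y = X^2.
# G(y) = P(X^2 <= y) = P(X <= sqrt(y)) = sqrt(y) for 0 < y < 1,
# so the density is g(y) = G'(y) = 1 / (2 sqrt(y)).
def G(y):
    return math.sqrt(y)

def g(y):
    return 1 / (2 * math.sqrt(y))

# finite-difference check that g is indeed the derivative of G
y, h = 0.25, 1e-6
approx = (G(y + h) - G(y - h)) / (2 * h)
assert abs(approx - g(y)) < 1e-6
```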
The Change of Variables Formula
Suppose now that $X$ takes values in $\mathbb{R}^n$ for some $n \in \mathbb{N}_+$, and that $X$ has a continuous distribution with probability density function $f$. When the transformation $r$ is one-to-one and smooth, there is a formula for the probability density function of $Y = r(X)$ directly in terms of $f$. This is known as the change of variables formula. Note that since $r$ is one-to-one, it has an inverse function. We will explore the one-dimensional case first, where the concepts and formulas are simplest. Thus, suppose that $S \subseteq \mathbb{R}$ is an interval, and that $X$ has a continuous distribution on $S$ with probability density function $f$. Suppose $Y = r(X)$ where $r$ is a one-to-one, continuously differentiable function from $S$ onto an interval $T$. Clearly $r$ is either strictly increasing or strictly decreasing.
If $r$ is strictly increasing then $Y = r(X)$ has a continuous distribution on $T$ with density function $g$ given by $g(y) = f\left[r^{-1}(y)\right] \frac{d}{dy} r^{-1}(y)$ for $y \in T$.
Details:
Let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively. Then $G(y) = \mathbb{P}[r(X) \le y] = \mathbb{P}\left[X \le r^{-1}(y)\right] = F\left[r^{-1}(y)\right]$ for $y \in T$.
Note that the inequality is preserved since $r$ is increasing. Taking derivatives with respect to $y$ and using the chain rule gives the result. Recall that $F' = f$ at the points of continuity of $f$.
If $r$ is strictly decreasing then $Y = r(X)$ has a continuous distribution on $T$ with density function $g$ given by $g(y) = -f\left[r^{-1}(y)\right] \frac{d}{dy} r^{-1}(y)$ for $y \in T$.
Details:
Again let $F$ and $G$ denote the distribution functions of $X$ and $Y$, respectively. Then $G(y) = \mathbb{P}[r(X) \le y] = \mathbb{P}\left[X \ge r^{-1}(y)\right] = 1 - F\left[r^{-1}(y)\right]$ for $y \in T$.
Note that the inequality is reversed since $r$ is decreasing. Taking derivatives with respect to $y$ and using the chain rule gives the result. Recall again that $F' = f$ at the points of continuity of $f$.
The formulas for the probability density functions in the increasing case in proposition [6] and the decreasing case in [7] can be combined:
Under the assumptions on $r$, the probability density function $g$ of $Y = r(X)$ is given by $g(y) = f\left[r^{-1}(y)\right] \left|\frac{d}{dy} r^{-1}(y)\right|$ for $y \in T$.
Letting $x = r^{-1}(y)$, the change of variables formula can be written more compactly as $g(y) = f(x) \left|\frac{dx}{dy}\right|$, or even in differential form as $g(y) \, dy = f(x) \, dx$. Although succinct and easy to remember, the formulas are a bit less clear. It must be understood that $x$ on the right should be written in terms of $y$ via the inverse function. The images below give a graphical interpretation of the formula in the two cases where $r$ is increasing and where $r$ is decreasing.
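As a concrete sketch of the one-dimensional formula (the example is mine, not from the text): take $X$ exponential with rate 1 on $(0, \infty)$ and $y = r(x) = x^2$, which is strictly increasing there. The inverse is $x = \sqrt{y}$ with $dx/dy = 1/(2\sqrt{y})$, so $g(y) = e^{-\sqrt{y}} / (2\sqrt{y})$. The code verifies this against the exact distribution function $G(y) = 1 - e^{-\sqrt{y}}$ by finite differences.

```python
import math

# Change of variables: X ~ Exponential(1) with f(x) = exp(-x) on (0, inf),
# y = r(x) = x^2 strictly increasing, inverse x = sqrt(y), dx/dy = 1/(2 sqrt(y)):
#   g(y) = f(sqrt(y)) * 1/(2 sqrt(y)) = exp(-sqrt(y)) / (2 sqrt(y))
def g(y):
    return math.exp(-math.sqrt(y)) / (2 * math.sqrt(y))

# Exact CDF for comparison: G(y) = P(X <= sqrt(y)) = 1 - exp(-sqrt(y))
def G(y):
    return 1 - math.exp(-math.sqrt(y))

y, h = 2.0, 1e-6
assert abs((G(y + h) - G(y - h)) / (2 * h) - g(y)) < 1e-6
```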
The change of variables theorems in the increasing and decreasing cases
The generalization of this result to dimension $n \in \mathbb{N}_+$ is basically a theorem in multivariate calculus. First we need some notation. Suppose $S$ and $T$ are open subsets of $\mathbb{R}^n$ and that $r$ is a one-to-one, continuously differentiable function from $S$ onto $T$. We will use the notation $y = r(x)$ and $x = r^{-1}(y)$. The first derivative of the inverse function is the $n \times n$ matrix of first partial derivatives: $\left(\frac{dx}{dy}\right)_{i j} = \frac{\partial x_i}{\partial y_j}$. The Jacobian (named in honor of Carl Gustav Jacob Jacobi) of the inverse function is the determinant of the first derivative matrix, $\det\left(\frac{dx}{dy}\right)$.
With this compact notation, the multivariate change of variables formula is easy to state.
Suppose that $X$ has a continuous distribution on $S$ with probability density function $f$. Then $Y = r(X)$ has a continuous distribution on $T$ with density function $g$ given by $g(y) = f\left[r^{-1}(y)\right] \left|\det\left(\frac{dx}{dy}\right)\right|$ for $y \in T$.
Details:
The result follows from the multivariate change of variables formula in calculus. The probability distribution of is given by
This is just a special case of [1], and only requires that $r$ be measurable. The next step is to use the multivariate change of variables $x = r^{-1}(y)$. Under the assumptions on $r$,
So it follows that defined in the theorem is a density function for .
The Jacobian is the infinitesimal scale factor that describes how $n$-dimensional volume changes under the transformation, as illustrated in the graphic below.
The multivariate change of variables theorem
Special Transformations
Linear Transformations
Linear transformations (or more technically, affine transformations) are among the most common and important transformations. Moreover, this type of transformation leads to simple applications of the change of variables theorems. Suppose first that $X$ has a continuous distribution on an interval $S \subseteq \mathbb{R}$ with probability density function $f$. Let $Y = a + b X$ where $a \in \mathbb{R}$ and $b \ne 0$. Note that $Y$ has values in the interval $T = \{a + b x: x \in S\}$.
$Y = a + b X$ has a continuous distribution on $T$ with probability density function $g$ given by $g(y) = \frac{1}{|b|} f\left(\frac{y - a}{b}\right)$ for $y \in T$.
Details:
The transformation is $y = a + b x$, so the inverse transformation is $x = (y - a)/b$ and $\frac{dx}{dy} = \frac{1}{b}$. The result now follows from the change of variables formula [8].
When $b > 0$ (which is often the case in applications), this transformation is known as a location-scale transformation; $a$ is the location parameter and $b$ is the scale parameter. Scale transformations arise naturally when physical units are changed (from feet to meters, for example). Location transformations arise naturally when the physical reference point is changed (measuring time relative to 9:00 AM as opposed to 8:00 AM, for example). The change of temperature measurement from Fahrenheit to Celsius is a location and scale transformation.
The multivariate version of this result has a simple and elegant form when the linear transformation is expressed in matrix-vector form. Thus suppose that $X$ has a continuous distribution on $\mathbb{R}^n$ with probability density function $f$. Let $Y = a + B X$ where $a \in \mathbb{R}^n$ and $B$ is an invertible $n \times n$ matrix. Note that $Y$ has values in $\mathbb{R}^n$.
$Y = a + B X$ has a continuous distribution on $\mathbb{R}^n$ with probability density function $g$ given by $g(y) = \frac{1}{|\det B|} f\left[B^{-1}(y - a)\right]$ for $y \in \mathbb{R}^n$.
Details:
The transformation $y = a + B x$ maps $\mathbb{R}^n$ one-to-one and onto $\mathbb{R}^n$. The inverse transformation is $x = B^{-1}(y - a)$. The Jacobian of the inverse transformation is the constant function $\det\left(B^{-1}\right) = 1 / \det(B)$. The result now follows from the change of variables theorem [9].
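A tiny numerical sketch of the matrix-vector case (the specific matrix and the uniform example are my own): if $X$ is uniform on the unit square, so $f = 1$ on the square, then $Y = B X$ is uniform on the image parallelogram with constant density $1 / |\det B|$, and the total probability $|\det B| \cdot \frac{1}{|\det B|} = 1$ checks out, since the image has area $|\det B|$.

```python
# Linear transformation of a uniform density: Y = B X with X uniform on [0,1]^2.
# f_Y is constant, equal to 1/|det B|, on the image parallelogram.
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

B = [[2, 1], [0, 3]]
detB = det2(B[0][0], B[0][1], B[1][0], B[1][1])   # 6
density = 1 / abs(detB)                            # constant density of Y

# area of image = |det B| * (area of unit square), so total probability is 1
total_probability = density * abs(detB) * 1.0
```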
Sums and Convolution
Simple addition of independent real-valued random variables is perhaps the most important of all transformations. Indeed, much of classical probability theory is concerned with sums of independent variables, in particular the law of large numbers and the central limit theorem. In addition, aside from myriad applications, in many cases a variable with a special distribution can be decomposed into a sum of independent random variables with simpler distributions. We will consider sums of independent variables in terms of their probability distributions, distribution functions, and density functions. Each case leads to a type of convolution.
Suppose that $X$ and $Y$ are independent, real-valued random variables with distributions $\mu$ and $\nu$ respectively. Then the distribution of $Z = X + Y$ is the convolution of $\mu$ and $\nu$, denoted $\mu * \nu$ and defined by $(\mu * \nu)(A) = \int_{\mathbb{R}} \nu(A - x) \, d\mu(x)$ for measurable $A \subseteq \mathbb{R}$, where $A - x = \{a - x: a \in A\}$.
Details:
Since $X$ and $Y$ are independent, the distribution of $(X, Y)$ on $\mathbb{R}^2$ is the product probability measure $\mu \times \nu$.
Writing the double integral as an iterated integral is justified by Fubini's theorem. A more elegant and intuitive proof can be given in terms of expected value.
Since addition of independent random variables is trivially commutative and associative, the same is true of the convolution of probability measures on $\mathbb{R}$.
Suppose that $\mu$, $\nu$, and $\rho$ are probability measures on $\mathbb{R}$. Then $\mu * \nu = \nu * \mu$ and $\mu * (\nu * \rho) = (\mu * \nu) * \rho$.
Suppose that $X$ and $Y$ are independent, real-valued random variables with distribution functions $F$ and $G$ respectively. The distribution function of $Z = X + Y$ is the convolution of $F$ and $G$, defined by $(F * G)(z) = \int_{\mathbb{R}} F(z - y) \, dG(y)$ for $z \in \mathbb{R}$.
Details:
Suppose that . If then . Hence from the first convolution result in [12],
The last equation is simply notation: $dG$ indicates the Lebesgue-Stieltjes integral associated with the distribution function $G$, so an integral with respect to $dG$ means the same thing as an integral with respect to the probability measure associated with $G$.
Convolution of probability distribution functions is commutative and associative.
Suppose that $F$, $G$, and $H$ are distribution functions on $\mathbb{R}$. Then $F * G = G * F$ and $F * (G * H) = (F * G) * H$.
If the independent random variables $X$ and $Y$ are of the same type (either both discrete or both continuous) then we can find the density function of $Z = X + Y$ in terms of the density functions of $X$ and $Y$.
Suppose that and are independent, real-valued random variables.
If $X$ and $Y$ have discrete distributions with density functions $u$ and $v$ (with respect to counting measure), then $Z = X + Y$ has density function $u * v$, the (discrete) convolution of $u$ and $v$, defined by $(u * v)(z) = \sum_{x} u(x) \, v(z - x)$ for $z \in \mathbb{R}$.
If $X$ and $Y$ have continuous distributions with density functions $u$ and $v$ (with respect to Lebesgue measure), then $Z = X + Y$ has density function $u * v$, the (continuous) convolution of $u$ and $v$, defined by $(u * v)(z) = \int_{\mathbb{R}} u(x) \, v(z - x) \, dx$ for $z \in \mathbb{R}$.
Details:
Both results follow from the previous convolution results, but it's also instructive to give direct proofs.
Using independence and simple rules of probability,
Of course, the sum is actually over the countable set $\{x: u(x) > 0 \text{ and } v(z - x) > 0\}$.
We can use the linear transformation in [11]. The joint density of $(X, Y)$ is $(x, y) \mapsto u(x) \, v(y)$. The linear transformation we need is $(z, w) = (x + y, y)$, with inverse $(x, y) = (z - w, w)$. The Jacobian of the inverse transformation is 1. Hence the density function of $(Z, W)$ is $(z, w) \mapsto u(z - w) \, v(w)$. Integrating the joint density over $w$ gives the density of $Z$.
The formulas are simple and elegant, but gloss over domain complications. Suppose that $u$ and $v$ have support sets $R$ and $S$, respectively. In part (a), as noted in the proof, $R$ and $S$ are countable and the sum is actually over the countable set $\{x \in R: z - x \in S\}$. In part (b), $R$ and $S$ are typically intervals, and again, the integration is over the set $\{x \in R: z - x \in S\}$. Determining this set for each possible value of $z$ is often the most challenging part. However, when the variables are nonnegative, the complications are much reduced.
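In code, the domain complications in the discrete case disappear if we iterate over the two supports directly rather than fixing $z$ and solving for the valid $x$. A minimal sketch (function name and dict representation are mine):

```python
from collections import defaultdict

def convolve(u, v):
    """Discrete convolution: pdf of X + Y for independent X, Y with pdfs u, v (dicts)."""
    w = defaultdict(float)
    for x, p in u.items():
        for y, q in v.items():
            w[x + y] += p * q
    return dict(w)

# Example: sum of two fair dice
die = {k: 1 / 6 for k in range(1, 7)}
two_dice = convolve(die, die)
# two_dice[7] is 6/36 = 1/6, the familiar peak of the triangular distribution
```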
Suppose again that and are independent random variables.
If $X$ and $Y$ have discrete distributions on $\mathbb{N}$, with probability density functions $u$ and $v$, then $Z = X + Y$ has probability density function $u * v$ given by $(u * v)(z) = \sum_{x = 0}^{z} u(x) \, v(z - x)$ for $z \in \mathbb{N}$.
If $X$ and $Y$ have continuous distributions on $[0, \infty)$ with probability density functions $u$ and $v$, then $Z = X + Y$ has probability density function $u * v$ given by $(u * v)(z) = \int_0^z u(x) \, v(z - x) \, dx$ for $z \in [0, \infty)$.
Details:
In this case, $\{x \in \mathbb{N}: z - x \in \mathbb{N}\} = \{0, 1, \ldots, z\}$ for $z \in \mathbb{N}$.
In this case, $\{x \in [0, \infty): z - x \in [0, \infty)\} = [0, z]$ for $z \in [0, \infty)$.
Convolution of density functions, like the two other forms of convolution, is commutative and associative.
Suppose that , , and are probability density functions of the same type. Then
We have now seen three types of convolution—one for probability measures, one for distribution functions, and one for probability density functions. One of the takeaways is that convolution means different things depending on the type of objects involved. Moreover, convolution is a very important mathematical operation that occurs in areas of mathematics outside of probability.
Suppose that $X$ is a real-valued random variable and that $(X_1, X_2, \ldots)$ is a sequence of independent copies of $X$. Let $Y_n = \sum_{i=1}^n X_i$ for $n \in \mathbb{N}_+$.
If $\mu$ denotes the probability distribution of $X$ then $Y_n$ has distribution $\mu^{*n}$, the convolution power of $\mu$ of order $n$.
If $F$ denotes the distribution function of $X$ then $Y_n$ has distribution function $F^{*n}$, the convolution power of $F$ of order $n$.
If $X$ has either a discrete or continuous distribution with probability density function $f$ then $Y_n$ has density function $f^{*n}$, the convolution power of $f$ of order $n$.
In statistical terms, $(X_1, X_2, \ldots, X_n)$ corresponds to sampling from the distribution of $X$. When appropriately scaled and centered, the distribution of $Y_n$ converges to the standard normal distribution as $n \to \infty$. The precise statement of this result is the central limit theorem, one of the fundamental theorems of probability.
Products and Quotients
While not as important as sums, products and quotients of independent, real-valued random variables also occur frequently. We will limit our discussion to positive random variables with continuous distributions, the most important case.
Suppose that $X$ and $Y$ are independent and have continuous distributions on $(0, \infty)$ with probability density functions $f$ and $g$, respectively.
Random variable $V = X Y$ has a continuous distribution on $(0, \infty)$ with probability density function $v$ given by $v(z) = \int_0^\infty f(x) \, g(z / x) \frac{1}{x} \, dx$ for $z \in (0, \infty)$.
Random variable $W = Y / X$ has a continuous distribution on $(0, \infty)$ with probability density function $w$ given by $w(z) = \int_0^\infty f(x) \, g(x z) \, x \, dx$ for $z \in (0, \infty)$.
Details:
First recall that $(X, Y)$ has probability density function $(x, y) \mapsto f(x) \, g(y)$ on $(0, \infty)^2$. We introduce an auxiliary variable so that we have bivariate transformations and can use our change of variables formula [9].
We have the transformation $u = x$, $v = x y$, and so the inverse transformation is $x = u$, $y = v / u$. Hence
the Jacobian of the inverse transformation is $\det \begin{pmatrix} 1 & 0 \\ -v/u^2 & 1/u \end{pmatrix} = \frac{1}{u}$. Using the change of variables theorem, the density function of $(U, V)$ is $(u, v) \mapsto f(u) \, g(v / u) \frac{1}{u}$. Hence the density of $V$ is $v(z) = \int_0^\infty f(u) \, g(z / u) \frac{1}{u} \, du$ for $z \in (0, \infty)$.
We have the transformation $u = x$, $w = y / x$, and so the inverse transformation is $x = u$, $y = u w$. Hence
the Jacobian of the inverse transformation is $\det \begin{pmatrix} 1 & 0 \\ w & u \end{pmatrix} = u$. Using the change of variables formula, the density function of $(U, W)$ is $(u, w) \mapsto f(u) \, g(u w) \, u$. Hence the density function of $W$ is $w(z) = \int_0^\infty f(u) \, g(u z) \, u \, du$ for $z \in (0, \infty)$.
If $f$ is supported on a subset $R \subseteq (0, \infty)$ and $g$ is supported on a subset $S \subseteq (0, \infty)$, then for a given $z$, the integral in (a) is over $\{x \in R: z / x \in S\}$, and for a given $z$, the integral in (b) is over $\{x \in R: x z \in S\}$. As with convolution, determining the domain of integration is often the most challenging step.
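A quick numerical sketch of the product formula (example and names are mine): if $X$ and $Y$ are independent and uniform on $(0, 1)$, the product density reduces to $v(z) = \int_z^1 \frac{1}{x} \, dx = -\ln z$ for $0 < z < 1$, where the lower limit $z$ comes from requiring $z / x \in (0, 1)$. The code evaluates the integral with a midpoint rule and compares with $-\ln z$.

```python
import math

# Product of independent Uniform(0,1) variables: V = X Y has density
#   v(z) = integral over x in [z, 1] of (1/x) dx = -ln(z),  0 < z < 1.
def v_numeric(z, n=100_000):
    """Midpoint-rule approximation of the integral of 1/x over [z, 1]."""
    h = (1 - z) / n
    return sum(1 / (z + (i + 0.5) * h) for i in range(n)) * h

z = 0.3
assert abs(v_numeric(z) - (-math.log(z))) < 1e-6
```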
Minimum and Maximum
Suppose that $n \in \mathbb{N}_+$ and that $(X_1, X_2, \ldots, X_n)$ is a sequence of independent real-valued random variables. The minimum and maximum transformations $U = \min\{X_1, X_2, \ldots, X_n\}, \quad V = \max\{X_1, X_2, \ldots, X_n\}$
are very important in a number of applications. For example, recall that in the standard model of structural reliability, a system consists of $n$ components that operate independently. Suppose that $X_i$ represents the lifetime of component $i$. Then $U$ is the lifetime of the series system, which operates if and only if each component is operating. Similarly, $V$ is the lifetime of the parallel system, which operates if and only if at least one component is operating.
A particularly important special case occurs when the random variables are identically distributed, in addition to being independent. In this case, the sequence of variables is a random sample of size $n$ from the common distribution. The minimum and maximum variables are the extreme examples of order statistics. Our first result is in terms of distribution functions, and so applies without regard to the type of the distributions.
Suppose that $(X_1, X_2, \ldots, X_n)$ is a sequence of independent real-valued random variables and that $X_i$ has distribution function $F_i$ for $i \in \{1, 2, \ldots, n\}$.
$V = \max\{X_1, X_2, \ldots, X_n\}$ has distribution function $H$ given by $H(v) = \prod_{i=1}^n F_i(v)$ for $v \in \mathbb{R}$.
$U = \min\{X_1, X_2, \ldots, X_n\}$ has distribution function $G$ given by $G(u) = 1 - \prod_{i=1}^n \left[1 - F_i(u)\right]$ for $u \in \mathbb{R}$.
Details:
Note that since $V$ is the maximum of the variables, $\{V \le v\} = \{X_1 \le v, X_2 \le v, \ldots, X_n \le v\}$. Hence by independence, $H(v) = \mathbb{P}(V \le v) = \prod_{i=1}^n \mathbb{P}(X_i \le v) = \prod_{i=1}^n F_i(v)$ for $v \in \mathbb{R}$.
Note that since $U$ is the minimum of the variables, $\{U > u\} = \{X_1 > u, X_2 > u, \ldots, X_n > u\}$. Hence by independence, $G(u) = \mathbb{P}(U \le u) = 1 - \mathbb{P}(U > u) = 1 - \prod_{i=1}^n \left[1 - F_i(u)\right]$ for $u \in \mathbb{R}$.
From part (a), note that the product of $n$ distribution functions is another distribution function. From part (b), the product of $n$ right-tail distribution functions is a right-tail distribution function. In the reliability setting, where the random variables are nonnegative, the last statement means that the product of $n$ reliability functions is another reliability function. If $X_i$ has a continuous distribution with probability density function $f_i$ for each $i$, then $U$ and $V$ also have continuous distributions, and their probability density functions can be obtained by differentiating the distribution functions in parts (a) and (b) of [21]. The computations are straightforward using the product rule for derivatives, but the results are a bit of a mess.
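The two product formulas translate directly into code. A minimal sketch (function names and the Uniform(0,1) example are mine): for three independent Uniform(0,1) variables, $\mathbb{P}(V \le 1/2) = (1/2)^3 = 1/8$ and $\mathbb{P}(U \le 1/2) = 1 - (1/2)^3 = 7/8$.

```python
import math

# CDFs of the max and min of independent variables with CDFs F_1, ..., F_n:
#   P(max <= x) = prod F_i(x)
#   P(min <= x) = 1 - prod (1 - F_i(x))
def cdf_max(cdfs, x):
    return math.prod(F(x) for F in cdfs)

def cdf_min(cdfs, x):
    return 1 - math.prod(1 - F(x) for F in cdfs)

uniform_cdf = lambda x: min(max(x, 0.0), 1.0)   # Uniform(0,1) distribution function
cdfs = [uniform_cdf] * 3
```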
The formulas in [21] are particularly nice when the random variables are identically distributed, in addition to being independent.
Suppose that $(X_1, X_2, \ldots, X_n)$ is a sequence of independent real-valued random variables, with common distribution function $F$.
$V = \max\{X_1, X_2, \ldots, X_n\}$ has distribution function $H$ given by $H(v) = F^n(v)$ for $v \in \mathbb{R}$.
$U = \min\{X_1, X_2, \ldots, X_n\}$ has distribution function $G$ given by $G(u) = 1 - \left[1 - F(u)\right]^n$ for $u \in \mathbb{R}$.
In particular, it follows that a positive integer power of a distribution function is a distribution function. More generally, it's easy to see that every positive power of a distribution function is a distribution function. How could we construct a non-integer power of a distribution function in a probabilistic way?
Suppose that $(X_1, X_2, \ldots, X_n)$ is a sequence of independent real-valued random variables, with a common continuous distribution that has probability density function $f$.
$V = \max\{X_1, X_2, \ldots, X_n\}$ has probability density function $h$ given by $h(v) = n F^{n-1}(v) f(v)$ for $v \in \mathbb{R}$.
$U = \min\{X_1, X_2, \ldots, X_n\}$ has probability density function $g$ given by $g(u) = n \left[1 - F(u)\right]^{n-1} f(u)$ for $u \in \mathbb{R}$.
Coordinate Systems
For our next discussion, we will consider transformations that correspond to common distance-angle based coordinate systems—polar coordinates in the plane, and cylindrical and spherical coordinates in 3-dimensional space. First, for $(x, y) \in \mathbb{R}^2$, let $(r, \theta)$ denote the standard polar coordinates corresponding to the Cartesian coordinates $(x, y)$, so that $r \in [0, \infty)$ is the radial distance and $\theta \in [0, 2\pi)$ is the polar angle.
Polar coordinates. Stover, Christopher and Weisstein, Eric W. "Polar Coordinates." From MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/PolarCoordinates.html
It's best to give the inverse transformation: $x = r \cos \theta$, $y = r \sin \theta$. As we all know from calculus, the Jacobian of the transformation is $r$. Hence the following result is an immediate consequence of our change of variables theorem [9]:
Suppose that $(X, Y)$ has a continuous distribution on $\mathbb{R}^2$ with probability density function $f$, and that $(R, \Theta)$ are the polar coordinates of $(X, Y)$. Then $(R, \Theta)$ has probability density function $g$ given by $g(r, \theta) = f(r \cos \theta, r \sin \theta) \, r$ for $(r, \theta) \in [0, \infty) \times [0, 2\pi)$.
Next, for $(x, y, z) \in \mathbb{R}^3$, let $(r, \theta, z)$ denote the standard cylindrical coordinates, so that $(r, \theta)$ are the standard polar coordinates of $(x, y)$ as above, and coordinate $z$ is left unchanged. Given our previous result, the one for cylindrical coordinates should come as no surprise.
Suppose that $(X, Y, Z)$ has a continuous distribution on $\mathbb{R}^3$ with probability density function $f$, and that $(R, \Theta, Z)$ are the cylindrical coordinates of $(X, Y, Z)$. Then $(R, \Theta, Z)$ has probability density function $g$ given by $g(r, \theta, z) = f(r \cos \theta, r \sin \theta, z) \, r$ for $(r, \theta, z) \in [0, \infty) \times [0, 2\pi) \times \mathbb{R}$.
Finally, for $(x, y, z) \in \mathbb{R}^3$, let $(\rho, \theta, \phi)$ denote the standard spherical coordinates corresponding to the Cartesian coordinates $(x, y, z)$, so that $\rho \in [0, \infty)$ is the radial distance, $\theta \in [0, 2\pi)$ is the azimuth angle, and $\phi \in [0, \pi]$ is the polar angle. (In spite of our use of the word standard, different notations and conventions are used in different subjects.)
Spherical coordinates, By Dmcq—Own work, CC BY-SA 3.0, Wikipedia
Once again, it's best to give the inverse transformation: $x = \rho \sin \phi \cos \theta$, $y = \rho \sin \phi \sin \theta$, $z = \rho \cos \phi$. As we remember from calculus, the absolute value of the Jacobian is $\rho^2 \sin \phi$. Hence the following result is an immediate consequence of the change of variables theorem in [9]:
Suppose that $(X, Y, Z)$ has a continuous distribution on $\mathbb{R}^3$ with probability density function $f$, and that $(R, \Theta, \Phi)$ are the spherical coordinates of $(X, Y, Z)$. Then $(R, \Theta, \Phi)$ has probability density function $g$ given by $g(\rho, \theta, \phi) = f(\rho \sin \phi \cos \theta, \rho \sin \phi \sin \theta, \rho \cos \phi) \, \rho^2 \sin \phi$ for $(\rho, \theta, \phi) \in [0, \infty) \times [0, 2\pi) \times [0, \pi]$.
Sign and Absolute Value
Our next discussion concerns the sign and absolute value of a real-valued random variable.
Suppose that $X$ has a continuous distribution on $\mathbb{R}$ with distribution function $F$ and probability density function $f$.
$Y = |X|$ has distribution function $G$ given by $G(y) = F(y) - F(-y)$ for $y \in [0, \infty)$.
$Y = |X|$ has probability density function $g$ given by $g(y) = f(y) + f(-y)$ for $y \in [0, \infty)$.
Details:
$G(y) = \mathbb{P}(|X| \le y) = \mathbb{P}(-y \le X \le y) = F(y) - F(-y)$ for $y \in [0, \infty)$.
This follows from part (a) by taking derivatives with respect to $y$.
Recall that the sign function on $\mathbb{R}$ (not to be confused, of course, with the sine function) is defined as follows: $\operatorname{sgn}(x) = -1$ for $x < 0$, $\operatorname{sgn}(x) = 0$ for $x = 0$, and $\operatorname{sgn}(x) = 1$ for $x > 0$.
Suppose again that $X$ has a continuous distribution on $\mathbb{R}$ with distribution function $F$ and probability density function $f$, and suppose in addition that the distribution of $X$ is symmetric about 0. Then
$Y = |X|$ has distribution function $G$ given by $G(y) = 2 F(y) - 1$ for $y \in [0, \infty)$.
$Y = |X|$ has probability density function $g$ given by $g(y) = 2 f(y)$ for $y \in [0, \infty)$.
$\operatorname{sgn}(X)$ is uniformly distributed on $\{-1, 1\}$.
$|X|$ and $\operatorname{sgn}(X)$ are independent.
Details:
This follows from the previous theorem, since $F(-y) = 1 - F(y)$ for $y \in [0, \infty)$ by symmetry.
This follows from part (a) by taking derivatives.
Note that and so also.
If then
Examples and Applications
This subsection contains computational exercises, many of which involve special parametric families of distributions. It is always interesting when a random variable from one parametric family can be transformed into a variable from another family. It is also interesting when a parametric family is closed or invariant under some transformation on the variables in the family. Often, such properties are what make the parametric families special in the first place. Please note these properties when they occur.
Dice
Recall that a standard die is an ordinary 6-sided die, with faces labeled from 1 to 6 (usually in the form of dots). A fair die is one in which the faces are equally likely. An ace-six flat die is a standard die in which faces 1 and 6 occur with probability $\frac{1}{4}$ each and the other faces with probability $\frac{1}{8}$ each.
Suppose that two six-sided dice are rolled and the sequence of scores $(X_1, X_2)$ is recorded. Find the probability density function of $Y = X_1 + X_2$, the sum of the scores, in each of the following cases:
Both dice are standard and fair.
Both dice are ace-six flat.
The first die is standard and fair, and the second is ace-six flat.
The dice are both fair, but the first die has faces labeled 1, 2, 2, 3, 3, 4 and the second die has faces labeled 1, 3, 4, 5, 6, 8.
Details:
Let $Y = X_1 + X_2$ denote the sum of the scores.
For (a), with both dice standard and fair, $f(y)$ for $y = 2, 3, \ldots, 12$ is
$\frac{1}{36}, \frac{2}{36}, \frac{3}{36}, \frac{4}{36}, \frac{5}{36}, \frac{6}{36}, \frac{5}{36}, \frac{4}{36}, \frac{3}{36}, \frac{2}{36}, \frac{1}{36}$.
For (b), with both dice ace-six flat, $f(y)$ for $y = 2, 3, \ldots, 12$ is
$\frac{4}{64}, \frac{4}{64}, \frac{5}{64}, \frac{6}{64}, \frac{7}{64}, \frac{12}{64}, \frac{7}{64}, \frac{6}{64}, \frac{5}{64}, \frac{4}{64}, \frac{4}{64}$.
For (c), with the first die standard and fair and the second ace-six flat, $f(y)$ for $y = 2, 3, \ldots, 12$ is
$\frac{2}{48}, \frac{3}{48}, \frac{4}{48}, \frac{5}{48}, \frac{6}{48}, \frac{8}{48}, \frac{6}{48}, \frac{5}{48}, \frac{4}{48}, \frac{3}{48}, \frac{2}{48}$.
The distribution is the same as for two standard, fair dice in (a).
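The surprising claim in part (d) is easy to verify exactly in code (this labeling of fair dice is often called the Sicherman dice; the function name and the list-of-pairs representation are my own):

```python
from collections import Counter
from fractions import Fraction

def sum_pdf(die1, die2):
    """Exact pdf of the sum of two independent dice, each a list of (face, prob) pairs."""
    pdf = Counter()
    for f1, p1 in die1:
        for f2, p2 in die2:
            pdf[f1 + f2] += p1 * p2
    return dict(pdf)

p = Fraction(1, 6)
standard = [(k, p) for k in range(1, 7)]
# part (d): fair dice with relabeled faces (the Sicherman labeling)
d1 = [(k, p) for k in [1, 2, 2, 3, 3, 4]]
d2 = [(k, p) for k in [1, 3, 4, 5, 6, 8]]

# the sum has exactly the same distribution as for two standard, fair dice
assert sum_pdf(d1, d2) == sum_pdf(standard, standard)
```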
In the dice experiment, select two dice and select the sum random variable. Run the simulation 1000 times and compare the empirical density function to the probability density function for each of the following cases:
fair dice
ace-six flat dice
Suppose that standard, fair dice are rolled. Find the probability density function of the following variables:
the minimum score
the maximum score.
Details:
Let $U$ denote the minimum score and $V$ the maximum score.
In the dice experiment, select fair dice and select each of the following random variables. Vary with the scrollbar and note the shape of the density function. With , run the simulation 1000 times and note the agreement between the empirical density function and the probability density function.
minimum score
maximum score.
Uniform Distributions
Recall again that for $n \in \mathbb{N}_+$, $\mathbb{R}^n$ is given the Borel $\sigma$-algebra and $\lambda_n$ denotes Lebesgue measure on $\mathbb{R}^n$. So $\lambda_1$ is length for sets in $\mathbb{R}$, $\lambda_2$ is area for sets in $\mathbb{R}^2$, $\lambda_3$ is volume for sets in $\mathbb{R}^3$, and in general, $\lambda_n$ is $n$-dimensional volume for sets in $\mathbb{R}^n$. If $S \subseteq \mathbb{R}^n$ with $0 < \lambda_n(S) < \infty$ then the uniform distribution on $S$ is the continuous distribution with constant probability density function $f$ defined by $f(x) = 1 / \lambda_n(S)$ for $x \in S$.
Let . Find the probability density function of and sketch the graph in each of the following cases:
is uniformly distributed on the interval .
is uniformly distributed on the interval .
is uniformly distributed on the interval .
Details:
Compare the distributions in [33]. In part (c), note that even a simple transformation of a simple distribution can produce a complicated distribution. In this particular case, the complexity is caused by the fact that the transformation is one-to-one on part of the domain and two-to-one on the other part.
On the other hand, the uniform distribution is preserved under a linear transformation of the random variable.
Suppose that $S \subseteq \mathbb{R}^n$ with $0 < \lambda_n(S) < \infty$ and that $X$ has the uniform distribution on $S$. Let $Y = a + B X$, where $a \in \mathbb{R}^n$ and $B$ is an invertible $n \times n$ matrix. Then $Y$ is uniformly distributed on $T = \{a + B x: x \in S\}$.
Details:
This follows directly from the general result on linear transformations in [11]. Note that the density function of is constant on .
For the following three exercises, recall that the standard uniform distribution is the uniform distribution on the interval $[0, 1]$.
Suppose that $X$ and $Y$ are independent and that each has the standard uniform distribution. Let , , , . Find the probability density function of each of the following:
Details:
for in the square region with vertices . So is uniformly distributed on .
for
Suppose that , , and are independent, and that each has the standard uniform distribution. Find the probability density function of .
Details:
for in the rectangular region with vertices . So is uniformly distributed on .
Suppose that is a sequence of independent random variables, each with the standard uniform distribution. Find the distribution function and probability density function of the following variables.
Details:
and , both for
and , both for
Both distributions in [37] are beta distributions. More generally, all of the order statistics from a random sample of standard uniform variables have beta distributions, one of the reasons for the importance of this family of distributions.
Set (this gives the minimum ). Vary with the scrollbar and note the shape of the probability density function. With , run the simulation 1000 times and note the agreement between the empirical density function and the true probability density function.
Vary with the scrollbar, set each time (this gives the maximum ), and note the shape of the probability density function. With run the simulation 1000 times and compare the empirical density function and the probability density function.
Let $f$ denote the probability density function of the standard uniform distribution.
Compute $f^{*2} = f * f$.
Compute $f^{*3} = f^{*2} * f$.
Graph $f$, $f^{*2}$, and $f^{*3}$ on the same set of axes.
Details:
In [39], you can see the behavior predicted by the central limit theorem beginning to emerge. Recall that if $(U_1, U_2, \ldots)$ is a sequence of independent random variables, each with the standard uniform distribution, then $f$, $f^{*2}$, and $f^{*3}$ are the probability density functions of $U_1$, $U_1 + U_2$, and $U_1 + U_2 + U_3$, respectively. More generally, if $(U_1, U_2, \ldots)$ is a sequence of independent random variables, each with the standard uniform distribution, then the distribution of $\sum_{i=1}^n U_i$ (which has probability density function $f^{*n}$) is known as the Irwin-Hall distribution with parameter $n$.
Open the Special Distribution Simulator and select the Irwin-Hall distribution. Vary the parameter $n$ from 1 to 3 and note the shape of the probability density function. (These are the density functions in [39].) For each value of $n$, run the simulation 1000 times and compare the empirical density function and the probability density function.
Simulations
A remarkable fact is that the standard uniform distribution can be transformed into almost any other distribution on . This is particularly important for simulations, since many computer languages have an algorithm for generating random numbers, which are simulations of independent variables, each with the standard uniform distribution. Conversely, any continuous distribution supported on an interval of can be transformed into the standard uniform distribution.
Suppose first that $F$ is a distribution function for a distribution on $\mathbb{R}$ (which may be discrete, continuous, or mixed), and let $F^{-1}$ denote the quantile function.
Suppose that $U$ has the standard uniform distribution. Then $X = F^{-1}(U)$ has distribution function $F$.
Details:
The critical property satisfied by the quantile function (regardless of the type of distribution) is $F^{-1}(u) \le x$ if and only if $u \le F(x)$ for $u \in (0, 1)$ and $x \in \mathbb{R}$. Hence for $x \in \mathbb{R}$, $\mathbb{P}\left[F^{-1}(U) \le x\right] = \mathbb{P}[U \le F(x)] = F(x)$.
Assuming that we can compute $F^{-1}$, the previous exercise shows how we can simulate a distribution with distribution function $F$. To rephrase the result, we can simulate a variable with distribution function $F$ by simply computing a random quantile. Most of the apps in this project use this method of simulation. The first image below shows the graph of the distribution function of a rather complicated mixed distribution, represented in blue on the horizontal axis. In the second image, note how the uniform distribution on $(0, 1)$, represented by the thick red line, is transformed, via the quantile function, into the given distribution.
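The random quantile method is a one-liner once the quantile function is known. A minimal sketch using the exponential distribution as the example (the example and function names are mine): the exponential quantile function is $F^{-1}(u) = -\ln(1 - u) / r$, so feeding it random numbers produces exponential variates.

```python
import math
import random

# Random quantile method: if U ~ Uniform(0,1), then F^{-1}(U) has distribution function F.
# Example: Exponential with rate r, where F(x) = 1 - exp(-r x) and
# F^{-1}(u) = -ln(1 - u) / r.
def exp_quantile(u, r=1.0):
    return -math.log(1 - u) / r

def exp_cdf(x, r=1.0):
    return 1 - math.exp(-r * x)

# sanity check: F(F^{-1}(u)) = u for any u in (0, 1)
for u in (0.1, 0.5, 0.9):
    assert abs(exp_cdf(exp_quantile(u)) - u) < 1e-12

# simulate 5 exponential values from 5 random numbers
random.seed(0)
sample = [exp_quantile(random.random()) for _ in range(5)]
```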
The random quantile method of simulation
There is a partial converse to [41], for continuous distributions.
Suppose that $X$ has a continuous distribution on an interval $S \subseteq \mathbb{R}$, with distribution function $F$. Then $U = F(X)$ has the standard uniform distribution.
Details:
For $u \in (0, 1)$, recall that $F^{-1}(u)$ is a quantile of order $u$. Since $X$ has a continuous distribution, $\mathbb{P}(U \le u) = \mathbb{P}[F(X) \le u] = \mathbb{P}\left[X \le F^{-1}(u)\right] = F\left[F^{-1}(u)\right] = u$.
Hence $U$ is uniformly distributed on $(0, 1)$.
Show how to simulate the uniform distribution on the interval $[a, b]$ with a random number. Using your calculator, simulate 5 values from the uniform distribution on the interval $[2, 10]$.
Details:
$X = a + (b - a) U$ where $U$ is a random number.
Beta Distributions
Suppose that has the probability density function given by for . Find the probability density function of each of the following:
Details:
, for
for
for
Random variables , , and in [44] have beta distributions, the same family of distributions that we saw in [37] for the minimum and maximum of independent standard uniform variables. In general, beta distributions are widely used to model random proportions and probabilities, as well as physical quantities that take values in closed bounded intervals (which after a change of units can be taken to be ). On the other hand, has a Pareto distribution, named for Vilfredo Pareto.
Suppose that the radius of a sphere has a beta distribution probability density function given by for . Find the probability density function of each of the following:
The circumference
The surface area
The volume
Details:
for
for
for
Suppose that the grades on a test are described by the random variable where has the beta distribution with probability density function given by for . The grades are generally low, so the teacher decides to curve the grades using the transformation . Find the probability density function of
Details:
for .
for
Bernoulli Trials
Recall that a Bernoulli trials sequence is a sequence $(X_1, X_2, \ldots)$ of independent, identically distributed indicator random variables. In the usual terminology of reliability theory, $X_i = 0$ means failure on trial $i$, while $X_i = 1$ means success on trial $i$. The basic parameter of the process is the probability of success $p = \mathbb{P}(X_i = 1)$, so $p \in [0, 1]$. The random process is named for Jacob Bernoulli.
For $i \in \mathbb{N}_+$, the probability density function $f$ of the trial variable $X_i$ is $f(x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
Details:
By definition, $f(1) = \mathbb{P}(X_i = 1) = p$ and $f(0) = \mathbb{P}(X_i = 0) = 1 - p$. These can be combined succinctly with the formula $f(x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
Now let $Y_n = \sum_{i=1}^n X_i$ denote the number of successes in the first $n$ trials, so that $Y_n$ takes values in $\{0, 1, \ldots, n\}$ for $n \in \mathbb{N}_+$.
has the probability density function given by
Details:
We have seen this derivation before. The number of bit strings of length with 1 occurring exactly times is for . By the Bernoulli trials assumptions, the probability of each such bit string is .
Part (a) can be proved directly from the definition of convolution, but the result also follows simply from the fact that .
From part (b) it follows that if and are independent variables, has the binomial distribution with parameters and , and has the binomial distribution with parameters and , then has the binomial distribution with parameters and .
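The binomial density above, and its closure under sums of independent variables, can be verified numerically. A small sketch (the parameter values 4, 6, and 0.3 are illustrative choices, not taken from the text):

```python
from math import comb

def binomial_pdf(n, p, k):
    """Probability of exactly k successes in n Bernoulli trials with
    success parameter p: C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def convolution(n1, n2, p, k):
    """Density of X + Y at k, where X is binomial (n1, p) and
    Y is binomial (n2, p), independent."""
    return sum(binomial_pdf(n1, p, j) * binomial_pdf(n2, p, k - j)
               for j in range(max(0, k - n2), min(n1, k) + 1))

# The convolution agrees with the binomial (n1 + n2, p) density:
err = max(abs(convolution(4, 6, 0.3, k) - binomial_pdf(10, 0.3, k))
          for k in range(11))
```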
Find the probability density function of the difference between the number of successes and the number of failures in Bernoulli trials with success parameter
Details:
for
The Poisson Distribution
Recall that the Poisson distribution with parameter has probability density function given by
This distribution is named for Simeon Poisson and is widely used to model the number of random points in a region of time or space; the parameter is proportional to the size of the region.
If then .
Details:
Let . Using the definition of convolution and the binomial theorem we have
The last result means that if and are independent variables, and has the Poisson distribution with parameter while has the Poisson distribution with parameter , then has the Poisson distribution with parameter . In terms of the Poisson model, could represent the number of points in a region and the number of points in a region (of the appropriate sizes so that the parameters are and respectively). The independence of and corresponds to the regions and being disjoint. Then is the number of points in .
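The closure property can likewise be checked numerically by convolving two Poisson densities. A small sketch (the parameter values 2 and 3 are illustrative choices):

```python
from math import exp, factorial

def poisson_pdf(theta, n):
    """Poisson density at n: e^(-theta) * theta^n / n!"""
    return exp(-theta) * theta**n / factorial(n)

def convolution(a, b, n):
    """Density of X + Y at n, where X is Poisson(a) and Y is
    Poisson(b), independent: sum over the ways to split n."""
    return sum(poisson_pdf(a, k) * poisson_pdf(b, n - k)
               for k in range(n + 1))

# The convolution agrees with the Poisson(a + b) density:
err = max(abs(convolution(2.0, 3.0, n) - poisson_pdf(5.0, n))
          for n in range(20))
```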
The Exponential Distribution
Recall that the exponential distribution with rate parameter has probability density function given by for . This distribution is often used to model random times such as failure times and lifetimes. In particular, the times between arrivals in the Poisson model of random points in time have independent, identically distributed exponential distributions.
Show how to simulate, with a random number, the exponential distribution with rate parameter . Using your calculator, simulate 5 values from the exponential distribution with parameter .
Details:
where is a random number. Since is also a random number, a simpler solution is .
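As a code sketch of this random quantile simulation (the rate value 2 used for the sample values is an illustrative choice):

```python
import math
import random

def sim_exponential(r, u=None):
    """Simulate the exponential distribution with rate parameter r
    via X = -ln(U) / r, where U is a random number."""
    if u is None:
        u = 1.0 - random.random()  # in (0, 1], avoids log(0)
    return -math.log(u) / r

# Five simulated values with a sample rate parameter, here r = 2:
values = [sim_exponential(2) for _ in range(5)]
```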
For [53] next, recall that the floor and ceiling functions on are defined by
Suppose that has the exponential distribution with rate parameter . Find the probability density function of each of the following random variables:
Details:
for
for
Note that the distributions in [53] are geometric distributions on and on , respectively. In many respects, the geometric distribution is a discrete version of the exponential distribution.
Suppose that has the exponential distribution with rate parameter . Find the probability density function of each of the following random variables:
Suppose that and are independent random variables, each having the exponential distribution with parameter 1. Let .
Find the distribution function of .
Find the probability density function of .
Details:
Suppose that has the exponential distribution with rate parameter , has the exponential distribution with rate parameter , and that and are independent. Find the probability density function of in each of the following cases.
Details:
for
for
Suppose that is a sequence of independent random variables, and that has the exponential distribution with rate parameter for each .
Find the probability density function of .
Find the distribution function of .
Find the probability density function of in the special case that for each .
Details:
for where
for
for
Note that the minimum in part (a) of [57] has the exponential distribution with parameter . In particular, suppose that a series system has independent components, each with an exponentially distributed lifetime. Then the lifetime of the system is also exponentially distributed, and the failure rate of the system is the sum of the component failure rates.
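The additivity of failure rates is easy to check by simulation. A small sketch, with illustrative component rates 1, 2, 3 (so the system rate should be 6 and the mean system lifetime about 1/6):

```python
import math
import random

random.seed(42)

# Series system: the system lifetime is the minimum of the component
# lifetimes. With sample rates 1, 2, 3 the system rate should be
# 1 + 2 + 3 = 6, so the mean system lifetime should be about 1/6.
rates = [1.0, 2.0, 3.0]
n = 100_000
total = 0.0
for _ in range(n):
    total += min(-math.log(1.0 - random.random()) / r for r in rates)
mean_lifetime = total / n
```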
Set (this gives the minimum ). Vary with the scrollbar and note the shape of the probability density function. With , run the simulation 1000 times and compare the empirical density function and the probability density function.
Vary with the scrollbar and set each time (this gives the maximum ). Note the shape of the density function. With , run the simulation 1000 times and compare the empirical density function and the probability density function.
Suppose again that is a sequence of independent random variables, and that has the exponential distribution with rate parameter for each . Then
Details:
When , the result was shown in the section on joint distributions. Returning to the case of general , note that for all if and only if . Note that the minimum on the right is independent of and, by [57], has an exponential distribution with parameter .
The result in the previous exercise is very important in the theory of continuous-time Markov chains. If we have a bunch of independent alarm clocks, with exponentially distributed alarm times, then the probability that clock is the first one to sound is .
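The alarm-clock probability can likewise be estimated by simulation. A small sketch with illustrative rates 1, 2, 3, so the first clock should sound first with probability 1/(1 + 2 + 3) = 1/6:

```python
import math
import random

random.seed(7)

# Independent alarm clocks with sample rates 1, 2, 3: clock 0 should
# be the first to sound with probability 1 / (1 + 2 + 3) = 1/6.
rates = [1.0, 2.0, 3.0]
n = 100_000
count = 0
for _ in range(n):
    times = [-math.log(1.0 - random.random()) / r for r in rates]
    if times[0] == min(times):
        count += 1
estimate = count / n
```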
The Gamma Distribution
Recall that the (standard) gamma distribution with shape parameter has probability density function
With a positive integer shape parameter, as we have here, it is also referred to as the Erlang distribution, named for Agner Erlang. This distribution is widely used to model random times under certain basic assumptions. In particular, the th arrival time in the Poisson model of random points in time has the gamma distribution with parameter .
Let , and note that this is the probability density function of the exponential distribution with parameter 1, which was the topic of our last discussion.
If then
Details:
Part (a) holds trivially when . Also, for ,
Part (b) follows from (a).
Part (b) means that if has the gamma distribution with shape parameter and has the gamma distribution with shape parameter , and if and are independent, then has the gamma distribution with shape parameter . In the context of the Poisson model, part (a) means that the th arrival time is the sum of the independent interarrival times, which have a common exponential distribution.
Suppose that has the gamma distribution with shape parameter . Find the probability density function of .
Details:
for
The Pareto Distribution
Recall that the Pareto distribution with shape parameter has probability density function given by
Members of this family have already come up in several of the previous exercises. The Pareto distribution, named for Vilfredo Pareto, is a heavy-tailed distribution often used for modeling income and other financial variables.
Suppose that has the Pareto distribution with shape parameter . Find the probability density function of each of the following random variables:
Details:
for
for
for
In [62], also has a Pareto distribution but with parameter ; has the beta distribution with parameters and ; and has the exponential distribution with rate parameter .
Show how to simulate, with a random number, the Pareto distribution with shape parameter . Using your calculator, simulate 5 values from the Pareto distribution with shape parameter .
Details:
Using the random quantile method in [41], where is a random number. More simply, , since is also a random number.
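A code sketch of this simulation (the shape value 3 used for the sample values is an illustrative choice):

```python
import random

def sim_pareto(a, u=None):
    """Simulate the Pareto distribution with shape parameter a
    via X = 1 / U^(1/a), where U is a random number."""
    if u is None:
        u = 1.0 - random.random()  # in (0, 1], avoids division by zero
    return u ** (-1.0 / a)

# Five simulated values with a sample shape parameter, here a = 3:
values = [sim_pareto(3) for _ in range(5)]
```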
The Normal Distribution
Recall that the standard normal distribution has probability density function given by
Suppose that has the standard normal distribution, and that and .
Find the probability density function of
Sketch the graph of , noting the important qualitative features.
Details:
for
is symmetric about . increases and then decreases, with mode . is concave upward, then downward, then upward again, with inflection points at . as and as
Random variable has the normal distribution with location parameter and scale parameter . The normal distribution is perhaps the most important distribution in probability and mathematical statistics, primarily because of the central limit theorem, one of the fundamental theorems. It is widely used to model physical measurements of all types that are subject to small, random errors.
Suppose that has the standard normal distribution. Find the probability density function of and sketch the graph.
Suppose that and are independent random variables, each with the standard normal distribution, and let be the standard polar coordinates . Find the probability density function of
Details:
Note that the joint PDF of is
From [24], the PDF of is
From the factorization theorem for joint PDFs, it follows that has probability density function for , is uniformly distributed on , and that and are independent.
The standard normal distribution does not have a simple, closed form quantile function, so the random quantile method of simulation does not work well. However, the last exercise points the way to an alternative method of simulation.
Show how to simulate a pair of independent, standard normal variables with a pair of random numbers. Using your calculator, simulate 6 values from the standard normal distribution.
Details:
The Rayleigh distribution in the last exercise has CDF for , and hence quantile function for . Thus we can simulate the polar radius with a random number by , or a bit more simply by , since is also a random number. We can simulate the polar angle with a random number by . Then, a pair of independent, standard normal variables can be simulated by , .
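This polar-coordinate method is commonly known as the Box-Muller method. A sketch:

```python
import math
import random

def sim_standard_normal_pair(u1=None, u2=None):
    """Simulate a pair of independent standard normal variables from
    two random numbers: the polar radius is R = sqrt(-2 ln U1) and
    the polar angle is Theta = 2 pi U2 (the Box-Muller method)."""
    if u1 is None:
        u1 = 1.0 - random.random()  # in (0, 1], avoids log(0)
    if u2 is None:
        u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

# Six simulated standard normal values (three pairs):
values = [z for _ in range(3) for z in sim_standard_normal_pair()]
```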
The Cauchy Distribution
Suppose that and are independent random variables, each with the standard normal distribution. Find the probability density function of .
Details:
As usual, let denote the standard normal PDF, so that for . Using the theorem on quotients in [20] above, the PDF of is given by
Using symmetry and a simple substitution,
Suppose that a light source is 1 unit away from position 0 on an infinite straight wall. We shine the light at the wall an angle to the perpendicular, where is uniformly distributed on . Find the probability density function of the position of the light beam on the wall.
Details:
The PDF of is for . The transformation is so the inverse transformation is . Recall that , so by the change of variables formula [8], has PDF given by
Thus, also has the standard Cauchy distribution. Clearly we can simulate a value of the Cauchy distribution by where is a random number. This is the random quantile method.
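A code sketch of this random quantile simulation:

```python
import math
import random

def sim_cauchy(u=None):
    """Simulate the standard Cauchy distribution via the random
    quantile method: X = tan(pi (U - 1/2)), U a random number."""
    if u is None:
        u = random.random()
    return math.tan(math.pi * (u - 0.5))

# Five simulated values:
values = [sim_cauchy() for _ in range(5)]
```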
Open the Cauchy experiment, which is a simulation of the light problem in the previous exercise. Keep the default parameter values and run the experiment in single step mode a few times. Then run the experiment 1000 times and compare the empirical density function and the probability density function.