Dave Goldsman - gatech.edu
4. Distributions
Dave Goldsman
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology
8/5/20
ISYE 6739 — Goldsman 8/5/20 1 / 108
Outline
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
Bernoulli and Binomial Distributions
Lesson 4.1 — Bernoulli and Binomial Distributions
Goal: We’ll discuss lots of interesting distributions in this module.
The module will be a compendium of results, some of which we’ll prove, and some of which we’ve already seen previously.

Special emphasis will be placed on the Normal distribution, because it’s so important and has so many implications, including the Central Limit Theorem.
In the next few lessons we’ll discuss some important discrete distributions:
Bernoulli and Binomial Distributions
Hypergeometric Distribution
Geometric and Negative Binomial Distributions
Poisson Distribution
Definition: The Bernoulli distribution with parameter $p$ is given by

$$X = \begin{cases} 1 & \text{w.p. } p \quad (\text{``success''}) \\ 0 & \text{w.p. } q \quad (\text{``failure''}), \end{cases}$$

where $q = 1 - p$.

Recall: $E[X] = p$, $\text{Var}(X) = pq$, and $M_X(t) = pe^t + q$.

Definition: The Binomial distribution with parameters $n$ and $p$ is given by

$$P(Y = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \ldots, n.$$
Example: Toss 2 dice and take the sum; repeat 5 times. Let $Y$ be the number of 7’s you see. $Y \sim \text{Bin}(5, 1/6)$. Then, e.g.,

$$P(Y = 4) = \binom{5}{4}\left(\frac{1}{6}\right)^4\left(\frac{5}{6}\right)^{5-4}. \quad \Box$$

Theorem: $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Bern}(p) \Rightarrow Y \equiv \sum_{i=1}^n X_i \sim \text{Bin}(n, p)$.

Proof: This kind of result can easily be proved by a moment generating function uniqueness argument, as we mentioned in previous modules. $\Box$

Think of the Binomial as the number of successes from $n$ Bern($p$) trials.
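As a quick numerical sanity check (not part of the original slides), the $\text{Bin}(5, 1/6)$ probability above can be evaluated in a few lines of Python; the helper name `binom_pmf` is my own.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Bin(n, p): C(n, k) * p^k * q^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# P(Y = 4) when Y counts 7's in 5 tosses of a pair of dice (p = 1/6)
prob = binom_pmf(4, 5, 1/6)
print(prob)
```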
Theorem: $Y \sim \text{Bin}(n, p)$ implies

$$E[Y] = E\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n E[X_i] = np.$$

Similarly, $\text{Var}(Y) = npq$.

We’ve already seen that $M_Y(t) = (pe^t + q)^n$.

Theorem: Certain Binomials add up: If $Y_1, \ldots, Y_k$ are independent and $Y_i \sim \text{Bin}(n_i, p)$, then

$$\sum_{i=1}^k Y_i \sim \text{Bin}\!\left(\sum_{i=1}^k n_i,\; p\right).$$
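The additivity theorem can be checked numerically: convolving the pmfs of two independent Binomials with a common $p$ should reproduce the pmf of the combined Binomial. This is a sketch of my own, not from the slides, and the helper names are mine.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n1, n2, p = 3, 4, 0.3
f = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
g = [binom_pmf(k, n2, p) for k in range(n2 + 1)]

# pmf of Y1 + Y2 by discrete convolution
h = [sum(f[i] * g[k - i] for i in range(max(0, k - n2), min(n1, k) + 1))
     for k in range(n1 + n2 + 1)]

# should match Bin(n1 + n2, p) term by term
target = [binom_pmf(k, n1 + n2, p) for k in range(n1 + n2 + 1)]
max_diff = max(abs(x - y) for x, y in zip(h, target))
```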
Hypergeometric Distribution
Lesson 4.2 — Hypergeometric Distribution
Definition: You have $a$ objects of type 1 and $b$ objects of type 2. Select $n$ objects without replacement from the $a + b$. Let $X$ be the number of type 1’s selected. Then $X$ has the Hypergeometric distribution with pmf

$$P(X = k) = \frac{\binom{a}{k}\binom{b}{n-k}}{\binom{a+b}{n}}, \quad \max(0,\, n - b) \le k \le \min(a,\, n).$$

Example: 25 sox in a box: 15 red, 10 blue. Pick 7 without replacement.

$$P(\text{exactly 3 reds are picked}) = \frac{\binom{15}{3}\binom{10}{4}}{\binom{25}{7}}. \quad \Box$$
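The sox computation translates directly into code; a minimal plain-Python sketch using `math.comb` (the function name `hypergeom_pmf` is my own):

```python
from math import comb

def hypergeom_pmf(k, a, b, n):
    """P(X = k): choose k of the a type-1 objects and n - k of the b type-2's."""
    return comb(a, k) * comb(b, n - k) / comb(a + b, n)

# 25 sox: 15 red, 10 blue; pick 7 without replacement
prob = hypergeom_pmf(3, 15, 10, 7)
print(prob)
```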
Theorem: After some algebra, it turns out that

$$E[X] = n\left(\frac{a}{a+b}\right) \quad \text{and} \quad \text{Var}(X) = n\left(\frac{a}{a+b}\right)\left(1 - \frac{a}{a+b}\right)\left(\frac{a+b-n}{a+b-1}\right).$$

Remark: Here, $\frac{a}{a+b}$ plays the role of $p$ in the Binomial distribution. And then the corresponding $Y \sim \text{Bin}(n, p)$ results would be

$$E[Y] = n\left(\frac{a}{a+b}\right) \quad \text{and} \quad \text{Var}(Y) = n\left(\frac{a}{a+b}\right)\left(1 - \frac{a}{a+b}\right).$$

So the Binomial has the same mean as the Hypergeometric, but a slightly larger variance.
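A small numerical illustration of the variance comparison, using the sox numbers ($a = 15$, $b = 10$, $n = 7$); this check is my own, not from the slides.

```python
from math import comb

a, b, n = 15, 10, 7
p = a / (a + b)

var_bin = n * p * (1 - p)                       # sampling with replacement
var_hyp = var_bin * (a + b - n) / (a + b - 1)   # finite-population correction

# verify the closed forms directly from the Hypergeometric pmf
pmf = [comb(a, k) * comb(b, n - k) / comb(a + b, n) for k in range(n + 1)]
mean = sum(k * f for k, f in enumerate(pmf))
var = sum(k * k * f for k, f in enumerate(pmf)) - mean**2
```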
Geometric and Negative Binomial Distributions
Lesson 4.3 — Geometric and Negative Binomial Distributions
Definition: Suppose we consider an infinite sequence of independent Bern($p$) trials. Let $Z$ equal the number of trials until the first success is obtained. The event $Z = k$ corresponds to $k - 1$ failures, and then a success. Thus,

$$P(Z = k) = q^{k-1}p, \quad k = 1, 2, \ldots,$$

and we say that $Z$ has the Geometric distribution with parameter $p$.

Notation: $Z \sim \text{Geom}(p)$.

We’ll get the mean and variance of the Geometric via the mgf…
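Before the mgf algebra, here’s a quick plain-Python check (my own, not from the slides) that the Geometric pmf sums to 1:

```python
def geom_pmf(k, p):
    """P(Z = k) = q^(k-1) * p: k - 1 failures, then a success."""
    return (1 - p) ** (k - 1) * p

p = 1 / 6
# truncated sum; the neglected tail is q^500, negligible here
total = sum(geom_pmf(k, p) for k in range(1, 501))
```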
Theorem: The mgf of the Geom($p$) is

$$M_Z(t) = \frac{pe^t}{1 - qe^t}, \quad \text{for } t < \ln(1/q).$$

Proof:

$$M_Z(t) = E[e^{tZ}] = \sum_{k=1}^\infty e^{tk} q^{k-1} p = pe^t \sum_{k=0}^\infty (qe^t)^k = \frac{pe^t}{1 - qe^t}, \quad \text{for } qe^t < 1. \quad \Box$$
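The closed form can be sanity-checked against a truncated version of the series at any valid $t$; a sketch of my own (helper names are mine):

```python
from math import exp, log

def geom_mgf(t, p):
    """Closed form p e^t / (1 - q e^t), valid for t < ln(1/q)."""
    q = 1 - p
    if t >= log(1 / q):
        raise ValueError("mgf undefined for t >= ln(1/q)")
    return p * exp(t) / (1 - q * exp(t))

p, t = 0.3, 0.1                 # ln(1/0.7) is about 0.357, so t = 0.1 is in range
q = 1 - p
# truncated series for E[e^(tZ)]; (q e^t)^k decays geometrically
series = sum(exp(t * k) * q ** (k - 1) * p for k in range(1, 2000))
```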
Corollary: $E[Z] = 1/p$.

Proof:

$$E[Z] = \frac{d}{dt}M_Z(t)\Big|_{t=0} = \frac{(1 - qe^t)(pe^t) - (-qe^t)(pe^t)}{(1 - qe^t)^2}\Big|_{t=0} = \frac{pe^t}{(1 - qe^t)^2}\Big|_{t=0} = \frac{p}{(1-q)^2} = \frac{1}{p}. \quad \Box$$

Remark: We could also have proven this directly from the definition of expected value.
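The differentiation step can also be checked numerically: a central difference of the mgf at $t = 0$ should reproduce $E[Z] = 1/p$. A sketch of my own:

```python
from math import exp

def geom_mgf(t, p):
    q = 1 - p
    return p * exp(t) / (1 - q * exp(t))

p, h = 0.25, 1e-6
# central difference approximates M_Z'(0) = E[Z] = 1/p = 4
mean_est = (geom_mgf(h, p) - geom_mgf(-h, p)) / (2 * h)
```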
Similarly, after a lot of algebra,

$$E[Z^2] \;=\; \frac{d^2}{dt^2}M_Z(t)\Big|_{t=0} \;=\; \frac{2-p}{p^2},$$

and then $\mathrm{Var}(Z) = E[Z^2] - (E[Z])^2 = q/p^2$. □

Example: Toss a die repeatedly. What's the probability that we observe a 3 for the first time on the 8th toss?

Answer: The number of tosses we need is Z ∼ Geom(1/6), so with k = 8, $P(Z = 8) = q^{k-1}p = (5/6)^7(1/6)$. □

How many tosses would we expect to take?

Answer: E[Z] = 1/p = 6 tosses. □
ISYE 6739 — Goldsman 8/5/20 14 / 108
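These formulas are simple to evaluate directly. A quick Python check of the die example (our own code, not part of the lecture):

```python
p, q = 1/6, 5/6

# P(Z = 8): seven failures in a row, then a success
prob = q**7 * p
print(round(prob, 4))  # about 0.0465

# Mean and variance from the slide's formulas
mean = 1 / p     # 6 tosses expected
var = q / p**2   # q/p^2 = 30
print(mean, var)
```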
![Page 63: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/63.jpg)
Geometric and Negative Binomial Distributions
Memoryless Property of Geometric
Theorem: Suppose Z ∼ Geom(p). Then for positive integers s, t, we have
$$P(Z > s + t \mid Z > s) \;=\; P(Z > t).$$

Why is it called the Memoryless Property? If an event hasn't occurred by time s, the probability that it will occur after an additional t time units is the same as the (unconditional) probability that it will occur after time t — it forgot that it made it past time s!
ISYE 6739 — Goldsman 8/5/20 15 / 108
![Page 67: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/67.jpg)
Geometric and Negative Binomial Distributions
Proof: First of all, for any t = 0, 1, 2, . . ., the tail probability is

$$P(Z > t) \;=\; P(t \text{ Bern}(p) \text{ failures in a row}) \;=\; q^t. \quad (*)$$

Then

$$P(Z > s+t \mid Z > s) \;=\; \frac{P(Z > s+t \cap Z > s)}{P(Z > s)} \;=\; \frac{P(Z > s+t)}{P(Z > s)} \;=\; \frac{q^{s+t}}{q^s} \;\;\text{(by $(*)$)} \;=\; q^t \;=\; P(Z > t). \quad \Box$$
ISYE 6739 — Goldsman 8/5/20 16 / 108
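The memoryless property can also be checked by simulation. The sketch below (our own code, not from the slides) estimates both sides of the identity for a die, with s = 4 and t = 2:

```python
import random

rng = random.Random(1)
p, q = 1/6, 5/6

def geom():
    """One Geom(p) draw: trials until the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

n = 300_000
samples = [geom() for _ in range(n)]

s, t = 4, 2
past_s = [z for z in samples if z > s]
cond = sum(z > s + t for z in past_s) / len(past_s)  # P(Z > s+t | Z > s)
uncond = sum(z > t for z in samples) / n             # P(Z > t)
print(cond, uncond, q**t)  # all roughly (5/6)^2 ≈ 0.694
```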
![Page 74: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/74.jpg)
Geometric and Negative Binomial Distributions
Example: Let's toss a die until a 5 appears for the first time. Suppose that we've already made 4 tosses without success. What's the probability that we'll need more than 2 additional tosses before we observe a 5?

Let Z be the number of tosses required. By the Memoryless Property (with s = 4 and t = 2) and (∗), we want

$$P(Z > 6 \mid Z > 4) \;=\; P(Z > 2) \;=\; (5/6)^2. \quad \Box$$

Fun Fact: The Geom(p) is the only discrete distribution with the memoryless property.

Not-as-Fun Fact: Some books define the Geom(p) as the number of Bern(p) failures until you observe a success, so that # failures = # trials − 1. You should be aware of this inconsistency, but don't worry about it now.
ISYE 6739 — Goldsman 8/5/20 17 / 108
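The two competing conventions differ only by a shift of 1, which the following sketch makes explicit (our own illustration; `pmf_trials` and `pmf_failures` are hypothetical helper names):

```python
def pmf_trials(k, p):
    """'Number of trials' convention used in this course: support k = 1, 2, ..."""
    return (1 - p)**(k - 1) * p

def pmf_failures(j, p):
    """'Number of failures' convention used in some books: support j = 0, 1, ..."""
    return (1 - p)**j * p

# Same event under the two conventions: a first success on the 8th toss
# means exactly 7 failures came first.
print(pmf_trials(8, 1/6) == pmf_failures(7, 1/6))  # True
```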
![Page 82: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/82.jpg)
Geometric and Negative Binomial Distributions
Definition: Suppose we consider an infinite sequence of independent Bern(p) trials.

Now let W equal the number of trials until the rth success is obtained, so W = r, r + 1, . . .. The event W = k corresponds to exactly r − 1 successes by time k − 1, and then the rth success at time k.

We say that W has the Negative Binomial distribution (aka the Pascal distribution) with parameters r and p.

Example: 'FFFFSFS' corresponds to W = 7 trials until the r = 2nd success.

Notation: W ∼ NegBin(r, p).

Remark: As with the Geom(p), the exact definition of the NegBin depends on what book you're reading.
ISYE 6739 — Goldsman 8/5/20 18 / 108
![Page 89: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/89.jpg)
Geometric and Negative Binomial Distributions
Theorem: If $Z_1, \ldots, Z_r \overset{iid}{\sim} \text{Geom}(p)$, then $W = \sum_{i=1}^r Z_i \sim \text{NegBin}(r, p)$.

In other words, Geometrics add up to a NegBin.

Proof: Won't do it here, but you can use the mgf technique. □

Anyhow, it makes sense if you think of $Z_i$ as the number of trials after the (i − 1)st success up to and including the ith success.

Since the $Z_i$'s are i.i.d., the above theorem gives:

$$E[W] = rE[Z_i] = r/p,$$
$$\mathrm{Var}(W) = r\,\mathrm{Var}(Z_i) = rq/p^2,$$
$$M_W(t) = [M_{Z_i}(t)]^r = \left(\frac{pe^t}{1-qe^t}\right)^r.$$
ISYE 6739 — Goldsman 8/5/20 19 / 108
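A simulation makes the theorem plausible: sum r independent Geometrics and compare the sample moments with r/p and rq/p² (our own sketch, not from the slides):

```python
import random

rng = random.Random(7)
p, q, r = 0.3, 0.7, 3

def geom():
    """One Geom(p) draw: trials until the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

n = 100_000
w = [sum(geom() for _ in range(r)) for _ in range(n)]  # W = Z1 + ... + Zr
mean = sum(w) / n
var = sum((x - mean)**2 for x in w) / n
print(mean, var)  # roughly r/p = 10 and r*q/p**2 ≈ 23.3
```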
![Page 97: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/97.jpg)
Geometric and Negative Binomial Distributions
Just to be complete, let's get the pmf of W. Note that W = k iff you get exactly r − 1 successes by time k − 1, and then the rth success at time k. So. . .

$$P(W = k) \;=\; \left[\binom{k-1}{r-1}p^{r-1}q^{k-r}\right]p \;=\; \binom{k-1}{r-1}p^r q^{k-r}, \quad k = r, r+1, \ldots$$

Example: Toss a die until a 5 appears for the third time. What's the probability that we'll need exactly 7 tosses?

Let W be the number of tosses required. Clearly, W ∼ NegBin(3, 1/6).

$$P(W = 7) \;=\; \binom{7-1}{3-1}(1/6)^3(5/6)^{7-3}.$$
ISYE 6739 — Goldsman 8/5/20 20 / 108
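The pmf is straightforward to evaluate exactly with Python's `math.comb` (a quick check of our own, not from the slides; `negbin_pmf` is a hypothetical helper name):

```python
from math import comb

def negbin_pmf(k, r, p):
    """P(W = k): choose which of the first k-1 trials hold the first r-1
    successes, then require the rth success on trial k."""
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

# Third 5 on exactly the 7th toss: C(6,2) * (1/6)^3 * (5/6)^4
print(round(negbin_pmf(7, 3, 1/6), 4))  # 0.0335
```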
![Page 98: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/98.jpg)
Geometric and Negative Binomial Distributions
Just to be complete, let’s get the pmf of W . Note that W = k iff you getexactly r − 1 successes by time k − 1, and then the rth success at time k.So. . .
P (W = k) =
[(k − 1
r − 1
)pr−1qk−r
]p =
(k − 1
r − 1
)prqk−r, k = r, r + 1, . . .
Example: Toss a die until a 5 appears for the third time. What’s theprobability that we’ll need exactly 7 tosses?
Let W be the number of tosses required. Clearly, W ∼ NegBin(3, 1/6).
P (W = 7) =
(7− 1
3− 1
)(1/6)3(5/6)7−3.
ISYE 6739 — Goldsman 8/5/20 20 / 108
![Page 105: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/105.jpg)
Geometric and Negative Binomial Distributions
How are the Binomial and NegBin Related?

$X_1, \ldots, X_n \overset{\text{iid}}{\sim}$ Bern(p) $\Rightarrow Y \equiv \sum_{i=1}^n X_i \sim$ Bin(n, p).

$Z_1, \ldots, Z_r \overset{\text{iid}}{\sim}$ Geom(p) $\Rightarrow W \equiv \sum_{i=1}^r Z_i \sim$ NegBin(r, p).

$E[Y] = np$, $\text{Var}(Y) = npq$.

$E[W] = r/p$, $\text{Var}(W) = rq/p^2$.
ISYE 6739 — Goldsman 8/5/20 21 / 108
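Since W is a sum of r iid Geom(p) RVs, the mean and variance formulas can be sanity-checked by simulation. A hedged sketch (the seed, sample size, and helper name `geom` are illustrative choices):

```python
import random

def geom(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(0)
r, p, q = 3, 1/6, 5/6
samples = [sum(geom(p, rng) for _ in range(r)) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((w - mean) ** 2 for w in samples) / len(samples)
print(mean, var)  # near r/p = 18 and r*q/p**2 = 90
```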
![Page 110: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/110.jpg)
Poisson Distribution
ISYE 6739 — Goldsman 8/5/20 22 / 108
![Page 111: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/111.jpg)
Poisson Distribution
Lesson 4.4 — Poisson Distribution

We'll first talk about Poisson processes.

Let N(t) be a counting process. That is, N(t) is the number of occurrences (or arrivals, or events) of some process over the time interval [0, t]. N(t) looks like a step function.

Examples: N(t) could be any of the following.
(a) Cars entering a shopping center (by time t).
(b) Defects on a wire (of length t).
(c) Raisins in cookie dough (of volume t).

Let λ > 0 be the average number of occurrences per unit time (or length or volume).

In the above examples, we might have:
(a) λ = 10/min. (b) λ = 0.5/ft. (c) λ = 4/in³.
ISYE 6739 — Goldsman 8/5/20 23 / 108
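Because λ is an average count per unit of exposure, expected counts scale linearly with the exposure amount. A tiny sketch using the rates above (the exposure values 3 min, 6 ft, and 2 in³ are made up for illustration):

```python
# Expected count = rate * exposure (time, length, or volume).
rates = {"cars_per_min": 10, "defects_per_ft": 0.5, "raisins_per_in3": 4}
print(rates["cars_per_min"] * 3)     # cars expected in 3 minutes
print(rates["defects_per_ft"] * 6)   # defects expected on 6 ft of wire
print(rates["raisins_per_in3"] * 2)  # raisins expected in 2 cubic inches
```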
![Page 123: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/123.jpg)
Poisson Distribution
A Poisson process is a specific counting process. . . .

First, some notation: o(h) is a generic function that goes to zero faster than h goes to zero, i.e., $\lim_{h \to 0} o(h)/h = 0$.

Definition: A Poisson process (PP) is one that satisfies the following:

(i) There is a short enough interval of time h such that, for all t,

$$P(N(t+h) - N(t) = 0) = 1 - \lambda h + o(h)$$
$$P(N(t+h) - N(t) = 1) = \lambda h + o(h)$$
$$P(N(t+h) - N(t) \geq 2) = o(h)$$

(ii) The distribution of the "increment" N(t + h) − N(t) depends only on the length h.

(iii) If a < b < c < d, then the two "increments" N(d) − N(c) and N(b) − N(a) are independent RVs.
ISYE 6739 — Goldsman 8/5/20 24 / 108
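Assumption (i) can be illustrated by discretizing time: split [0, t] into slots of length h and put one arrival in each slot with probability λh. The resulting counts should then behave like N(t) for a rate-λ Poisson process. A rough sketch under those assumptions (the parameter values and function name are arbitrary):

```python
import random

def approx_count(lam, t, h, rng):
    """Arrivals in [0, t] when each length-h slot holds one arrival w.p. lam*h."""
    return sum(rng.random() < lam * h for _ in range(int(t / h)))

rng = random.Random(1)
lam, t, h = 2.0, 5.0, 0.01
counts = [approx_count(lam, t, h, rng) for _ in range(20_000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(mean, var)  # both near lam * t = 10, as for a Poisson count
```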
![Page 132: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/132.jpg)
Poisson Distribution
English translation of the Poisson process assumptions:

(i) Arrivals basically occur one-at-a-time, at rate λ per unit time. (We must make sure that λ doesn't change over time.)

(ii) The arrival pattern is stationary — it doesn't change over time.

(iii) The numbers of arrivals in two disjoint time intervals are independent.

Poisson Process Example: Neutrinos hit a detector. Occurrences are rare enough that they really do happen one-at-a-time. You never get arrivals of groups of neutrinos. Further, the rate doesn't vary over time, and all arrivals are independent of each other. □

Anti-Example: Customers arrive at a restaurant. They show up in groups, not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivals may not be independent. This ain't a Poisson process. □
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 133: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/133.jpg)
Poisson Distribution
English translation of Poisson process assumptions.
(i) Arrivals basically occur one-at-a-time, and then at rate λ/unit time. (Wemust make sure that λ doesn’t change over time.)
(ii) The arrival pattern is stationary — it doesn’t change over time.
(iii) The numbers of arrivals in two disjoint time intervals are independent
Poisson Process Example: Neutrinos hit a detector. Occurrences are rareenough so that they really do happen one-at-a-time. You never get arrivals ofgroups of neutrinos. Further, the rate doesn’t vary over time, and all arrivalsare independent of each other. 2
Anti-Example: Customers arrive at a restaurant. They show up in groups,not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivalsmay not be independent. This ain’t a Poisson process. 2
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 134: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/134.jpg)
Poisson Distribution
English translation of Poisson process assumptions.
(i) Arrivals basically occur one-at-a-time, and then at rate λ/unit time. (Wemust make sure that λ doesn’t change over time.)
(ii) The arrival pattern is stationary — it doesn’t change over time.
(iii) The numbers of arrivals in two disjoint time intervals are independent
Poisson Process Example: Neutrinos hit a detector. Occurrences are rareenough so that they really do happen one-at-a-time. You never get arrivals ofgroups of neutrinos. Further, the rate doesn’t vary over time, and all arrivalsare independent of each other. 2
Anti-Example: Customers arrive at a restaurant. They show up in groups,not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivalsmay not be independent. This ain’t a Poisson process. 2
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 135: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/135.jpg)
Poisson Distribution
English translation of Poisson process assumptions.
(i) Arrivals basically occur one-at-a-time, and then at rate λ/unit time. (Wemust make sure that λ doesn’t change over time.)
(ii) The arrival pattern is stationary — it doesn’t change over time.
(iii) The numbers of arrivals in two disjoint time intervals are independent
Poisson Process Example: Neutrinos hit a detector. Occurrences are rareenough so that they really do happen one-at-a-time. You never get arrivals ofgroups of neutrinos. Further, the rate doesn’t vary over time, and all arrivalsare independent of each other. 2
Anti-Example: Customers arrive at a restaurant. They show up in groups,not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivalsmay not be independent. This ain’t a Poisson process. 2
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 136: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/136.jpg)
Poisson Distribution
English translation of Poisson process assumptions.
(i) Arrivals basically occur one-at-a-time, and then at rate λ/unit time. (Wemust make sure that λ doesn’t change over time.)
(ii) The arrival pattern is stationary — it doesn’t change over time.
(iii) The numbers of arrivals in two disjoint time intervals are independent
Poisson Process Example: Neutrinos hit a detector. Occurrences are rareenough so that they really do happen one-at-a-time. You never get arrivals ofgroups of neutrinos. Further, the rate doesn’t vary over time, and all arrivalsare independent of each other. 2
Anti-Example: Customers arrive at a restaurant. They show up in groups,not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivalsmay not be independent. This ain’t a Poisson process. 2
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 137: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/137.jpg)
Poisson Distribution
English translation of Poisson process assumptions.
(i) Arrivals basically occur one-at-a-time, and then at rate λ/unit time. (Wemust make sure that λ doesn’t change over time.)
(ii) The arrival pattern is stationary — it doesn’t change over time.
(iii) The numbers of arrivals in two disjoint time intervals are independent
Poisson Process Example: Neutrinos hit a detector. Occurrences are rareenough so that they really do happen one-at-a-time. You never get arrivals ofgroups of neutrinos. Further, the rate doesn’t vary over time, and all arrivalsare independent of each other. 2
Anti-Example: Customers arrive at a restaurant. They show up in groups,not one-at-a-time. The rate varies over the day (more at dinnertime). Arrivalsmay not be independent. This ain’t a Poisson process. 2
ISYE 6739 — Goldsman 8/5/20 25 / 108
![Page 138: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/138.jpg)
Poisson Distribution
Definition: Let X be the number of occurrences in a Poisson(λ) process in a unit interval of time. Then X has the Poisson distribution with parameter λ.

Notation: X ∼ Pois(λ).

Theorem/Definition: X ∼ Pois(λ) $\Rightarrow P(X = k) = e^{-\lambda}\lambda^k/k!$, k = 0, 1, 2, . . .

Proof: The proof follows from the PP assumptions and involves some simple differential equations.

To begin with, let's define $P_x(t) \equiv P(N(t) = x)$, i.e., the probability of exactly x arrivals by time t.

Note that the probability that there haven't been any arrivals by time t + h can be written in terms of the probability that there haven't been any arrivals by time t. . . .
ISYE 6739 — Goldsman 8/5/20 26 / 108
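The claimed pmf can be sanity-checked directly: it should sum to 1 and have mean λ. A small sketch (λ = 4 is just an illustrative value):

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """P(X = k) for X ~ Pois(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 4.0
total = sum(pois_pmf(k, lam) for k in range(100))
mean = sum(k * pois_pmf(k, lam) for k in range(100))
print(total, mean)  # pmf sums to 1; mean equals lam
```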
![Page 139: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/139.jpg)
Poisson Distribution
Definition: Let X be the number of occurrences in a Poisson(λ) process in aunit interval of time. Then X has the Poisson distribution with parameterλ.
Notation: X ∼ Pois(λ).
Theorem/Definition: X ∼ Pois(λ)⇒ P (X = k) = e−λλk/k!,k = 0, 1, 2, . . ..
Proof: The proof follows from the PP assumptions and involves some simpledifferential equations.
To begin with, let’s define Px(t) ≡ P (N(t) = x), i.e., the probability ofexactly x arrivals by time t.
Note that the probability that there haven’t been any arrivals by time t+ h canbe written in terms of the probability that there haven’t been any arrivals bytime t. . . .
ISYE 6739 — Goldsman 8/5/20 26 / 108
![Page 140: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/140.jpg)
Poisson Distribution
Definition: Let X be the number of occurrences in a Poisson(λ) process in aunit interval of time. Then X has the Poisson distribution with parameterλ.
Notation: X ∼ Pois(λ).
Theorem/Definition: X ∼ Pois(λ)⇒ P (X = k) = e−λλk/k!,k = 0, 1, 2, . . ..
Proof: The proof follows from the PP assumptions and involves some simpledifferential equations.
To begin with, let’s define Px(t) ≡ P (N(t) = x), i.e., the probability ofexactly x arrivals by time t.
Note that the probability that there haven’t been any arrivals by time t+ h canbe written in terms of the probability that there haven’t been any arrivals bytime t. . . .
Poisson Distribution

P_0(t + h)
= P(N(t + h) = 0)
= P(no arrivals by time t and then no arrivals by time t + h)
= P({N(t) = 0} ∩ {N(t + h) − N(t) = 0})
= P(N(t) = 0) P(N(t + h) − N(t) = 0)   (by independent increments (iii))
≐ P(N(t) = 0)(1 − λh)   (by (i) and a little bit of (ii))
= P_0(t)(1 − λh).

Thus,

[P_0(t + h) − P_0(t)] / h ≐ −λ P_0(t).

Taking the limit as h → 0, we have

P_0′(t) = −λ P_0(t).   (1)
ISYE 6739 — Goldsman 8/5/20 27 / 108
Poisson Distribution

Similarly, for x > 0, we have

P_x(t + h) = P(N(t + h) = x)
= P(N(t + h) = x and no arrivals during [t, t + h])
  + P(N(t + h) = x and ≥ 1 arrival during [t, t + h])   (Law of Total Probability)
≐ P({N(t) = x} ∩ {N(t + h) − N(t) = 0})
  + P({N(t) = x − 1} ∩ {N(t + h) − N(t) = 1})   (by (i), only consider the case of one arrival in [t, t + h])
= P(N(t) = x) P(N(t + h) − N(t) = 0)
  + P(N(t) = x − 1) P(N(t + h) − N(t) = 1)   (by independent increments (iii))
≐ P_x(t)(1 − λh) + P_{x−1}(t) λh.
ISYE 6739 — Goldsman 8/5/20 28 / 108
Poisson Distribution

Taking the limit as h → 0, we obtain

P_x′(t) = λ[P_{x−1}(t) − P_x(t)],  x = 1, 2, ….   (2)

The solution of differential equations (1) and (2) is easily shown to be

P_x(t) = (λt)^x e^{−λt} / x!,  x = 0, 1, 2, ….

(If you don't believe me, just plug in and see for yourself!)

Noting that t = 1 for the Pois(λ) finally completes the proof. □

Remark: λ can be changed simply by changing the units of time.

Examples:
X = # calls to a switchboard in 1 minute ∼ Pois(3 / min)
Y = # calls to a switchboard in 5 minutes ∼ Pois(15 / 5 min)
Z = # calls to a switchboard in 10 sec ∼ Pois(0.5 / 10 sec)
ISYE 6739 — Goldsman 8/5/20 29 / 108
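The "plug in and see for yourself" check is easy to automate. Below is a minimal Python sketch (not from the slides; the values λ = 3 and t = 1.7 are arbitrary choices for illustration) that verifies the proposed solution P_x(t) = (λt)^x e^{−λt}/x! satisfies differential equations (1) and (2), approximating P_x′(t) by central differences:

```python
import math

def P(x, t, lam):
    # Candidate solution from the slides: P_x(t) = (lam*t)^x e^{-lam*t} / x!
    return (lam * t) ** x * math.exp(-lam * t) / math.factorial(x)

lam, t, h = 3.0, 1.7, 1e-6  # arbitrary rate and time point; h is the difference step

# Equation (1): P_0'(t) = -lam * P_0(t)
d0 = (P(0, t + h, lam) - P(0, t - h, lam)) / (2 * h)  # central difference
assert abs(d0 - (-lam * P(0, t, lam))) < 1e-6

# Equation (2): P_x'(t) = lam * [P_{x-1}(t) - P_x(t)], x = 1, 2, ...
for x in range(1, 8):
    dx = (P(x, t + h, lam) - P(x, t - h, lam)) / (2 * h)
    assert abs(dx - lam * (P(x - 1, t, lam) - P(x, t, lam))) < 1e-6
```

Setting t = 1 then recovers the Pois(λ) pmf above.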
Poisson Distribution

Theorem: X ∼ Pois(λ) ⇒ the mgf is M_X(t) = e^{λ(e^t − 1)}.

Proof:

M_X(t) = E[e^{tX}] = Σ_{k=0}^∞ e^{tk} (e^{−λ} λ^k / k!) = e^{−λ} Σ_{k=0}^∞ (λe^t)^k / k! = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}. □

Theorem: X ∼ Pois(λ) ⇒ E[X] = Var(X) = λ.

Proof (using the mgf):

E[X] = (d/dt) M_X(t) |_{t=0} = (d/dt) e^{λ(e^t − 1)} |_{t=0} = λe^t M_X(t) |_{t=0} = λ.
ISYE 6739 — Goldsman 8/5/20 30 / 108
Poisson Distribution

Similarly,

E[X²] = (d²/dt²) M_X(t) |_{t=0}
= (d/dt)((d/dt) M_X(t)) |_{t=0}
= λ (d/dt)(e^t M_X(t)) |_{t=0}
= λ [e^t M_X(t) + e^t (d/dt) M_X(t)] |_{t=0}
= λe^t [M_X(t) + λe^t M_X(t)] |_{t=0}
= λ(1 + λ).

Thus, Var(X) = E[X²] − (E[X])² = λ(1 + λ) − λ² = λ. □
ISYE 6739 — Goldsman 8/5/20 31 / 108
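These derivative calculations can be sanity-checked numerically. The Python sketch below (an illustration, not part of the slides; λ = 2.5 is an arbitrary choice) approximates M_X′(0) and M_X″(0) by finite differences and compares them with λ and λ(1 + λ):

```python
import math

def M(t, lam):
    # mgf of Pois(lam): M_X(t) = exp(lam * (e^t - 1))
    return math.exp(lam * (math.exp(t) - 1.0))

lam, h = 2.5, 1e-5  # arbitrary rate; h is the difference step

# E[X] = M'(0), via a central difference
EX = (M(h, lam) - M(-h, lam)) / (2 * h)
# E[X^2] = M''(0), via a second central difference
EX2 = (M(h, lam) - 2 * M(0.0, lam) + M(-h, lam)) / h ** 2

assert abs(EX - lam) < 1e-6                 # E[X] = lambda
assert abs(EX2 - lam * (1 + lam)) < 1e-4    # E[X^2] = lambda(1 + lambda)
assert abs((EX2 - EX ** 2) - lam) < 1e-3    # Var(X) = lambda
```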
Poisson Distribution

Example: Calls to a switchboard arrive as a Poisson process with rate 3 calls/min.

Let X = number of calls in 1 minute. So X ∼ Pois(3), E[X] = Var(X) = 3, and P(X ≤ 4) = Σ_{k=0}^{4} e^{−3} 3^k / k!.

Let Y = number of calls in 40 sec. So Y ∼ Pois(2), E[Y] = Var(Y) = 2, and P(Y ≤ 4) = Σ_{k=0}^{4} e^{−2} 2^k / k!. □
ISYE 6739 — Goldsman 8/5/20 32 / 108
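Those two sums are straightforward to evaluate; a short Python sketch (an illustration, not from the slides):

```python
import math

def pois_pmf(k, lam):
    # P(X = k) for X ~ Pois(lam)
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pois_cdf(x, lam):
    # P(X <= x)
    return sum(pois_pmf(k, lam) for k in range(x + 1))

print(round(pois_cdf(4, 3), 4))  # P(X <= 4) for X ~ Pois(3): 0.8153
print(round(pois_cdf(4, 2), 4))  # P(Y <= 4) for Y ~ Pois(2): 0.9473
```

(`scipy.stats.poisson.cdf` gives the same values if SciPy is available.)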
Poisson Distribution

Theorem (Additive Property of Poissons): Suppose X_1, …, X_n are independent with X_i ∼ Pois(λ_i), i = 1, …, n. Then

Y ≡ Σ_{i=1}^{n} X_i ∼ Pois(Σ_{i=1}^{n} λ_i).

Proof: Since the X_i's are independent, we have

M_Y(t) = Π_{i=1}^{n} M_{X_i}(t) = Π_{i=1}^{n} e^{λ_i(e^t − 1)} = e^{(Σ_{i=1}^{n} λ_i)(e^t − 1)},

which is the mgf of the Pois(Σ_{i=1}^{n} λ_i) distribution. □
ISYE 6739 — Goldsman 8/5/20 33 / 108
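The additive property can also be confirmed directly, without mgfs, by convolving two Poisson pmfs. A Python sketch (an illustration, not from the slides; the rates λ_1 = 3 and λ_2 = 5 match the parking-lot example that follows):

```python
import math

def pois_pmf(k, lam):
    # P(X = k) for X ~ Pois(lam)
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam1, lam2 = 3.0, 5.0
# By independence, P(X1 + X2 = y) = sum_{k=0}^{y} P(X1 = k) P(X2 = y - k);
# this should match the Pois(lam1 + lam2) pmf at every y.
for y in range(20):
    conv = sum(pois_pmf(k, lam1) * pois_pmf(y - k, lam2) for k in range(y + 1))
    assert abs(conv - pois_pmf(y, lam1 + lam2)) < 1e-12
```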
Poisson Distribution

Example: Cars driven by males [females] arrive at a parking lot according to a Poisson process with a rate of λ1 = 3/hr [λ2 = 5/hr]. All arrivals are independent.

What's the probability of exactly 2 arrivals in the next 30 minutes?

The total number of arrivals is Pois(λ1 + λ2 = 8/hr), and so the total in the next 30 minutes is X ∼ Pois(4). So P(X = 2) = e^{−4} 4^2/2!. □

ISYE 6739 — Goldsman 8/5/20 34 / 108
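The arithmetic in this example takes a couple of lines (a sketch, not from the slides):

```python
from math import exp, factorial

lam = (3 + 5) * 0.5                      # combined rate 8/hr over half an hour -> Pois(4)
p = exp(-lam) * lam ** 2 / factorial(2)  # P(X = 2) = e^{-4} 4^2 / 2!
print(p)                                 # about 0.1465
```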
Uniform, Exponential, and Friends

Lesson 4.5 — Uniform, Exponential, and Friends

Definition: The RV X has the Uniform distribution if it has pdf and cdf

f(x) = 1/(b − a) for a ≤ x ≤ b (and 0 otherwise), and

F(x) = 0 for x < a; (x − a)/(b − a) for a ≤ x ≤ b; 1 for x ≥ b.

Notation: X ∼ Unif(a, b).

Previous work showed that

E[X] = (a + b)/2 and Var(X) = (b − a)^2/12.

We can also derive the mgf,

M_X(t) = E[e^{tX}] = ∫_a^b e^{tx} · 1/(b − a) dx = (e^{tb} − e^{ta})/(t(b − a)), t ≠ 0.

ISYE 6739 — Goldsman 8/5/20 36 / 108
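The closed-form mgf can be spot-checked against a direct numerical integration of E[e^{tX}]. A sketch with arbitrarily chosen a, b, t (not from the slides):

```python
from math import exp

a, b, t = 1.0, 4.0, 0.7
# Closed form from above: M_X(t) = (e^{tb} - e^{ta}) / (t(b - a))
mgf_closed = (exp(t * b) - exp(t * a)) / (t * (b - a))

# Midpoint-rule approximation of the integral of e^{tx} / (b - a) over [a, b]
n = 100_000
h = (b - a) / n
mgf_numeric = sum(exp(t * (a + (i + 0.5) * h)) / (b - a) * h for i in range(n))

assert abs(mgf_closed - mgf_numeric) < 1e-6
```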
Uniform, Exponential, and Friends

Definition: The Exponential(λ) distribution has pdf

f(x) = λe^{−λx} for x > 0 (and 0 otherwise).

Previous work showed that the cdf is F(x) = 1 − e^{−λx} for x ≥ 0,

E[X] = 1/λ, and Var(X) = 1/λ^2.

We also derived the mgf,

M_X(t) = E[e^{tX}] = ∫_0^∞ e^{tx} f(x) dx = λ/(λ − t), t < λ.

ISYE 6739 — Goldsman 8/5/20 37 / 108
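The stated mean and variance can be checked by simulation with the standard library's exponential sampler. A sketch (λ = 2 is an arbitrary choice):

```python
import random
from statistics import fmean, variance

random.seed(1)
lam = 2.0
xs = [random.expovariate(lam) for _ in range(200_000)]

m, v = fmean(xs), variance(xs)
# Theory: E[X] = 1/lam = 0.5 and Var(X) = 1/lam^2 = 0.25
assert abs(m - 1 / lam) < 0.01
assert abs(v - 1 / lam ** 2) < 0.01
```

Note that `random.expovariate` is parameterized by the rate λ, so the sample mean should be near 1/λ, not λ.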
Uniform, Exponential, and Friends

Memoryless Property of the Exponential

Theorem: Suppose that X ∼ Exp(λ). Then for positive s, t, we have

P(X > s + t | X > s) = P(X > t).

As with the discrete Geometric distribution, the probability that X will survive an additional t time units is the (unconditional) probability that it will survive at least t: it forgot that it made it past time s! It's always "like new"!

Proof:

P(X > s + t | X > s) = P(X > s + t ∩ X > s)/P(X > s)
                     = P(X > s + t)/P(X > s)
                     = e^{−λ(s+t)}/e^{−λs}
                     = e^{−λt} = P(X > t). □

ISYE 6739 — Goldsman 8/5/20 38 / 108
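The algebra in the proof can be mirrored directly in code: for any positive s and t (the values below are arbitrary), the conditional survival probability equals the unconditional one. A minimal sketch:

```python
from math import exp, isclose

lam, s, t = 1.5, 2.0, 0.8

def surv(x):
    # P(X > x) = e^{-lam x} for X ~ Exp(lam)
    return exp(-lam * x)

cond = surv(s + t) / surv(s)   # P(X > s+t | X > s)
assert isclose(cond, surv(t))  # memoryless: equals P(X > t)
```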
Uniform, Exponential, and Friends

Example: Suppose that the life of a lightbulb is exponential with a mean of 10 months. If the light survives 20 months, what's the probability that it'll survive another 10?

P(X > 30 | X > 20) = P(X > 10) = e^{−λt} = e^{−(1/10)(10)} = e^{−1}. □

Example: If the time to the next bus is exponentially distributed with a mean of 10 minutes, and you've already been waiting 20 minutes, you can expect to wait 10 more. □

Remark: The exponential is the only continuous distribution with the Memoryless Property.

Remark: Look at E[X] and Var(X) for the Geometric distribution and see how they're similar to those for the exponential. (Not a coincidence.)

ISYE 6739 — Goldsman 8/5/20 39 / 108
Uniform, Exponential, and Friends

Definition: If X is a continuous RV with pdf f(x) and cdf F(x), then its failure rate function is

S(t) ≡ f(t)/P(X > t) = f(t)/(1 − F(t)),

which can loosely be regarded as X's instantaneous rate of death, given that it has so far survived to time t.

Example: If X ∼ Exp(λ), then S(t) = λe^{−λt}/e^{−λt} = λ. So if X is the exponential lifetime of a lightbulb, then its instantaneous burn-out rate is always λ: always good as new! This is clearly a result of the Memoryless Property. □

ISYE 6739 — Goldsman 8/5/20 40 / 108
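A quick check (a sketch, not from the slides) that the exponential's failure rate really is the constant λ at every t:

```python
from math import exp

lam = 0.3

def pdf(t):
    # Exp(lam) pdf: lam e^{-lam t}
    return lam * exp(-lam * t)

def cdf(t):
    # Exp(lam) cdf: 1 - e^{-lam t}
    return 1 - exp(-lam * t)

# S(t) = f(t) / (1 - F(t)) should equal lam at every t
rates = [pdf(t) / (1 - cdf(t)) for t in (0.5, 1.0, 5.0, 20.0)]
assert all(abs(r - lam) < 1e-9 for r in rates)
```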
Uniform, Exponential, and Friends

The Exponential is also related to the Poisson!

Theorem: Let X be the amount of time until the first arrival in a Poisson process with rate λ. Then X ∼ Exp(λ).

Proof:

F(x) = P(X ≤ x) = 1 − P(no arrivals in [0, x])
     = 1 − e^{−λx}(λx)^0/0!   (since the number of arrivals in [0, x] is Pois(λx))
     = 1 − e^{−λx}. □

Theorem: Amazingly, it can be shown (after a lot of work) that the interarrival times of a PP are all iid Exp(λ)! See for yourself when you take a stochastic processes course.

ISYE 6739 — Goldsman 8/5/20 41 / 108
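The connection can be illustrated the other way around by simulation (a sketch, not from the slides): build a Poisson process out of iid Exp(λ) interarrival times and check that the count of arrivals in [0, x] has mean λx, as a Pois(λx) count should.

```python
import random

random.seed(7)
lam, x = 2.0, 3.0

def arrivals_in(window):
    # Lay down iid Exp(lam) gaps until we pass the window; count arrivals inside
    t, n = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > window:
            return n
        n += 1

counts = [arrivals_in(x) for _ in range(20_000)]
m = sum(counts) / len(counts)
assert abs(m - lam * x) < 0.1   # Pois(lam * x) has mean 6
```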
Uniform, Exponential, and Friends

Example: Suppose that arrivals to a shopping center are from a PP with rate λ = 20/hr. What's the probability that the time between the 13th and 14th customers will be at least 4 minutes?

Let the time between customers 13 and 14 be X. Since we have a PP, the interarrivals are iid Exp(λ = 20/hr), so

P(X > 4 min) = P(X > 1/15 hr) = e^{−λt} = e^{−20/15}. □

ISYE 6739 — Goldsman 8/5/20 42 / 108
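The key step is keeping the units consistent (rate per hour, time in hours). A sketch of the computation:

```python
from math import exp

lam = 20.0         # arrivals per hour
t = 4 / 60         # 4 minutes, converted to hours
p = exp(-lam * t)  # P(X > t) for an Exp(20/hr) interarrival time
print(p)           # e^{-4/3}, about 0.2636
```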
Uniform, Exponential, and Friends
Definition/Theorem: Suppose X1, . . . , Xk are iid Exp(λ), and let S = ∑_{i=1}^k Xi. Then S has the Erlang distribution with parameters k and λ, denoted S ∼ Erlang_k(λ).

The Erlang is simply the sum of iid exponentials.

Special Case: Erlang_1(λ) ∼ Exp(λ).
The pdf and cdf of the Erlang are

f(s) = λ^k e^{−λs} s^{k−1} / (k − 1)!, s ≥ 0, and F(s) = 1 − ∑_{i=0}^{k−1} e^{−λs}(λs)^i / i!.

Notice that the cdf is the sum of a bunch of Poisson probabilities. (We won't do it here, but this observation helps in the derivation of the cdf.)
ISYE 6739 — Goldsman 8/5/20 43 / 108
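The cdf formula can be checked by brute force: sum k iid exponentials many times and compare the empirical P(S ≤ s) with the Poisson-sum expression. A minimal sketch, with arbitrary parameter choices:

```python
import math, random

random.seed(0)
lam, k, s = 2.0, 3, 1.5    # arbitrary rate, shape, evaluation point
n = 200_000

# Monte Carlo estimate of P(S <= s), where S is the sum of k iid Exp(lam)
hits = 0
for _ in range(n):
    total = sum(-math.log(1.0 - random.random()) / lam for _ in range(k))
    if total <= s:
        hits += 1
mc = hits / n

# Closed-form Erlang cdf: one minus a sum of Poisson(lam*s) probabilities
cdf = 1 - sum(math.exp(-lam * s) * (lam * s) ** i / math.factorial(i)
              for i in range(k))
print(round(mc, 3), round(cdf, 3))   # the two should agree closely
```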
![Page 254: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/254.jpg)
Uniform, Exponential, and Friends
Expected value, variance, and mgf:

E[S] = E[∑_{i=1}^k Xi] = ∑_{i=1}^k E[Xi] = k/λ,

Var(S) = k/λ², and

M_S(t) = (λ/(λ − t))^k, t < λ.
Example: Suppose X and Y are iid Exp(2). Find P(X + Y < 1).

Since X + Y ∼ Erlang_2(2), take k = 2, λ = 2, and s = 1 in the Erlang cdf:

P(X + Y < 1) = 1 − ∑_{i=0}^{k−1} e^{−λs}(λs)^i / i! = 1 − ∑_{i=0}^{1} e^{−2}·2^i / i! = 1 − 3e^{−2} ≈ 0.594. □
ISYE 6739 — Goldsman 8/5/20 44 / 108
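The example's number drops right out of the Erlang cdf formula:

```python
import math

lam, k, s = 2.0, 2, 1.0   # X + Y ~ Erlang_2(2); evaluate the cdf at s = 1
p = 1 - sum(math.exp(-lam * s) * (lam * s) ** i / math.factorial(i)
            for i in range(k))
print(round(p, 3))   # 1 - 3e^{-2} ≈ 0.594
```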
![Page 260: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/260.jpg)
Uniform, Exponential, and Friends
Definition: X has the Gamma distribution with parameters α > 0 and λ > 0 if it has pdf

f(x) = λ^α e^{−λx} x^{α−1} / Γ(α), x ≥ 0,

where Γ(α) ≡ ∫_0^∞ t^{α−1} e^{−t} dt is the gamma function.

Remark: The Gamma distribution generalizes the Erlang distribution (where α has to be a positive integer). It has the same expected value and variance as the Erlang, with α in place of k.

Remark: If α is a positive integer, then Γ(α) = (α − 1)!. Party trick: Γ(1/2) = √π.
ISYE 6739 — Goldsman 8/5/20 45 / 108
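Both gamma-function facts are easy to confirm with Python's standard library, which exposes the gamma function as `math.gamma`:

```python
import math

# Gamma(n) = (n-1)! for positive integers, e.g. Gamma(5) = 4! = 24
print(math.gamma(5), math.factorial(4))

# Party trick: Gamma(1/2) = sqrt(pi) ≈ 1.7725
print(math.gamma(0.5), math.sqrt(math.pi))
```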
![Page 265: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/265.jpg)
Other Continuous Distributions
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 46 / 108
![Page 266: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/266.jpg)
Other Continuous Distributions
Lesson 4.6 — Other Continuous Distributions
Triangular(a, b, c) Distribution — good for modeling RVs on the basis of limited data (minimum a, mode b, maximum c).

f(x) =
  2(x − a) / [(b − a)(c − a)],   a < x ≤ b
  2(c − x) / [(c − b)(c − a)],   b < x < c
  0,   otherwise.

E[X] = (a + b + c)/3 and Var(X) = (a² + b² + c² − ab − ac − bc)/18 (a bit of a mess).
ISYE 6739 — Goldsman 8/5/20 47 / 108
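The mean formula can be sanity-checked by numerically integrating x·f(x) over the support with a midpoint rule. A minimal sketch, with arbitrary (a, b, c):

```python
# Numeric check that E[X] = (a+b+c)/3 for the Triangular(a,b,c) pdf above
a, b, c = 0.0, 2.0, 5.0   # arbitrary minimum, mode, maximum

def f(x):
    if a < x <= b:
        return 2 * (x - a) / ((b - a) * (c - a))
    if b < x < c:
        return 2 * (c - x) / ((c - b) * (c - a))
    return 0.0

n = 100_000
h = (c - a) / n
# Midpoint-rule approximation of the integral of x * f(x) over [a, c]
mean = sum(x * f(x) * h for x in (a + (i + 0.5) * h for i in range(n)))
print(round(mean, 3), round((a + b + c) / 3, 3))   # both ≈ 2.333
```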
![Page 271: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/271.jpg)
Other Continuous Distributions
Beta(a, b) Distribution — good for modeling RVs that are restricted to an interval.

f(x) = [Γ(a + b) / (Γ(a)Γ(b))] x^{a−1}(1 − x)^{b−1}, 0 < x < 1.

E[X] = a/(a + b) and Var(X) = ab / [(a + b)²(a + b + 1)].

This distribution gets its name from the beta function, which is defined as

β(a, b) ≡ Γ(a)Γ(b)/Γ(a + b) = ∫_0^1 x^{a−1}(1 − x)^{b−1} dx.
ISYE 6739 — Goldsman 8/5/20 48 / 108
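The beta-function identity can be verified numerically: compute β(a, b) once from the gamma-function ratio and once by integrating the defining integral. A minimal sketch, with arbitrary shape parameters:

```python
import math

a, b = 2.0, 3.0   # arbitrary shape parameters

# beta(a,b) via the Gamma-function identity
beta_gamma = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# beta(a,b) via midpoint-rule integration of the defining integral
n = 100_000
h = 1.0 / n
beta_int = 0.0
for i in range(n):
    x = (i + 0.5) * h
    beta_int += x ** (a - 1) * (1 - x) ** (b - 1) * h

print(round(beta_gamma, 6), round(beta_int, 6))   # both ≈ 1/12 ≈ 0.083333
```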
![Page 277: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/277.jpg)
Other Continuous Distributions
The Beta distribution is very flexible. Here are a few family portraits.
Remark: Certain versions of the Uniform (a = b = 1) and Triangular distributions are special cases of the Beta.
ISYE 6739 — Goldsman 8/5/20 49 / 108
![Page 280: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/280.jpg)
Other Continuous Distributions
Weibull(a, b) Distribution — good for reliability modeling. a is the “scale” parameter, and b is the “shape” parameter.

f(x) = ab(ax)^{b−1} e^{−(ax)^b}, x > 0.

F(x) = 1 − exp[−(ax)^b], x > 0.

E[X] = (1/a)Γ(1 + (1/b)) and Var(X) = a slight mess.

Remark: The Exponential is a special case of the Weibull (take b = 1).

Example: The time-to-failure T of a transmitter has a Weibull distribution with parameters a = 1/(200 hrs) and b = 1/3. Then

E[T] = 200 Γ(1 + 3) = 200 · 3! = 1200 hrs.

The probability that it fails before 2000 hrs is

F(2000) = 1 − exp[−(2000/200)^{1/3}] ≈ 0.884. □
ISYE 6739 — Goldsman 8/5/20 50 / 108
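Both numbers in the transmitter example follow directly from the formulas above:

```python
import math

a, b = 1 / 200, 1 / 3   # scale and shape from the example (per-hour units)

mean = (1 / a) * math.gamma(1 + 1 / b)   # 200 * Gamma(4) = 200 * 3! = 1200 hrs
F2000 = 1 - math.exp(-(a * 2000) ** b)   # P(failure before 2000 hrs)

print(round(mean, 1), round(F2000, 3))   # 1200.0 0.884
```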
![Page 288: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/288.jpg)
Other Continuous Distributions
Cauchy Distribution — a “fat-tailed” distribution, good for disproving things!

f(x) = 1/[π(1 + x²)] and F(x) = 1/2 + arctan(x)/π, x ∈ ℝ.

Theorem: The Cauchy distribution has an undefined mean and infinite variance!

Weird Fact: If X1, . . . , Xn are iid Cauchy, then ∑_{i=1}^n Xi/n ∼ Cauchy. Even if you take the average of a bunch of Cauchys, you're right back where you started!
ISYE 6739 — Goldsman 8/5/20 51 / 108
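Since the cdf has a closed-form inverse, F⁻¹(u) = tan(π(u − 1/2)), Cauchy variates are easy to sample by inversion. The sketch below (arbitrary seed and sample sizes) checks the cdf empirically and then illustrates the weird fact: the sample mean refuses to settle down as n grows.

```python
import math, random

random.seed(42)

def cauchy():
    # Inverse-cdf sampling: F^{-1}(u) = tan(pi*(u - 1/2))
    return math.tan(math.pi * (random.random() - 0.5))

n = 200_000
xs = [cauchy() for _ in range(n)]

# The cdf checks out: empirical P(X <= 1) vs F(1) = 1/2 + arctan(1)/pi = 0.75
emp = sum(x <= 1 for x in xs) / n
print(round(emp, 2))                      # ≈ 0.75

# ...but the running average never settles down (undefined mean)
print(sum(xs[:1000]) / 1000, sum(xs) / n)
```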
![Page 293: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/293.jpg)
Other Continuous Distributions
Alphabet Soup of Other Distributions
χ2 distribution — coming up when we talk Statistics
t distribution — coming up
F distribution — coming up
Pareto, Laplace, Rayleigh, Gumbel, Johnson distributions
Etc. . . .
ISYE 6739 — Goldsman 8/5/20 52 / 108
![Page 299: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/299.jpg)
Normal Distribution: Basics
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 53 / 108
![Page 300: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/300.jpg)
Normal Distribution: Basics
Lesson 4.7 — Normal Distribution: Basics
The Normal Distribution is so important that we’re giving it an entire section.
Definition: X ∼ Nor(µ, σ²) if it has pdf

f(x) = (1/√(2πσ²)) exp[−(x − µ)²/(2σ²)], ∀x ∈ ℝ.
Remark: The Normal distribution is also called the Gaussian distribution.
Examples: Heights, weights, SAT scores, crop yields, and averages ofthings tend to be normal.
ISYE 6739 — Goldsman 8/5/20 54 / 108
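As a quick sanity check on the definition, here is a minimal Python sketch of the Nor(µ, σ²) density; the helper name normal_pdf and the sample parameters µ = 3, σ² = 4 are my own choices, not from the slides:

```python
import math

def normal_pdf(x, mu, sigma2):
    """Nor(mu, sigma2) density, written straight from the definition above."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# The density is symmetric about x = mu and maximized there.
left = normal_pdf(1.0, 3.0, 4.0)   # f(mu - 2)
right = normal_pdf(5.0, 3.0, 4.0)  # f(mu + 2)
peak = normal_pdf(3.0, 3.0, 4.0)   # height of the bell at x = mu
```

Evaluating at points equidistant from µ gives equal densities, matching the bell-shape remark below.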
Normal Distribution: Basics
The pdf f(x) is "bell-shaped" and symmetric around x = µ, with tails falling off quickly as you move away from µ.
Small σ² corresponds to a "tall, skinny" bell curve; large σ² gives a "short, fat" bell curve.
ISYE 6739 — Goldsman 8/5/20 55 / 108
Normal Distribution: Basics
Fun Fact (1): ∫ℝ f(x) dx = 1.
Proof: Transform to polar coordinates. Good luck. □
Fun Fact (2): The cdf is
F(x) = ∫_{−∞}^{x} (1/√(2πσ²)) exp[−(t − µ)²/(2σ²)] dt = ??
Remark: There is no closed-form expression for this integral. Stay tuned.
Fun Facts (3) and (4): E[X] = µ and Var(X) = σ².
Proof: Integration by parts, or via the mgf (below). □
ISYE 6739 — Goldsman 8/5/20 56 / 108
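Fun Facts (1) and (2) can be checked numerically. The sketch below uses my own helper names and the illustrative parameters µ = 3, σ² = 4: it integrates the density with a midpoint Riemann sum, and writes the cdf via the standard library's math.erf, since no elementary antiderivative exists:

```python
import math

MU, SIGMA2 = 3.0, 4.0
SIGMA = math.sqrt(SIGMA2)

def f(x):
    """Nor(MU, SIGMA2) density."""
    return math.exp(-(x - MU) ** 2 / (2 * SIGMA2)) / math.sqrt(2 * math.pi * SIGMA2)

# Fun Fact (1): midpoint Riemann sum over [mu - 10 sigma, mu + 10 sigma],
# which captures essentially all of the probability mass.
n, lo, hi = 200_000, MU - 10 * SIGMA, MU + 10 * SIGMA
h = (hi - lo) / n
total = sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

# Fun Fact (2): the cdf has no elementary closed form, but it can be
# expressed through the error function: F(x) = (1 + erf((x - mu)/(sigma sqrt(2))))/2.
def F(x):
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))
```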
Normal Distribution: Basics
Fun Fact (5): If X is any normal RV, then
P(µ − σ < X < µ + σ) = 0.6827
P(µ − 2σ < X < µ + 2σ) = 0.9545
P(µ − 3σ < X < µ + 3σ) = 0.9973.
So almost all of the probability is contained within 3 standard deviations of the mean. (This is sort of what Toyota is referring to when it brags about "six-sigma" quality.)
Fun Fact (6): The mgf is M_X(t) = exp(µt + σ²t²/2).
Proof: Calculus (or look it up in a table of integrals). □
ISYE 6739 — Goldsman 8/5/20 57 / 108
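The three probabilities in Fun Fact (5) can be reproduced from the erf form of the standard normal cdf; the helper name within_k_sigma is my own:

```python
import math

def within_k_sigma(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X.

    Equals 2*Phi(k) - 1, which simplifies to erf(k / sqrt(2)).
    """
    return math.erf(k / math.sqrt(2.0))

# Rounded to four places, these match the slide: 0.6827, 0.9545, 0.9973.
probs = [round(within_k_sigma(k), 4) for k in (1, 2, 3)]
```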
Normal Distribution: Basics
Theorem (Additive Property of Normals): If X_1, …, X_n are independent with X_i ∼ Nor(µ_i, σ_i²), i = 1, …, n, then for any constants a_1, …, a_n and b,
Y ≡ Σ_{i=1}^n a_i X_i + b ∼ Nor(Σ_{i=1}^n a_i µ_i + b, Σ_{i=1}^n a_i² σ_i²).
So a linear combination of independent normals is itself normal.
ISYE 6739 — Goldsman 8/5/20 58 / 108
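A quick Monte Carlo illustration of the theorem, with my own choice of example: Y = 2X₁ − 3X₂ + 1 where X₁ ∼ Nor(1, 4) and X₂ ∼ Nor(2, 9), so the theorem predicts Y ∼ Nor(2·1 − 3·2 + 1, 4·4 + 9·9) = Nor(−3, 97). Note that Python's random.gauss takes the standard deviation, not the variance:

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

N = 200_000
# X1 ~ Nor(1, 4) has sigma = 2; X2 ~ Nor(2, 9) has sigma = 3.
ys = [2 * random.gauss(1, 2) - 3 * random.gauss(2, 3) + 1 for _ in range(N)]

mean = sum(ys) / N                          # should be near -3
var = sum((y - mean) ** 2 for y in ys) / N  # should be near 97
```

The sample mean and variance land close to the theorem's prediction; the theorem additionally guarantees the sum is exactly normal, which a histogram of ys would show.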
Normal Distribution: Basics
Proof: Since Y is a linear function of the X_i's,
M_Y(t) = M_{Σ_i a_i X_i + b}(t)
= e^{tb} M_{Σ_i a_i X_i}(t)
= e^{tb} Π_{i=1}^n M_{a_i X_i}(t)    (X_i's independent)
= e^{tb} Π_{i=1}^n M_{X_i}(a_i t)    (mgf of a linear function)
= e^{tb} Π_{i=1}^n exp[µ_i a_i t + ½ σ_i² (a_i t)²]    (normal mgf)
= exp[(Σ_{i=1}^n a_i µ_i + b) t + ½ (Σ_{i=1}^n a_i² σ_i²) t²],
and we are done by mgf uniqueness. □
ISYE 6739 — Goldsman 8/5/20 59 / 108
Normal Distribution: Basics
Remark: A normal distribution is completely characterized by its mean and variance.
By the above, we know that a linear combination of independent normals is still normal. Therefore, when we add up independent normals, all we have to do is figure out the mean and variance — the normality of the sum comes for free.
Example: X ∼ Nor(3, 4), Y ∼ Nor(4, 6), and X, Y are independent. Find the distribution of 2X − 3Y.
Solution: This is normal with
E[2X − 3Y] = 2E[X] − 3E[Y] = 2(3) − 3(4) = −6
and
Var(2X − 3Y) = 4Var(X) + 9Var(Y) = 4(4) + 9(6) = 70.
Thus, 2X − 3Y ∼ Nor(−6, 70). □
ISYE 6739 — Goldsman 8/5/20 60 / 108
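The example above can be sketched in Python; the helper Phi and the follow-up probability P(2X − 3Y > 0) are my own additions for illustration:

```python
import math

# X ~ Nor(3, 4), Y ~ Nor(4, 6), independent; let D = 2X - 3Y.
mean_D = 2 * 3 - 3 * 4            # 2 E[X] - 3 E[Y] = -6
var_D = 2**2 * 4 + (-3)**2 * 6    # 4 Var(X) + 9 Var(Y) = 70

def Phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Since D ~ Nor(-6, 70), any probability about D reduces to Phi:
p_positive = 1.0 - Phi((0 - mean_D) / math.sqrt(var_D))  # P(D > 0)
```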
Normal Distribution: Basics
Corollary (of Additive Property Theorem):
X ∼ Nor(µ, σ²) ⇒ aX + b ∼ Nor(aµ + b, a²σ²).
Proof: Immediate from the Additive Property after noting that E[aX + b] = aµ + b and Var(aX + b) = a²σ². □
Corollary (of Corollary):
X ∼ Nor(µ, σ²) ⇒ Z ≡ (X − µ)/σ ∼ Nor(0, 1).
Proof: Use the above Corollary with a = 1/σ and b = −µ/σ. □
The manipulation described in this corollary is referred to as standardization.
ISYE 6739 — Goldsman 8/5/20 61 / 108
Standard Normal Distribution
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 62 / 108
Standard Normal Distribution
Lesson 4.8 — Standard Normal Distribution
Definition: The Nor(0, 1) is called the standard normal distribution, and is often denoted by Z.
The Nor(0, 1) is nice because there are tables available for its cdf.
You can standardize any normal RV X into a standard normal by applying the transformation Z = (X − µ)/σ. Then you can use the cdf tables.
The pdf of the Nor(0, 1) is
φ(z) ≡ (1/√(2π)) e^{−z²/2}, z ∈ ℝ.
The cdf is
Φ(z) ≡ ∫_{−∞}^{z} φ(t) dt, z ∈ ℝ.
ISYE 6739 — Goldsman 8/5/20 63 / 108
Standard Normal Distribution
Remarks: The following results are easy to derive, usually via symmetry arguments.
P(Z ≤ a) = Φ(a)
P(Z ≥ b) = 1 − Φ(b)
P(a ≤ Z ≤ b) = Φ(b) − Φ(a)
Φ(0) = 1/2
Φ(−b) = P(Z ≤ −b) = P(Z ≥ b) = 1 − Φ(b)
P(−b ≤ Z ≤ b) = Φ(b) − Φ(−b) = 2Φ(b) − 1
Then, for X ∼ Nor(µ, σ²) and Z = (X − µ)/σ,
P(µ − kσ ≤ X ≤ µ + kσ) = P(−k ≤ Z ≤ k) = 2Φ(k) − 1.
So the probability that any normal RV is within k standard deviations of its mean doesn't depend on the mean or variance.
ISYE 6739 — Goldsman 8/5/20 64 / 108
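These symmetry identities are easy to verify numerically via the erf form of Φ; the helper name Phi and the test point b = 1.3 are my own choices:

```python
import math

def Phi(z):
    """Standard normal cdf, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b = 1.3
sym = Phi(-b)                  # should equal 1 - Phi(b), by symmetry
two_sided = Phi(b) - Phi(-b)   # should equal 2*Phi(b) - 1
half = Phi(0.0)                # should be 1/2, since the pdf is symmetric about 0
```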
![Page 361: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/361.jpg)
Standard Normal Distribution
Famous Nor(0, 1) table values. You can memorize these, or you can use software calls, like NORMDIST in Excel (which calculates the cdf for any normal distribution).

| z | Φ(z) = P(Z ≤ z) |
|-------|------------------|
| 0.00 | 0.5000 |
| 1.00 | 0.8413 |
| 1.28 | 0.8997 ≈ 0.90 |
| 1.645 | 0.9500 |
| 1.96 | 0.9750 |
| 2.33 | 0.9901 ≈ 0.99 |
| 3.00 | 0.9987 |
| 4.00 | ≈ 1.0000 |
ISYE 6739 — Goldsman 8/5/20 65 / 108
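These table values can be reproduced in Python with `statistics.NormalDist` rather than a printed table; a small sketch:

```python
from statistics import NormalDist

Phi = NormalDist().cdf

# Rounded values from the table above; verify Phi reproduces each one.
table = {0.00: 0.5000, 1.00: 0.8413, 1.645: 0.9500, 1.96: 0.9750, 3.00: 0.9987}
for z, p in table.items():
    assert abs(Phi(z) - p) < 5e-4, (z, Phi(z))
print("table values reproduced")
```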
![Page 372: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/372.jpg)
Standard Normal Distribution
By the earlier “Fun Facts” and the discussion on the last two pages, the probability that any normal RV is within k standard deviations of its mean is

P(µ − kσ ≤ X ≤ µ + kσ) = 2Φ(k) − 1.

For k = 1, this probability is 2(0.8413) − 1 = 0.6827.

For k = 2, it is 2(0.9772) − 1 = 0.9545, so there is about a 95% chance that a normal observation will be within 2 s.d.’s of its mean.

And about 99.7% of all observations are within 3 standard deviations of the mean!
Finally, note that Motorola’s “six sigma” corresponds to 2Φ(6) − 1 ≈ 1.0000.
ISYE 6739 — Goldsman 8/5/20 66 / 108
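The k-sigma probabilities 2Φ(k) − 1 can be tabulated directly; a brief sketch using Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

Phi = NormalDist().cdf

# P(mu - k sigma <= X <= mu + k sigma) = 2 Phi(k) - 1, for ANY normal X.
for k in range(1, 7):
    print(f"k = {k}: {2 * Phi(k) - 1:.7f}")
# k = 1 gives ~0.6827, k = 2 ~0.9545, k = 3 ~0.9973, and k = 6 is ~1.
```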
![Page 378: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/378.jpg)
Standard Normal Distribution
Famous inverse Nor(0, 1) table values. Can also use software, such as Excel’s NORMINV function, which actually calculates inverses for any normal distribution, not just the standard normal.

Φ⁻¹(p) is the value of z such that Φ(z) = p; Φ⁻¹(p) is called the pth quantile of Z.

| p | Φ⁻¹(p) |
|-------|--------|
| 0.90 | 1.28 |
| 0.95 | 1.645 |
| 0.975 | 1.96 |
| 0.99 | 2.33 |
| 0.995 | 2.58 |
ISYE 6739 — Goldsman 8/5/20 67 / 108
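These quantiles can be computed with `NormalDist.inv_cdf` in Python's standard library (note it returns unrounded values, e.g. 2.326 where the table rounds to 2.33):

```python
from statistics import NormalDist

Z = NormalDist()
# inv_cdf returns the z with Phi(z) = p, i.e., the p-th quantile of Z.
for p in (0.90, 0.95, 0.975, 0.99, 0.995):
    print(f"{p}: {Z.inv_cdf(p):.3f}")
# Reproduces the table: 1.282, 1.645, 1.960, 2.326, 2.576.
```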
![Page 386: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/386.jpg)
Standard Normal Distribution
Example: X ∼ Nor(21, 4). Find P(19 < X < 22.5). Standardizing, we get

P(19 < X < 22.5)
= P((19 − µ)/σ < (X − µ)/σ < (22.5 − µ)/σ)
= P((19 − 21)/2 < Z < (22.5 − 21)/2)
= P(−1 < Z < 0.75)
= Φ(0.75) − Φ(−1)
= Φ(0.75) − [1 − Φ(1)]
= 0.7734 − [1 − 0.8413] = 0.6147. □
ISYE 6739 — Goldsman 8/5/20 68 / 108
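This calculation can be checked numerically; `statistics.NormalDist` lets us skip the manual standardization (remember that its `sigma` parameter is the standard deviation, not the variance):

```python
from statistics import NormalDist

# X ~ Nor(21, 4): mean 21, VARIANCE 4, so the standard deviation is 2.
X = NormalDist(mu=21, sigma=2)
p = X.cdf(22.5) - X.cdf(19)     # P(19 < X < 22.5)
print(round(p, 4))              # ≈ 0.6147, matching the hand calculation
```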
![Page 393: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/393.jpg)
Standard Normal Distribution
Example: Suppose that heights of men are M ∼ Nor(68, 4) and heights of women are W ∼ Nor(65, 1).

Select a man and a woman independently at random.

Find the probability that the woman is taller than the man.

Answer: Note that

W − M ∼ Nor(E[W − M], Var(W − M)) ∼ Nor(65 − 68, 1 + 4) ∼ Nor(−3, 5).

Then

P(W > M) = P(W − M > 0)
= P(Z > (0 + 3)/√5)
= 1 − Φ(3/√5) ≈ 1 − 0.910 = 0.090. □
ISYE 6739 — Goldsman 8/5/20 69 / 108
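A quick numerical check of this example (again using `statistics.NormalDist`; the distribution of D = W − M follows the derivation on the slide):

```python
from math import sqrt
from statistics import NormalDist

# W ~ Nor(65, 1) and M ~ Nor(68, 4) are independent, so
# D = W - M ~ Nor(65 - 68, 1 + 4) = Nor(-3, 5); the sd of D is sqrt(5).
D = NormalDist(mu=-3, sigma=sqrt(5))
p = 1 - D.cdf(0)                # P(W > M) = P(D > 0)
print(round(p, 3))              # ≈ 0.090
```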
![Page 403: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/403.jpg)
Sample Mean of Normals
Lesson 4.9 — Sample Mean of Normals
The sample mean of X1, . . . , Xn is X̄ ≡ (1/n) ∑ⁿᵢ₌₁ Xᵢ.

Corollary (of an old theorem): X1, . . . , Xn iid ∼ Nor(µ, σ²) ⇒ X̄ ∼ Nor(µ, σ²/n).

Proof: By previous work, as long as X1, . . . , Xn are iid something, we have E[X̄] = µ and Var(X̄) = σ²/n. Since X̄ is a linear combination of independent normals, it’s also normal. Done. □

Remark: This result is very significant! As the number of observations increases, Var(X̄) gets smaller (while E[X̄] remains constant), so X̄ concentrates around µ; this phenomenon underlies the Law of Large Numbers.

In the upcoming statistics portion of the course, we’ll learn that this makes X̄ an excellent estimator for the mean µ, which is typically unknown in practice.
ISYE 6739 — Goldsman 8/5/20 71 / 108
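The σ²/n shrinkage can also be seen empirically. A small simulation sketch (the parameters µ = 10, σ = 3, n = 25, and the replication count are arbitrary choices of ours, not from the lecture):

```python
import random
from statistics import fmean, variance

random.seed(42)                     # arbitrary seed, for reproducibility

mu, sigma, n, reps = 10.0, 3.0, 25, 20_000
# Each replication: draw n iid Nor(mu, sigma^2) observations, record the sample mean.
xbars = [fmean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(reps)]

print(round(fmean(xbars), 2))       # near mu = 10
print(round(variance(xbars), 3))    # near sigma^2 / n = 9 / 25 = 0.36
```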
![Page 404: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/404.jpg)
Sample Mean of Normals
Lesson 4.9 — Sample Mean of Normals
The sample mean of X1, . . . , Xn is X̄ ≡∑n
i=1Xi/n.
Corollary (of old theorem):X1, . . . , Xn
iid∼ Nor(µ, σ2)⇒ X̄ ∼ Nor(µ, σ2/n).
Proof: By previous work, as long as X1, . . . , Xn are iid something, we haveE[X̄] = µ and Var(X̄) = σ2/n. Since X̄ is a linear combination ofindependent normals, it’s also normal. Done. 2
Remark: This result is very significant! As the number of observationsincreases, Var(X̄) gets smaller (while E[X̄] remains constant). In fact, it’scalled the Law of Large Numbers.
In the upcoming statistics portion of the course, we’ll learn that this makes X̄an excellent estimator for the mean µ, which is typically unknown inpractice.
ISYE 6739 — Goldsman 8/5/20 71 / 108
![Page 405: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/405.jpg)
Sample Mean of Normals
Lesson 4.9 — Sample Mean of Normals
The sample mean of X1, . . . , Xn is X̄ ≡∑n
i=1Xi/n.
Corollary (of old theorem):X1, . . . , Xn
iid∼ Nor(µ, σ2)⇒ X̄ ∼ Nor(µ, σ2/n).
Proof: By previous work, as long as X1, . . . , Xn are iid something, we haveE[X̄] = µ and Var(X̄) = σ2/n. Since X̄ is a linear combination ofindependent normals, it’s also normal. Done. 2
Remark: This result is very significant! As the number of observationsincreases, Var(X̄) gets smaller (while E[X̄] remains constant). In fact, it’scalled the Law of Large Numbers.
In the upcoming statistics portion of the course, we’ll learn that this makes X̄an excellent estimator for the mean µ, which is typically unknown inpractice.
ISYE 6739 — Goldsman 8/5/20 71 / 108
Sample Mean of Normals
Example: Suppose that X1, . . . , Xn iid∼ Nor(µ, 16). Find the sample size n such that

P(|X̄ − µ| ≤ 1) ≥ 0.95.

How many observations should you take so that X̄ will have a good chance of being close to µ?

Solution: Note that X̄ ∼ Nor(µ, 16/n). Then

P(|X̄ − µ| ≤ 1) = P(−1 ≤ X̄ − µ ≤ 1)
               = P(−1/(4/√n) ≤ (X̄ − µ)/(4/√n) ≤ 1/(4/√n))
               = P(−√n/4 ≤ Z ≤ √n/4)
               = 2Φ(√n/4) − 1.
ISYE 6739 — Goldsman 8/5/20 72 / 108
![Page 418: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/418.jpg)
Sample Mean of Normals
Now we have to find n such that this probability is at least 0.95. . . .

2Φ(√n/4) − 1 ≥ 0.95 iff
Φ(√n/4) ≥ 0.975 iff
√n/4 ≥ Φ⁻¹(0.975) = 1.96 iff
n ≥ (4 × 1.96)² = 61.47, i.e., n = 62.

So if you take the average of 62 observations, then X̄ has a 95% chance of being within 1 of µ. □
ISYE 6739 — Goldsman 8/5/20 73 / 108
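This sample-size calculation is easy to reproduce numerically; a minimal sketch using only Python's standard library (the variable names are mine, not from the slides):

```python
import math
from statistics import NormalDist

sigma, eps, target = 4.0, 1.0, 0.95           # sd of each Xi, tolerance, desired coverage

# Solve 2*Phi(sqrt(n)*eps/sigma) - 1 >= target for n.
z = NormalDist().inv_cdf((1 + target) / 2)    # Phi^{-1}(0.975), about 1.96
n_exact = (z * sigma / eps) ** 2              # about 61.5
n = math.ceil(n_exact)                        # smallest integer sample size that works

# Verify the coverage with n observations: X-bar ~ Nor(mu, sigma^2/n).
coverage = 2 * NormalDist(0, sigma / math.sqrt(n)).cdf(eps) - 1
print(n, round(coverage, 4))
```

The same three lines solve any such tolerance problem: pick the z-quantile, square the ratio, round up.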
![Page 424: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/424.jpg)
The Central Limit Theorem + Proof
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 74 / 108
![Page 425: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/425.jpg)
The Central Limit Theorem + Proof
Lesson 4.10 — The Central Limit Theorem + Proof

The Central Limit Theorem is the most important theorem in probability and statistics!

CLT: Suppose X1, . . . , Xn are iid with E[Xi] = µ and Var(Xi) = σ². Then as n → ∞,

Zn = (∑ᵢ₌₁ⁿ Xi − nµ)/(σ√n) = (X̄ − µ)/(σ/√n) = (X̄ − E[X̄])/√Var(X̄) →d Nor(0, 1),

where "→d" means that the cdf of Zn converges to the Nor(0, 1) cdf.
ISYE 6739 — Goldsman 8/5/20 75 / 108
![Page 432: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/432.jpg)
The Central Limit Theorem + Proof
Slightly Honors (Informal) Proof: Suppose that the mgf M_X(t) of the Xi's exists and satisfies certain technical conditions that you don't need to know about. (OK, M_X(t) has to exist in a neighborhood of t = 0, among other things.)

Moreover, without loss of generality (since we're standardizing anyway) and for notational convenience, we'll assume that µ = 0 and σ² = 1, so that Zn = ∑ᵢ₌₁ⁿ Xi/√n.

We will be done if we can show that the mgf of Zn converges to the mgf of Z ∼ Nor(0, 1), i.e., we need to show that M_Zn(t) → e^(t²/2) as n → ∞.

To get things going, the mgf of Zn is

M_Zn(t) = M_(∑Xi/√n)(t)
        = M_(∑Xi)(t/√n)   (mgf of a linear function of a RV)
        = [M_X(t/√n)]ⁿ    (Xi's are iid).
ISYE 6739 — Goldsman 8/5/20 76 / 108
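The target limit [M_X(t/√n)]ⁿ → e^(t²/2) can be watched numerically for a concrete standardized X. A sketch using a Rademacher variable (X = ±1 with probability 1/2, so µ = 0, σ² = 1, and M_X(t) = cosh(t)) — my choice of example distribution, not the lecture's:

```python
import math

def mgf_zn(t, n):
    """mgf of Z_n = (X_1 + ... + X_n)/sqrt(n) for Rademacher X_i: [cosh(t/sqrt(n))]^n."""
    return math.cosh(t / math.sqrt(n)) ** n

t = 1.5
target = math.exp(t ** 2 / 2)                 # mgf of Nor(0, 1) evaluated at t
for n in (10, 100, 10_000, 1_000_000):
    print(n, round(mgf_zn(t, n), 5), round(target, 5))
```

As n grows, the printed values approach e^(t²/2), exactly as the proof predicts.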
![Page 438: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/438.jpg)
The Central Limit Theorem + Proof
Thus, taking logs, our goal is to show that

lim_{n→∞} n ℓn(M_X(t/√n)) = t²/2.

If we let y = 1/√n, our revised goal is to show that

lim_{y→0} ℓn(M_X(ty))/y² = t²/2.

Before proceeding further, note that

lim_{y→0} ℓn(M_X(ty)) = ℓn(M_X(0)) = ℓn(1) = 0    (3)

and

lim_{y→0} M′_X(ty) = M′_X(0) = E[X] = µ = 0,    (4)

where the last equality is from our standardization assumption.
ISYE 6739 — Goldsman 8/5/20 77 / 108
![Page 444: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/444.jpg)
The Central Limit Theorem + Proof
So after all of this build-up, we have

lim_{y→0} ℓn(M_X(ty))/y²
    = lim_{y→0} tM′_X(ty)/(2yM_X(ty))                 (by (3) and L'Hôpital, to deal with 0/0)
    = lim_{y→0} t²M″_X(ty)/(2M_X(ty) + 2ytM′_X(ty))   (by (4) and L'Hôpital again)
    = t²M″_X(0)/(2M_X(0) + 0)
    = t²E[X²]/2
    = t²/2    (since E[X²] = Var(X) + (E[X])² = σ² + µ² = 1). □
ISYE 6739 — Goldsman 8/5/20 78 / 108
![Page 449: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/449.jpg)
The Central Limit Theorem + Proof
Remarks: We have a lot to say about such an important theorem.
(1) If n is large, then X̄ ≈ Nor(µ, σ²/n).

(2) The Xi's don't have to be normal for the CLT to work! It even works on discrete distributions!

(3) You usually need n ≥ 30 observations for the approximation to work well. (Fewer observations suffice if the Xi's come from a symmetric distribution.)

(4) You can almost always use the CLT if the observations are iid.

(5) In fact, there are versions of the CLT that are a lot more general than the theorem presented here!
ISYE 6739 — Goldsman 8/5/20 79 / 108
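Remark (2) is easy to check by simulation: standardize sample means of a discrete distribution and see that they behave like a standard normal. A minimal sketch with Bernoulli(0.3) data (the parameters n = 50, p = 0.3 are illustrative, not from the slides):

```python
import math
import random
import statistics

random.seed(1)
n, p, reps = 50, 0.3, 20_000                  # sample size, success prob, replications
mu, sigma = p, math.sqrt(p * (1 - p))         # Bern(p): mean p, variance p(1-p)

# Standardized sample means (X-bar - mu)/(sigma/sqrt(n)) over many replications.
zs = [(statistics.fmean(random.random() < p for _ in range(n)) - mu)
      / (sigma / math.sqrt(n)) for _ in range(reps)]

# By the CLT these should have mean near 0 and standard deviation near 1.
print(round(statistics.fmean(zs), 3), round(statistics.pstdev(zs), 3))
```

Even though each Xi takes only the values 0 and 1, the standardized sample means already look approximately Nor(0, 1) at n = 50.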
![Page 455: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/455.jpg)
Central Limit Theorem Examples
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 80 / 108
![Page 456: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/456.jpg)
Central Limit Theorem Examples
Lesson 4.11 — Central Limit Theorem Examples
Example: To show that the Central Limit Theorem really works, let's add up just n = 12 iid Unif(0, 1)'s, U1, . . . , Un. Let Sn = ∑ᵢ₌₁ⁿ Ui. Note that E[Sn] = nE[Ui] = n/2 and Var(Sn) = nVar(Ui) = n/12. Therefore,

Z12 ≡ (S12 − 12/2)/√(12/12) = S12 − 6 ≈ Nor(0, 1).

The histogram was compiled using 100,000 simulations of Z12. It works! □
ISYE 6739 — Goldsman 8/5/20 81 / 108
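The simulation behind the histogram described above is easy to reproduce; a minimal sketch in Python (the histogram figure itself is not reproduced here):

```python
import random
import statistics

random.seed(2)
reps = 100_000

# Z_12 = U_1 + ... + U_12 - 6, which should be approximately Nor(0, 1).
zs = [sum(random.random() for _ in range(12)) - 6 for _ in range(reps)]

print(round(statistics.fmean(zs), 3))          # should be near 0
print(round(statistics.pstdev(zs), 3))         # should be near 1
print(sum(abs(z) <= 1.96 for z in zs) / reps)  # should be near 0.95
```

Adding just 12 uniforms and shifting by 6 was once a standard trick for generating approximate standard normals, precisely because the variance of the sum is exactly 12 × (1/12) = 1.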
![Page 461: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/461.jpg)
Central Limit Theorem Examples
Example: Suppose X1, . . . , X100 are iid Exp(1/1000). Find P(950 ≤ X̄ ≤ 1050).

Solution: Recall that if Xi ∼ Exp(λ), then E[Xi] = 1/λ and Var(Xi) = 1/λ².

Further, if X̄ is the sample mean based on n observations, then

E[X̄] = E[Xi] = 1/λ and Var(X̄) = Var(Xi)/n = 1/(nλ²).

For our problem, λ = 1/1000 and n = 100, so that E[X̄] = 1000 and Var(X̄) = 10000.
ISYE 6739 — Goldsman 8/5/20 82 / 108
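Before standardizing, it may help to see the probability estimated directly by Monte Carlo. A short sketch (constant names are my own) repeatedly averages 100 exponentials with mean 1000 and counts how often the average lands in [950, 1050]; the estimate comes out near 0.383, the CLT answer worked out on the next slide:

```python
import random

random.seed(42)

LAM = 1 / 1000   # rate parameter, so E[X_i] = 1/LAM = 1000
N = 100          # observations per sample mean
REPS = 20_000    # Monte Carlo replications

hits = 0
for _ in range(REPS):
    # random.expovariate takes the rate, so each draw has mean 1000
    xbar = sum(random.expovariate(LAM) for _ in range(N)) / N
    if 950 <= xbar <= 1050:
        hits += 1

p_hat = hits / REPS
print(p_hat)  # near 0.383
```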
![Page 468: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/468.jpg)
Central Limit Theorem Examples
So by the CLT,

P(950 ≤ X̄ ≤ 1050)
= P( (950 − E[X̄])/√Var(X̄) ≤ (X̄ − E[X̄])/√Var(X̄) ≤ (1050 − E[X̄])/√Var(X̄) )
≈ P( (950 − 1000)/100 ≤ Z ≤ (1050 − 1000)/100 )
= P(−1/2 ≤ Z ≤ 1/2)
= 2Φ(1/2) − 1 = 0.383. □

Remark: This problem can be solved exactly if we have access to the Excel Erlang cdf function GAMMADIST. And what do you know, you end up with exactly the same answer of 0.383!
ISYE 6739 — Goldsman 8/5/20 83 / 108
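The exact Erlang calculation from the remark can be reproduced without Excel. For integer n, the Erlang(n, λ) cdf has the closed form P(Sn ≤ t) = 1 − e^{−λt} ∑_{k=0}^{n−1} (λt)^k/k!, and X̄ ∈ [950, 1050] exactly when S100 ∈ [95000, 105000]. A stdlib-only sketch (helper names are my own; this stands in for the GAMMADIST call):

```python
import math

def phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def erlang_cdf(t, n, lam):
    """P(S_n <= t) for S_n ~ Erlang(n, lam): 1 - e^{-lam*t} * sum_{k<n} (lam*t)^k/k!."""
    term = math.exp(-lam * t)   # k = 0 term of the Poisson-style sum
    total = term
    for k in range(1, n):
        term *= lam * t / k     # build (lam*t)^k/k! iteratively to avoid overflow
        total += term
    return 1.0 - total

# CLT answer: P(-1/2 <= Z <= 1/2) = 2*Phi(1/2) - 1
clt = 2 * phi(0.5) - 1
print(round(clt, 3))  # 0.383

# Exact answer via the Erlang cdf of S_100 ~ Erlang(100, 1/1000)
exact = erlang_cdf(105_000, 100, 1 / 1000) - erlang_cdf(95_000, 100, 1 / 1000)
print(round(exact, 3))  # 0.383, matching the slide's remark
```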
![Page 473: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/473.jpg)
Central Limit Theorem Examples
Example: Suppose X1, . . . , X100 are iid from some distribution with mean 1000 and standard deviation 1000. Find P(950 ≤ X̄ ≤ 1050).

Solution: By exactly the same manipulations as in the previous example, the answer ≈ 0.383.

Notice that we didn't care whether or not the data came from an exponential distribution. We just needed the mean and variance. □
ISYE 6739 — Goldsman 8/5/20 84 / 108
![Page 477: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/477.jpg)
Central Limit Theorem Examples
Normal Approximation to the Binomial
Suppose Y ∼ Bin(n, p), where n is very large. In such cases, we usually approximate the Binomial via an appropriate Normal distribution.

The CLT applies since Y = ∑_{i=1}^n Xi, where the Xi's are iid Bern(p). Then

(Y − E[Y])/√Var(Y) = (Y − np)/√(npq) ≈ Nor(0, 1).

The usual rule of thumb for the Normal approximation to the Binomial is that it works pretty well as long as np ≥ 5 and nq ≥ 5.
ISYE 6739 — Goldsman 8/5/20 85 / 108
![Page 482: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/482.jpg)
Central Limit Theorem Examples
Why do we need such an approximation?
Example: Suppose Y ∼ Bin(100, 0.8) and we want

P(Y ≥ 84) = ∑_{i=84}^{100} (100 choose i) (0.8)^i (0.2)^{100−i}.

Good luck with the binomial coefficients (they're too big) and the number of terms to sum up (it's going to get tedious). I'll come back to visit you in an hour.

The next example shows how to use the approximation.

Note that it incorporates a "continuity correction" to account for the fact that the Binomial is discrete while the Normal is continuous. If you don't want to use it, don't worry too much.
ISYE 6739 — Goldsman 8/5/20 86 / 108
![Page 487: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/487.jpg)
Central Limit Theorem Examples
Example: The Braves play 100 independent baseball games, each of which they have probability 0.8 of winning. What's the probability that they win ≥ 84?

Y ∼ Bin(100, 0.8) and we want P(Y ≥ 84) (as in the last example). . .

P(Y ≥ 84) = P(Y ≥ 83.5) ("continuity correction")
≈ P(Z ≥ (83.5 − np)/√(npq)) (CLT)
= P(Z ≥ (83.5 − 80)/√16)
= P(Z ≥ 0.875) = 0.1908.

The actual answer (using the true Bin(100, 0.8) distribution) turns out to be 0.1923 — pretty close! 2
ISYE 6739 — Goldsman 8/5/20 87 / 108
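Both numbers in this example are easy to reproduce. A quick stdlib-only sketch (helper name `phi` is my own) computes the exact Binomial tail, which is no trouble for a computer even though it's hopeless by hand, and the continuity-corrected Normal approximation:

```python
import math

def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p = 100, 0.8
q = 1 - p

# Exact tail: P(Y >= 84) = sum of Bin(100, 0.8) pmf values for i = 84..100
exact = sum(math.comb(n, i) * p**i * q**(n - i) for i in range(84, n + 1))
print(exact)  # about 0.1923, the slide's exact answer

# Normal approximation with continuity correction: P(Z >= (83.5 - np)/sqrt(npq))
approx = 1.0 - phi((83.5 - n * p) / math.sqrt(n * p * q))
print(approx)  # about 0.1908
```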
![Page 493: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/493.jpg)
Extensions — Multivariate Normal Distribution
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 88 / 108
![Page 494: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/494.jpg)
Extensions — Multivariate Normal Distribution
Lesson 4.12 — Extensions — Multivariate Normal Distribution
Definition: (X, Y) has the Bivariate Normal distribution if it has pdf

f(x, y) = C exp{ −[z_X²(x) − 2ρ z_X(x) z_Y(y) + z_Y²(y)] / [2(1 − ρ²)] },

where

ρ ≡ Corr(X, Y), C ≡ 1/(2π σ_X σ_Y √(1 − ρ²)),

z_X(x) ≡ (x − µ_X)/σ_X, and z_Y(y) ≡ (y − µ_Y)/σ_Y.
ISYE 6739 — Goldsman 8/5/20 89 / 108
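The pdf is nasty but mechanical to transcribe. A short Python sketch (function names are my own) codes it up and sanity-checks one known property: when ρ = 0, the joint pdf factors into the product of the two Normal marginal pdfs.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate Normal(mu, sigma^2) pdf."""
    return math.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

def bvn_pdf(x, y, mux, muy, sx, sy, rho):
    """Bivariate Normal pdf, transcribed term by term from the definition above."""
    zx = (x - mux) / sx
    zy = (y - muy) / sy
    c = 1.0 / (2 * math.pi * sx * sy * math.sqrt(1 - rho**2))
    return c * math.exp(-(zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2)))

# Sanity check: with rho = 0, f(x, y) should equal f_X(x) * f_Y(y)
lhs = bvn_pdf(0.3, -0.2, 0, 0, 1, 1, 0.0)
rhs = normal_pdf(0.3, 0, 1) * normal_pdf(-0.2, 0, 1)
print(abs(lhs - rhs))  # essentially 0
```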
![Page 498: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/498.jpg)
Extensions — Multivariate Normal Distribution
Pretty nasty joint pdf, eh?

In fact, X ∼ Nor(µ_X, σ_X²) and Y ∼ Nor(µ_Y, σ_Y²).

Example: (X, Y) could be a person's (height, weight). The two quantities are marginally normal, but positively correlated.

If you want to calculate bivariate normal probabilities, you'll need to evaluate quantities like

P(a < X < b, c < Y < d) = ∫_c^d ∫_a^b f(x, y) dx dy,

which will probably require numerical integration techniques.
ISYE 6739 — Goldsman 8/5/20 90 / 108
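One way to carry out that numerical integration is a plain midpoint rule over the rectangle (a library routine like scipy's dblquad would also do the job). The sketch below (function names are my own) uses the standard bivariate case µ = 0, σ = 1 and checks the ρ = 0 case against the known product answer (2Φ(1) − 1)²:

```python
import math

def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bvn_pdf(x, y, rho):
    """Standard bivariate Normal pdf (both means 0, both variances 1)."""
    c = 1.0 / (2 * math.pi * math.sqrt(1 - rho**2))
    return c * math.exp(-(x**2 - 2 * rho * x * y + y**2) / (2 * (1 - rho**2)))

def bvn_prob(a, b, c, d, rho, m=400):
    """Midpoint-rule approximation of P(a < X < b, c < Y < d) on an m-by-m grid."""
    hx, hy = (b - a) / m, (d - c) / m
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * hx
        for j in range(m):
            y = c + (j + 0.5) * hy
            total += bvn_pdf(x, y, rho)
    return total * hx * hy

# Check: with rho = 0, P(-1 < X < 1, -1 < Y < 1) = (2*Phi(1) - 1)^2
est = bvn_prob(-1, 1, -1, 1, 0.0)
truth = (2 * phi(1) - 1) ** 2
print(est, truth)
```

For ρ ≠ 0, `bvn_prob(-1, 1, -1, 1, 0.6)` works the same way; only the closed-form cross-check disappears.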
![Page 503: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/503.jpg)
Extensions — Multivariate Normal Distribution
Fun Fact (which will come up later when we discuss regression): The conditional distribution of Y given that X = x is also normal. In particular,

Y | X = x ∼ Nor( µ_Y + ρ(σ_Y/σ_X)(x − µ_X), σ_Y²(1 − ρ²) ).

Information about X helps to update the distribution of Y.

Example: Consider students at a university. Let X be their combined SAT scores (Math and Verbal), and Y their freshman GPA (out of 4). Suppose a study reveals that

µ_X = 1300, µ_Y = 2.3, σ_X² = 6400, σ_Y² = 0.25, ρ = 0.6.

Find P(Y ≥ 2 | X = 900).
ISYE 6739 — Goldsman 8/5/20 91 / 108
![Page 510: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/510.jpg)
Extensions — Multivariate Normal Distribution

First,

E[Y | X = 900] = µ_Y + ρ(σ_Y/σ_X)(x − µ_X)
             = 2.3 + (0.6)(√(0.25/6400))(900 − 1300) = 0.8,

indicating that the expected GPA of a kid with a 900 SAT will be 0.8.

Second,

Var(Y | X = 900) = σ²_Y(1 − ρ²) = 0.16.

Thus, Y | X = 900 ∼ Nor(0.8, 0.16).

Now we can calculate

P(Y ≥ 2 | X = 900) = P(Z ≥ (2 − 0.8)/√0.16) = 1 − Φ(3) = 0.0013.

This guy doesn't have much chance of having a good GPA. □

ISYE 6739 — Goldsman 8/5/20 92 / 108
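The conditional-mean, conditional-variance, and tail-probability steps above can be checked numerically. A minimal sketch using only the Python standard library (Φ is computed from the error function):

```python
import math

def std_normal_cdf(z):
    """Phi(z) via the error function: Phi(z) = (1 + erf(z/sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Study parameters from the slide.
mu_x, mu_y = 1300.0, 2.3
var_x, var_y = 6400.0, 0.25
rho = 0.6

def conditional_normal(x):
    """Mean and variance of Y | X = x for the bivariate normal pair."""
    mean = mu_y + rho * math.sqrt(var_y / var_x) * (x - mu_x)
    var = var_y * (1.0 - rho ** 2)
    return mean, var

m, v = conditional_normal(900.0)                      # conditional mean, variance
p = 1.0 - std_normal_cdf((2.0 - m) / math.sqrt(v))    # P(Y >= 2 | X = 900)
print(round(m, 4), round(v, 4), round(p, 4))          # 0.8 0.16 0.0013
```

This reproduces the slide's numbers: a conditional distribution of Nor(0.8, 0.16) and a probability of about 0.0013.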
Extensions — Multivariate Normal Distribution

The bivariate normal distribution is easily generalized to the multivariate case.

Honors Definition: The random vector X = (X_1, …, X_k)ᵀ has the multivariate normal distribution with mean vector µ = (µ_1, …, µ_k)ᵀ and k × k covariance matrix Σ = (σ_ij) if it has multivariate pdf

f(x) = 1/((2π)^{k/2} |Σ|^{1/2}) exp{−(x − µ)ᵀ Σ⁻¹ (x − µ)/2},  x ∈ ℝᵏ,

where |Σ| and Σ⁻¹ are the determinant and inverse of Σ, respectively.

It turns out that

E[X_i] = µ_i,  Var(X_i) = σ_ii,  Cov(X_i, X_j) = σ_ij.

Notation: X ∼ Nor_k(µ, Σ).

ISYE 6739 — Goldsman 8/5/20 93 / 108
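The pdf in the Honors Definition can be evaluated directly from the determinant and inverse of Σ. A sketch using numpy, illustrated with a covariance matrix built from the SAT/GPA example (the off-diagonal entry σ_12 = ρ σ_X σ_Y is implied by the earlier parameters):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density f(x) for X ~ Nor_k(mu, Sigma)."""
    mu = np.asarray(mu, float)
    diff = np.asarray(x, float) - mu
    det = np.linalg.det(Sigma)              # |Sigma|
    inv = np.linalg.inv(Sigma)              # Sigma^{-1}
    norm_const = (2.0 * np.pi) ** (len(mu) / 2) * np.sqrt(det)
    return float(np.exp(-0.5 * diff @ inv @ diff) / norm_const)

# Bivariate case from the SAT/GPA example:
# sigma_X = 80, sigma_Y = 0.5, rho = 0.6, so sigma_12 = 0.6 * 80 * 0.5 = 24.
Sigma = np.array([[6400.0, 24.0],
                  [24.0, 0.25]])
print(mvn_pdf([1300.0, 2.3], [1300.0, 2.3], Sigma))   # density at the mean
```

At the mean the exponential term is 1, so the density reduces to 1/(2π|Σ|^{1/2}), which gives a quick sanity check on the implementation.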
Extensions — Lognormal Distribution

1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff

ISYE 6739 — Goldsman 8/5/20 94 / 108
Extensions — Lognormal Distribution

Lesson 4.13 — Extensions — Lognormal Distribution

Definition: If Y ∼ Nor(µ_Y, σ²_Y), then X ≡ e^Y has the Lognormal distribution with parameters (µ_Y, σ²_Y). This distribution has tremendous uses, e.g., in the pricing of certain stock options.

Turns Out: The pdf and moments of the lognormal are

f(x) = 1/(x σ_Y √(2π)) exp{−[ln(x) − µ_Y]²/(2σ²_Y)},  x > 0,

E[X] = exp{µ_Y + σ²_Y/2},

Var(X) = exp{2µ_Y + σ²_Y}(exp{σ²_Y} − 1).

ISYE 6739 — Goldsman 8/5/20 95 / 108
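The moment formulas above are easy to sanity-check by simulating X = e^Y directly. A standard-library sketch (the parameter values µ_Y = 0, σ²_Y = 0.25 are illustrative, not from the slides):

```python
import math
import random

def lognormal_moments(mu_y, var_y):
    """Closed-form mean and variance of X = exp(Y), Y ~ Nor(mu_y, var_y)."""
    mean = math.exp(mu_y + var_y / 2.0)
    var = math.exp(2.0 * mu_y + var_y) * (math.exp(var_y) - 1.0)
    return mean, var

# Monte Carlo check: generate Y ~ Nor(0, 0.25) and exponentiate.
random.seed(1)
mu_y, var_y = 0.0, 0.25
xs = [math.exp(random.gauss(mu_y, math.sqrt(var_y))) for _ in range(200_000)]
sim_mean = sum(xs) / len(xs)

exact_mean, exact_var = lognormal_moments(mu_y, var_y)
print(exact_mean, sim_mean)   # both close to exp(0.125) ≈ 1.133
```

Note that E[X] = exp{µ_Y + σ²_Y/2} exceeds exp{µ_Y}: exponentiating a normal shifts the mean upward, which the simulated average confirms.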
Extensions — Lognormal Distribution

Example: Suppose Y ∼ Nor(10, 4) and let X = e^Y. Then

P(X ≤ 1000) = P(Y ≤ ln(1000)) = P(Z ≤ (ln(1000) − 10)/2) = Φ(−1.55) = 0.061. □

Honors Example (How to Win a Nobel Prize): It is well known that stock prices are closely related to the lognormal distribution. In fact, it's common to use the following model for a stock price at a fixed time t,

S(t) = S(0) exp{(µ − σ²/2)t + σ√t Z},  t ≥ 0,

where µ is related to the "drift" of the stock price (i.e., the natural rate of increase), σ is its "volatility" (how much the stock bounces around), S(0) is the initial price, and Z is a standard normal RV.

ISYE 6739 — Goldsman 8/5/20 96 / 108
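Both pieces of this slide (the lognormal tail probability and the stock-price model) can be coded in a few lines with the standard library:

```python
import math

def std_normal_cdf(z):
    """Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(X <= 1000) for X = e^Y, Y ~ Nor(10, 4): reduce to a normal probability.
mu_y, sigma_y = 10.0, 2.0
p = std_normal_cdf((math.log(1000.0) - mu_y) / sigma_y)
print(round(p, 3))   # 0.061

def stock_price(s0, mu, sigma, t, z):
    """Lognormal stock-price model S(t) from the slide, with Z = z."""
    return s0 * math.exp((mu - sigma ** 2 / 2.0) * t + sigma * math.sqrt(t) * z)
```

Feeding a random Z ∼ Nor(0, 1) into `stock_price` simulates one possible price at time t; with z = 0 it returns the median price S(0)e^{(µ − σ²/2)t}.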
Extensions — Lognormal Distribution

An active area of finance is to estimate option prices. For example, a so-called European call option C permits its owner, who pays an up-front fee for the privilege, to purchase the stock at a pre-agreed strike price k, at a pre-determined expiry date T.

For instance, suppose IBM is currently selling for $100 a share. If I think that the stock will go up in value, I may want to pay $3/share now for the right to buy IBM at $105 three months from now.

If IBM is worth $120 three months from now, I'll be able to buy it for only $105, and will have made a profit of $120 − $105 − $3 = $12.

If IBM is selling for $107 three months hence, I can still buy it for $105, and will lose $1 (recouping $2 from my original option purchase).

If IBM is selling for $95, then I won't exercise my option, and will walk away with my tail between my legs having lost my original $3.

ISYE 6739 — Goldsman 8/5/20 97 / 108
Extensions — Lognormal Distribution

So what is the option worth (and what should I pay for it)? Its expected value is given by

E[C] = e^{−rT} E[(S(T) − k)⁺],

where

x⁺ ≡ max{0, x};

r, the "risk-free" interest rate (e.g., what you can get from a U.S. Treasury bond), is used instead of the drift µ; and

the term e^{−rT} denotes the time-value of money, i.e., a depreciation term corresponding to the interest I could've made had I used my money to buy a Treasury note.

Black and Scholes won a Nobel Prize for calculating E[C]. We'll get the same answer via a different method — but, alas, no Nobel Prize.

ISYE 6739 — Goldsman 8/5/20 98 / 108
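Before any algebra, the discounted expected payoff e^{−rT} E[(S(T) − k)⁺] can be estimated by plain Monte Carlo: simulate S(T) from the lognormal model with drift r, average the payoffs, and discount. A standard-library sketch (the parameter values S(0) = 100, k = 105, r = 0.05, σ = 0.2, T = 0.25 are illustrative, not from the slides):

```python
import math
import random

def mc_call_price(s0, k, r, sigma, T, n=200_000, seed=1):
    """Monte Carlo estimate of E[C] = e^{-rT} E[(S(T) - k)^+]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # Lognormal model with the risk-free rate r in place of the drift mu.
        s_T = s0 * math.exp((r - sigma ** 2 / 2.0) * T + sigma * math.sqrt(T) * z)
        total += max(0.0, s_T - k)          # the (x)^+ payoff
    return math.exp(-r * T) * total / n     # discount by the time value of money

print(mc_call_price(100.0, 105.0, 0.05, 0.2, 0.25))
```

With k = 0 the option is always exercised and the estimate should be close to S(0) = 100, a useful sanity check on the simulator.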
Extensions — Lognormal Distribution

E[C] = e^{−rT} E[(S(0) exp{(r − σ²/2)T + σ√T Z} − k)⁺]

     = e^{−rT} ∫_{−∞}^{∞} (S(0) exp{(r − σ²/2)T + σ√T z} − k)⁺ φ(z) dz
       (via the standard conditioning argument)

     = S(0) Φ(b + σ√T) − k e^{−rT} Φ(b)  (after lots of algebra),

where φ(·) and Φ(·) are the Nor(0,1) pdf and cdf, respectively, and

b ≡ [rT − σ²T/2 − ln(k/S(0))] / (σ√T).

There are many generalizations of this problem that are used in practical finance problems, but this is the starting point. Meanwhile, get your tickets to Norway or Sweden or wherever they give out the Nobel!

ISYE 6739 — Goldsman 8/5/20 99 / 108
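The closed-form result transcribes directly into code: compute b, then the two Φ terms. A standard-library sketch (the same illustrative parameters as before, S(0) = 100, k = 105, r = 0.05, σ = 0.2, T = 0.25):

```python
import math

def std_normal_cdf(z):
    """Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call_price(s0, k, r, sigma, T):
    """European call price S(0)*Phi(b + sigma*sqrt(T)) - k*e^{-rT}*Phi(b)."""
    b = (r * T - sigma ** 2 * T / 2.0 - math.log(k / s0)) / (sigma * math.sqrt(T))
    return (s0 * std_normal_cdf(b + sigma * math.sqrt(T))
            - k * math.exp(-r * T) * std_normal_cdf(b))

print(bs_call_price(100.0, 105.0, 0.05, 0.2, 0.25))
```

Two properties worth checking: the price increases with the volatility σ (more uncertainty makes the one-sided payoff more valuable), and it can never exceed the current stock price S(0).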
![Page 555: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/555.jpg)
Computer Stuff
1 Bernoulli and Binomial Distributions
2 Hypergeometric Distribution
3 Geometric and Negative Binomial Distributions
4 Poisson Distribution
5 Uniform, Exponential, and Friends
6 Other Continuous Distributions
7 Normal Distribution: Basics
8 Standard Normal Distribution
9 Sample Mean of Normals
10 The Central Limit Theorem + Proof
11 Central Limit Theorem Examples
12 Extensions — Multivariate Normal Distribution
13 Extensions — Lognormal Distribution
14 Computer Stuff
ISYE 6739 — Goldsman 8/5/20 100 / 108
![Page 556: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/556.jpg)
Computer Stuff
Lesson 4.14 — Computer Stuff
Evaluating pmf’s / pdf’s and cdf’s
We can use various computer packages such as Excel, Minitab, R, SAS, etc., to calculate pmf's / pdf's and cdf's for a variety of common distributions. For instance, in Excel, we find the functions:

BINOMDIST = Binomial distribution
EXPONDIST = Exponential
NEGBINOMDIST = Negative Binomial
NORMDIST and NORMSDIST = Normal and Standard Normal
POISSON = Poisson

Functions such as NORMSINV and TINV can calculate the inverses of the standard normal and t distributions, respectively.
ISYE 6739 — Goldsman 8/5/20 101 / 108
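Excel's exact signatures aside, the same quantities are easy to compute directly; a stdlib-only Python sketch (function names are ours, not Excel's):

```python
import math
from statistics import NormalDist

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p) -- cf. BINOMDIST with cumulative = FALSE."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def expon_cdf(x, lam):
    """P(X <= x) for X ~ Exp(lam) -- cf. EXPONDIST with cumulative = TRUE."""
    return 1 - math.exp(-lam * x)

def poisson_pmf(x, mu):
    """P(X = x) for X ~ Pois(mu) -- cf. POISSON with cumulative = FALSE."""
    return math.exp(-mu) * mu**x / math.factorial(x)

std_normal = NormalDist()   # .cdf plays the role of NORMSDIST, .inv_cdf of NORMSINV
```

For example, `binom_pmf(2, 4, 0.5)` returns 0.375, and `std_normal.inv_cdf(0.975)` returns the familiar 1.96 quantile.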
![Page 565: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/565.jpg)
Computer Stuff
Simulating Random Variables
Motivation: Simulations are used to evaluate a variety of real-world processes that contain inherent randomness, e.g., queueing systems, inventory systems, manufacturing systems, etc. In order to run simulations, you need to generate various RVs, e.g., arrival times, service times, failure times, etc.

Examples: There are numerous ways to simulate RVs.

The Excel function RAND simulates a Unif(0,1) RV.

If U1, U2 iid ∼ Unif(0,1), then U1 + U2 is Triangular(0,1,2). This is simply RAND()+RAND() in Excel.

The Inverse Transform Theorem gives (−1/λ) ln(U) ∼ Exp(λ).

If U1, U2, ..., Uk iid ∼ Unif(0,1), then ∑_{i=1}^k (−1/λ) ln(U_i) is Erlang_k(λ), because it is the sum of iid Exp(λ)'s.

It can be shown that X = ⌈ln(U)/ln(1 − p)⌉ ∼ Geom(p), where ⌈·⌉ is the "ceiling" (integer round-up) function.
ISYE 6739 — Goldsman 8/5/20 102 / 108
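Each of these recipes translates directly into code; a Python sketch using the standard library's uniform generator in place of Excel's RAND() (parameter values in the checks below are ours):

```python
import math
import random

rng = random.Random(42)
U = rng.random                      # Unif(0,1), the analogue of Excel's RAND()

def triangular():                   # U1 + U2 ~ Triangular(0,1,2)
    return U() + U()

def exponential(lam):               # inverse transform: (-1/lam) ln(U) ~ Exp(lam)
    return -math.log(U()) / lam

def erlang(k, lam):                 # sum of k iid Exp(lam)'s ~ Erlang_k(lam)
    return sum(exponential(lam) for _ in range(k))

def geometric(p):                   # ceil(ln(U)/ln(1-p)) ~ Geom(p)
    return math.ceil(math.log(U()) / math.log(1 - p))
```

A quick check: averaging many draws, exponential(2.0) should come out near its mean 1/2, geometric(0.5) near 1/p = 2, triangular() near 1, and erlang(3, 1.0) near k/λ = 3.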
![Page 573: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/573.jpg)
Computer Stuff
The simulation of RVs is actually the topic of another course, but we will end this module with a remarkable method for generating normal RVs.

Theorem (Box and Muller): If U1, U2 iid ∼ Unif(0,1), then

Z1 = √(−2 ln(U1)) cos(2πU2) and Z2 = √(−2 ln(U1)) sin(2πU2)

are iid Nor(0,1).

Example: Suppose that U1 = 0.3 and U2 = 0.8 are realizations of two iid Unif(0,1)'s. Box–Muller gives the following two iid standard normals:

Z1 = √(−2 ln(U1)) cos(2πU2) = 0.480

Z2 = √(−2 ln(U1)) sin(2πU2) = −1.476.  □
ISYE 6739 — Goldsman 8/5/20 103 / 108
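The theorem's worked example can be reproduced in a few lines of Python (math.cos and math.sin already work in radians, as the method requires):

```python
import math

def box_muller(u1, u2):
    """Map two Unif(0,1) realizations to two independent Nor(0,1) realizations."""
    r = math.sqrt(-2 * math.log(u1))      # shared radial part
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

z1, z2 = box_muller(0.3, 0.8)             # the slide's example
# z1 ≈ 0.480 and z2 ≈ -1.476, matching the slide
```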
![Page 581: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/581.jpg)
Computer Stuff
Remarks:
There are many other ways to generate Nor(0,1)'s, but this is perhaps the easiest.

It is essential that the cosine and sine be calculated in radians, not degrees.

To get X ∼ Nor(µ, σ²) from Z ∼ Nor(0,1), just take X = µ + σZ.

Amazingly, it's "Muller", not "Müller". See https://www.youtube.com/watch?v=nntGTK2Fhb0
ISYE 6739 — Goldsman 8/5/20 104 / 108
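To illustrate the X = µ + σZ remark: a short sketch (the parameter values µ = 10, σ = 2 are ours) that feeds Box–Muller output through the location-scale transform and checks the sample moments:

```python
import math
import random

rng = random.Random(7)
mu, sigma = 10.0, 2.0

xs = []
for _ in range(50_000):
    u1, u2 = rng.random(), rng.random()
    z = math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)  # Nor(0,1)
    xs.append(mu + sigma * z)                                      # Nor(mu, sigma^2)

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
```

With this many draws the sample mean and variance land close to µ = 10 and σ² = 4, as expected.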
![Page 585: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/585.jpg)
Computer Stuff
Honors Proof: We follow the method from Module 3 to calculate the joint pdf of a function of two RVs. Namely, if we can express U1 = k1(Z1, Z2) and U2 = k2(Z1, Z2) for some functions k1(·,·) and k2(·,·), then the joint pdf of (Z1, Z2) is given by

g(z1, z2) = f_{U1}(k1(z1, z2)) f_{U2}(k2(z1, z2)) |(∂u1/∂z1)(∂u2/∂z2) − (∂u2/∂z1)(∂u1/∂z2)|

          = |(∂u1/∂z1)(∂u2/∂z2) − (∂u2/∂z1)(∂u1/∂z2)|   (U1 and U2 are iid Unif(0,1)).

In order to obtain the functions k1(Z1, Z2) and k2(Z1, Z2), note that

Z1² + Z2² = −2 ln(U1)[cos²(2πU2) + sin²(2πU2)] = −2 ln(U1),

so that U1 = e^{−(Z1² + Z2²)/2}.
ISYE 6739 — Goldsman 8/5/20 105 / 108
![Page 591: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/591.jpg)
Computer Stuff
This immediately implies that

Z1² = −2 ln(U1) cos²(2πU2) = −2 ln(e^{−(Z1² + Z2²)/2}) cos²(2πU2) = (Z1² + Z2²) cos²(2πU2),

so that

U2 = (1/(2π)) arccos(±√(Z1²/(Z1² + Z2²))) = (1/(2π)) arccos(√(Z1²/(Z1² + Z2²))),

where we (non-rigorously) get rid of the "±" to balance off the fact that the range of y = arccos(x) is only regarded to be 0 ≤ y ≤ π (not 0 ≤ y ≤ 2π).
ISYE 6739 — Goldsman 8/5/20 106 / 108
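The two inverse maps k1 and k2 are easy to sanity-check numerically; a sketch (we pick u2 < 1/4 so that 2πu2 stays in arccos's principal range and the "±" issue never arises; the particular values u1 = 0.3, u2 = 0.2 are ours):

```python
import math

u1, u2 = 0.3, 0.2                 # u2 in (0, 1/4) keeps 2*pi*u2 inside (0, pi/2)
r = math.sqrt(-2 * math.log(u1))
z1 = r * math.cos(2 * math.pi * u2)
z2 = r * math.sin(2 * math.pi * u2)

# k1: U1 = exp(-(Z1^2 + Z2^2)/2)
u1_back = math.exp(-(z1**2 + z2**2) / 2)
# k2: U2 = (1/(2*pi)) * arccos(sqrt(Z1^2 / (Z1^2 + Z2^2)))
u2_back = math.acos(math.sqrt(z1**2 / (z1**2 + z2**2))) / (2 * math.pi)
```

Both maps recover the original uniforms to machine precision, confirming the algebra above for this branch.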
![Page 596: Dave Goldsman - gatech.edu](https://reader031.fdocuments.net/reader031/viewer/2022020911/620187c2b51c520ecc103302/html5/thumbnails/596.jpg)
Computer Stuff

Now some derivative fun:
\[
\frac{\partial u_1}{\partial z_i} \;=\; \frac{\partial}{\partial z_i}\, e^{-(z_1^2+z_2^2)/2} \;=\; -z_i\, e^{-(z_1^2+z_2^2)/2}, \quad i = 1, 2,
\]
\begin{align*}
\frac{\partial u_2}{\partial z_1} &\;=\; \frac{\partial}{\partial z_1}\, \frac{1}{2\pi}\arccos\left(\sqrt{\frac{z_1^2}{z_1^2+z_2^2}}\right) \\
&\;=\; \frac{1}{2\pi}\, \frac{-1}{\sqrt{1 - \frac{z_1^2}{z_1^2+z_2^2}}}\, \frac{\partial}{\partial z_1}\sqrt{\frac{z_1^2}{z_1^2+z_2^2}} \quad \text{(chain rule)} \\
&\;=\; \frac{1}{2\pi}\, \frac{-1}{\sqrt{\frac{z_2^2}{z_1^2+z_2^2}}}\, \frac{1}{2}\left(\frac{z_1^2}{z_1^2+z_2^2}\right)^{-1/2} \frac{\partial}{\partial z_1}\, \frac{z_1^2}{z_1^2+z_2^2} \quad \text{(chain rule again)} \\
&\;=\; \frac{-(z_1^2+z_2^2)}{4\pi z_1 z_2}\, \frac{2 z_1 z_2^2}{(z_1^2+z_2^2)^2} \;=\; \frac{-z_2}{2\pi(z_1^2+z_2^2)}, \quad \text{and}
\end{align*}
\[
\frac{\partial u_2}{\partial z_2} \;=\; \frac{z_1}{2\pi(z_1^2+z_2^2)} \quad \text{(after similar algebra)}.
\]

ISYE 6739 — Goldsman 8/5/20 107 / 108
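These closed forms are easy to sanity-check with central finite differences. A Python sketch (not part of the slides; the point $(z_1, z_2) = (0.7, 1.3)$ is arbitrary):

```python
import math

def u2(z1, z2):
    """u2 = k2(z1, z2) = arccos(sqrt(z1^2/(z1^2 + z2^2))) / (2*pi)."""
    return math.acos(math.sqrt(z1**2 / (z1**2 + z2**2))) / (2.0 * math.pi)

def central_diff(f, x, h=1e-6):
    """Two-sided difference quotient, O(h^2) accurate."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

z1, z2 = 0.7, 1.3                      # arbitrary point with z1, z2 > 0
s2 = z1**2 + z2**2

num_dz1 = central_diff(lambda z: u2(z, z2), z1)
num_dz2 = central_diff(lambda z: u2(z1, z), z2)

# Compare against the closed forms derived on the slide.
assert abs(num_dz1 - (-z2 / (2.0 * math.pi * s2))) < 1e-7
assert abs(num_dz2 - ( z1 / (2.0 * math.pi * s2))) < 1e-7
print("finite differences agree with the closed-form partials")
```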
Computer Stuff

Then we finally have
\begin{align*}
g(z_1, z_2) &\;=\; \left|\frac{\partial u_1}{\partial z_1}\frac{\partial u_2}{\partial z_2} - \frac{\partial u_2}{\partial z_1}\frac{\partial u_1}{\partial z_2}\right| \\
&\;=\; \left| -z_1 e^{-(z_1^2+z_2^2)/2}\, \frac{z_1}{2\pi(z_1^2+z_2^2)} \;-\; \frac{z_2}{2\pi(z_1^2+z_2^2)}\, z_2 e^{-(z_1^2+z_2^2)/2} \right| \\
&\;=\; \frac{1}{2\pi}\, e^{-(z_1^2+z_2^2)/2} \;=\; \left(\frac{1}{\sqrt{2\pi}}\, e^{-z_1^2/2}\right)\left(\frac{1}{\sqrt{2\pi}}\, e^{-z_2^2/2}\right),
\end{align*}
which is the product of two iid Nor(0,1) pdf's, so we are done!

ISYE 6739 — Goldsman 8/5/20 108 / 108
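As a closing "Computer Stuff" check (a Monte Carlo sketch of my own, not from the slides), sampling $(Z_1, Z_2)$ through the transform should produce sample moments consistent with two uncorrelated Nor(0,1) variables:

```python
import math
import random

random.seed(42)
N = 200_000
z1s, z2s = [], []
for _ in range(N):
    u1, u2 = 1.0 - random.random(), random.random()   # u1 in (0, 1]
    r = math.sqrt(-2.0 * math.log(u1))
    z1s.append(r * math.cos(2.0 * math.pi * u2))
    z2s.append(r * math.sin(2.0 * math.pi * u2))

def mean(x):
    return sum(x) / len(x)

m1, m2 = mean(z1s), mean(z2s)
v1 = mean([(z - m1) ** 2 for z in z1s])
v2 = mean([(z - m2) ** 2 for z in z2s])
cov = mean([(a - m1) * (b - m2) for a, b in zip(z1s, z2s)])

assert abs(m1) < 0.02 and abs(m2) < 0.02              # means near 0
assert abs(v1 - 1.0) < 0.02 and abs(v2 - 1.0) < 0.02  # variances near 1
assert abs(cov) < 0.02                                # (Z1, Z2) uncorrelated
print("sample moments consistent with iid Nor(0,1)")
```

The tolerances are loose multiples of the Monte Carlo standard error ($\approx 1/\sqrt{N}$), so the checks pass comfortably for any reasonable seed.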