Source: tcs.inf.kyushu-u.ac.jp/~kijima/GPS19/GPS19-02.pdf
Conditional Prob. & Discrete Distrib.
April 24, 2019
来嶋 秀治 (Shuji Kijima)
Dept. Informatics, ISEE
Today's topics
• Bayes’ theorem
• Probability distributions
• Discrete distributions and expectations
確率統計特論 (Probability & Statistics)
lesson 2
Probability Space
Definitions
Axiom
Terminology
quick review of the last class & exercise
Ex 2. Bertrand paradox
Consider an equilateral triangle inscribed in a circle.
Suppose a chord of the circle is chosen at random.
Question
What is the probability that the chord is
longer than a side of the triangle (=:x)?
What does
“a chord of the circle is chosen at random”
mean?
Definition: Probability Space
A probability space is defined by a triple $(\Omega, \mathcal{F}, P)$.
$\Omega$: sample space (標本空間); a set of elementary events (標本点). An event (事象) is a subset of $\Omega$.
$\mathcal{F}$: $\sigma$-algebra ($\subseteq 2^{\Omega}$); a set of events.
$P$: probability measure (確率測度); a function $\mathcal{F} \to \mathbb{R}_{\ge 0}$ giving the probability of an event.
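Answer 1: The "random radius" method.
Choose a radius of the circle and a point on the radius, and construct the chord through this point and perpendicular to the radius.
The chord is longer than x iff the chosen point lies on the blue line in the figure, i.e., within distance r/2 of the center.
Hence the probability is 1/2.
$\Omega_1 = \{ d \in \mathbb{R} \mid 0 \le d \le r \}$
$\mathcal{F}_1 = 2^{\Omega_1}$
$P_1(d \le y) = \frac{y}{r}$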
Answer 2: The "random endpoints" method.
Choose two random points on the circumference of the circle and draw the chord joining them.
The chord is longer than x iff $\pi/3 < \theta < 2\pi/3$.
Hence the probability is 1/3.
W.l.o.g., one endpoint is (r, 0).
$\Omega_2 = \{ \theta \in \mathbb{R} \mid 0 \le \theta < \pi \}$
$\mathcal{F}_2 = 2^{\Omega_2}$
$P_2(\theta \le y) = \frac{y}{\pi}$
Answer 3: The "random midpoint" method.
Choose a point anywhere within the circle, and construct the chord with the chosen point as its midpoint.
The chord is longer than x iff the chosen point lies within the small concentric circle of radius r/2.
Hence the probability is 1/4.
$\Omega_3 = \{ (a, b) \in \mathbb{R}^2 \mid a^2 + b^2 \le r^2 \}$
$\mathcal{F}_3 = 2^{\Omega_3}$
$P_3(a^2 + b^2 \le x^2) = \left(\frac{x}{r}\right)^2$
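The three answers can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original slides; the function name `bertrand_estimates`, the unit radius, the trial count, and the seed are all illustrative assumptions.

```python
import math
import random

def bertrand_estimates(trials=200_000, seed=0):
    """Estimate Pr(chord > side of the inscribed triangle) under three sampling rules."""
    rng = random.Random(seed)          # fixed seed (assumption) for reproducibility
    r = 1.0                            # unit circle, chosen for illustration
    side = math.sqrt(3) * r            # side length x of the inscribed equilateral triangle
    hits = {"radius": 0, "endpoints": 0, "midpoint": 0}
    for _ in range(trials):
        # Answer 1: uniform distance d in [0, r] along a radius.
        d = rng.uniform(0, r)
        if 2 * math.sqrt(r * r - d * d) > side:
            hits["radius"] += 1
        # Answer 2: two uniform points on the circumference.
        t1 = rng.uniform(0, 2 * math.pi)
        t2 = rng.uniform(0, 2 * math.pi)
        if 2 * r * abs(math.sin((t1 - t2) / 2)) > side:
            hits["endpoints"] += 1
        # Answer 3: uniform midpoint in the disk (rejection sampling from a square).
        while True:
            a, b = rng.uniform(-r, r), rng.uniform(-r, r)
            if a * a + b * b <= r * r:
                break
        if 2 * math.sqrt(r * r - (a * a + b * b)) > side:
            hits["midpoint"] += 1
    return {name: count / trials for name, count in hits.items()}

estimates = bertrand_estimates()
print(estimates)
```

The estimates should land near 1/2, 1/3, and 1/4 respectively, which illustrates that "at random" is only meaningful once the probability space is fixed.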
Ex. 3. Boy or Girl
Question 1.
Desmond and Molly have two kids. One is a boy.
What is the probability that the other is a girl?
Today’s topic 1: Conditional Probability
def.s
• joint probability
• conditional probability
• independence / mutual independence
thm.
• Bayes’ theorem
• Answer for the Monty Hall Problem
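Terminology
Def. 1. Joint probability (同時確率 or 結合確率)
$\Pr(A, B) = \Pr(A \cap B)$
Def. 2. Conditional probability (条件付き確率)
$\Pr(A \mid B) = \frac{\Pr(A, B)}{\Pr(B)}$ (defined when $\Pr(B) > 0$)
Def. 3. Events $A$ and $B$ are independent (独立) if
$\Pr(A, B) = \Pr(A) \Pr(B)$
Events $A_1, A_2, \ldots, A_k$ are mutually independent (相互に独立) if
$\Pr\left(\bigcap_{i=1}^{k} A_i\right) = \prod_{i=1}^{k} \Pr(A_i)$
(strictly, the same product rule is required for every sub-collection of the events, not only for all $k$ of them)
Events $A_1, A_2, \ldots, A_k$ are pairwise independent (対ごとに独立) if
$\Pr(A_i, A_j) = \Pr(A_i) \Pr(A_j)$ for any distinct $i, j$
see ex. 1.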
Tossing coins (Independence)
Suppose we toss two coins, A and B.
Head probability of coin A is 0.5.
Head probability of coin B is 0.5.
The probability of two heads:
$\Pr([H], [H]) = \Pr([H]) \Pr([H]) = \frac{1}{4}$
Tossing coins (Independence)
Suppose we toss two coins, A and B.
Head probability of coin A is 0.6.
Head probability of coin B is 0.7.
The probability of two heads:
$\Pr([H], [H]) = \Pr([H]) \Pr([H]) = 0.6 \times 0.7 = 0.42$

         B: H   B: T   Prob.
A: H     0.42   0.18   0.6
A: T     0.28   0.12   0.4
Prob.    0.7    0.3
Tossing coins (Dependence)
Two coins are made of magnets.
Head probability of coin A is 0.5.
Head probability of coin B is 0.5.
[Figure: two magnetic coins with N and S poles, and a piece of iron]
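The probability of two heads:
$\Pr([H], [H]) \overset{?}{=} \Pr([H]) \Pr([H])$

         B: H   B: T   Prob.
A: H     0.05   0.45   0.5
A: T     0.45   0.05   0.5
Prob.    0.5    0.5

From the table, $\Pr([H], [H]) = 0.05$, while $\Pr([H]) \Pr([H]) = 0.5 \times 0.5 = 0.25$, so the two coins are dependent.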
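The product rule of Def. 3 can be checked cell by cell on the two coin tables above. This is a small sketch; the helper name `is_independent` and the tolerance are illustrative assumptions, not something the slides define.

```python
import math

def is_independent(joint, tol=1e-9):
    """joint[i][j] = Pr(A takes its i-th outcome, B takes its j-th outcome)."""
    row = [sum(r) for r in joint]          # marginal distribution of coin A
    col = [sum(c) for c in zip(*joint)]    # marginal distribution of coin B
    # Independent iff every cell equals the product of its two marginals.
    return all(
        math.isclose(joint[i][j], row[i] * col[j], abs_tol=tol)
        for i in range(len(row))
        for j in range(len(col))
    )

biased_coins = [[0.42, 0.18], [0.28, 0.12]]    # heads 0.6 and 0.7
magnetic_coins = [[0.05, 0.45], [0.45, 0.05]]  # both marginals are 0.5

print(is_independent(biased_coins))
print(is_independent(magnetic_coins))
```

The magnetic coins have the same marginals as two fair coins, yet the joint table differs, which is why marginals alone cannot decide independence.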
Independence test

           Good (early healing)   No good   Total
Med.       28                     22        50
Placebo    13                     37        50
Total      41                     59        100

$\Pr(\text{med.}, \text{good}) \overset{?}{=} \Pr(\text{med.}) \Pr(\text{good})$
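Both sides of the question can be computed from the table's relative frequencies. The sketch below only compares the two numbers; the dictionary keys are illustrative names, and deciding whether the gap is statistically significant would need a hypothesis test that this slide does not cover.

```python
from fractions import Fraction

# Counts copied from the table above; the key names are illustrative.
counts = {
    ("med", "good"): 28, ("med", "no_good"): 22,
    ("placebo", "good"): 13, ("placebo", "no_good"): 37,
}
total = sum(counts.values())

pr_joint = Fraction(counts[("med", "good")], total)
pr_med = Fraction(counts[("med", "good")] + counts[("med", "no_good")], total)
pr_good = Fraction(counts[("med", "good")] + counts[("placebo", "good")], total)

print("Pr(med., good)      =", pr_joint)           # empirical joint frequency
print("Pr(med.) * Pr(good) =", pr_med * pr_good)   # value expected under independence
```

Here the joint frequency is 0.28 while the product of the marginals is 0.205, so the observed frequencies do not satisfy the product rule exactly.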
Bayes’ theorem
Thm. (Bayes; ベイズ)
$\Pr(A \mid B) = \frac{\Pr(B \mid A) \Pr(A)}{\Pr(B)}$
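Bayes’ theorem (general)
Thm. (Bayes; ベイズ)
Suppose $A_1, \ldots, A_k$ are mutually exclusive, and $\bigcup_{i=1}^{k} A_i = \Omega$. Then
$\Pr(A_i \mid B) = \frac{\Pr(B \mid A_i) \Pr(A_i)}{\sum_{j=1}^{k} \Pr(B \mid A_j) \Pr(A_j)}$
Prop.
Suppose $A_1, \ldots, A_k$ are mutually exclusive, and $\bigcup_{i=1}^{k} A_i = \Omega$. Then
$\Pr(B) = \sum_{i=1}^{k} \Pr(A_i, B)$
The right-hand side is called the marginal distribution.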
Conditional Probability
[Figure: A and B]
$\Pr(\circ \mid A) = \frac{\Pr(\circ, A)}{\Pr(A)} = \frac{\frac{1}{2} \times 0.6}{\frac{1}{2}} = 0.6$
Bayes’ probability
[Figure: A and B]
$\Pr(A \mid \circ) = \frac{\Pr(\circ \mid A) \Pr(A)}{\Pr(\circ)} = \frac{0.6 \times \frac{1}{2}}{\frac{8}{20}} = \frac{3}{4}$
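The general form of Bayes' theorem can be sketched as a short function over a partition. The name `posterior` is hypothetical, and the value Pr(○ | B) = 0.2 is not stated on the slide: it is inferred here from Pr(○) = 8/20 together with Pr(A) = Pr(B) = 1/2 and Pr(○ | A) = 0.6.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return Pr(A_i | B) for every A_i in a partition, via Bayes' theorem."""
    # Denominator: Pr(B) = sum_j Pr(B | A_j) Pr(A_j)  (marginalisation).
    evidence = sum(likelihoods[a] * priors[a] for a in priors)
    return {a: likelihoods[a] * priors[a] / evidence for a in priors}

priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
# Pr(circle | A) = 0.6 is from the slide; Pr(circle | B) = 0.2 is an inferred assumption.
likelihoods = {"A": Fraction(6, 10), "B": Fraction(2, 10)}

post = posterior(priors, likelihoods)
print(post["A"], post["B"])
```

With these inputs the posterior for A reproduces the slide's value of 3/4.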
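Ex. 3. Boy or Girl
Question 1.
Desmond and Molly have two kids. One is a boy.
What is the probability that the other is a girl?

          Elder   Younger   Prob.
Case 1    Boy     Boy       1/4
Case 2    Boy     Girl      1/4
Case 3    Girl    Boy       1/4
Case 4    Girl    Girl      1/4

$\Pr(G \mid B) = \frac{\Pr[B, G]}{\Pr[B]} = \frac{2/4}{3/4} = \frac{2}{3}$
Here $B$ is the event "at least one kid is a boy" (Cases 1, 2, 3), and $[B, G]$ is the event "one boy and one girl" (Cases 2, 3).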
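The same answer follows from enumerating the four equally likely cases in the table. This is a minimal sketch assuming each kid is independently a boy or a girl with probability 1/2; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All (elder, younger) combinations; each has probability 1/4 by assumption.
families = list(product(["Boy", "Girl"], repeat=2))

# Condition on the event "at least one kid is a boy".
at_least_one_boy = [f for f in families if "Boy" in f]
# Within that event, keep the cases where the other kid is a girl.
boy_and_girl = [f for f in at_least_one_boy if "Girl" in f]

answer = Fraction(len(boy_and_girl), len(at_least_one_boy))
print(answer)
```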
Ex 1. Monty Hall problem --- ask Marilyn
You are given the choice of three doors:
Behind one door is a car; behind the others, goats.
You pick a door, say A.
The host (Monty), who knows what's behind the doors,
opens another door, say C, which he knows has a goat.
He then says to you, "Do you want to pick door B?"
Question
Is it to your advantage to switch your choice?
Figure: from the Wikipedia article "Monty Hall problem" (モンティーホール問題)
cf. ex. 3.
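One way to approach the question is to write down the joint distribution of the car's location and the door Monty opens, then condition on "Monty opens C". The sketch below assumes the car is placed uniformly at random and that Monty chooses uniformly when two goat doors are available; neither assumption is stated explicitly on the slide.

```python
from fractions import Fraction

doors = ["A", "B", "C"]
pick = "A"  # the contestant's initial choice, as in the problem statement

# joint[(car, opened)] = Pr(car is behind `car`, Monty opens `opened`)
joint = {}
for car in doors:
    # Monty never opens the picked door or the door hiding the car.
    options = [d for d in doors if d != pick and d != car]
    for opened in options:
        joint[(car, opened)] = Fraction(1, 3) * Fraction(1, len(options))

# Condition on the observed event "Monty opens C".
pr_opens_c = sum(p for (car, opened), p in joint.items() if opened == "C")
pr_stay_wins = joint[("A", "C")] / pr_opens_c
pr_switch_wins = joint[("B", "C")] / pr_opens_c
print(pr_stay_wins, pr_switch_wins)
```

Under these assumptions, switching wins with probability 2/3 and staying with probability 1/3.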
Today’s topic 2: Discrete Probability & Expectation
def.s
• discrete random variable
• discrete distribution
• expectation / conditional expectation
thm.
• Linearity of the expectation
• Coupon collector
“variable” vs “random variable”
Ex. 1. Set $\Omega$
$\Omega = \{1, 2, 3, 4, 5, 6\}$
Let $x$ be a member of the set $\Omega$.
Observation
$x \in \Omega$
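Def. random variable
Ex. 1. die $(\Omega, \mathcal{F}, P)$
$\Omega = \{1, 2, 3, 4, 5, 6\}$
$\mathcal{F} = 2^{\Omega}$
$P(A) = \frac{|A|}{6}$ for any $A \subseteq \Omega$.
Let $X$ denote the “cast” of $(\Omega, \mathcal{F}, P)$.
Observation
$X \in \Omega$ ($\in \mathcal{F}$ in fact)
$P(X \text{ is odd}) = \frac{1}{2}$
$P(X < 5) = \frac{2}{3}$, etc.
$X$ is called a random variable (usually denoted by CAPITALS).
Note
A random variable may not be a member of $\mathcal{F}$,
e.g., let $Y :=$ square of the cast, where there is a map from $\mathcal{F}$. (see regime)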
Terminology
$X$ is called a “random variable (確率変数)”.
Discrete distribution (離散分布)
a distribution on a countable set $\Xi \subseteq \mathbb{R}$ such that $\sum_{x \in \Xi} \Pr(X = x) = 1$ holds
(note: $\Xi$ may not be $\Omega$; cf. ex. 6)
Probability function (確率関数)
$f(x) = \Pr(X = x)$
(Cumulative) distribution function ((累積)分布関数)
$F(x) = \Pr(X \le x)$
(an important concept for continuous distributions; next week)
(univariate) discrete distributions
uniform dist. (離散一様分布)
Bernoulli dist. (ベルヌーイ分布; 2点分布)
binomial dist. (2項分布)
geometric dist. (幾何分布)
Poisson dist. (ポアソン分布)
discrete uniform (離散一様分布)
$\Omega = \{1, 2, \ldots, n\}$
$\Pr(X = i) = \frac{1}{n}$
Example: roulette
$\Omega = \{0, 1, 2, \ldots, 36\}$
$\mathcal{F} = 2^{\Omega}$
$\Pr(X = x) = \frac{1}{37}$ $(x \in \Omega)$
Bernoulli (ベルヌーイ分布, 2点分布) B(1; p)
$\Omega = \{0, 1\}$
$\Pr(X = 1) = p$
$\Pr(X = 0) = 1 - p$
An experiment whose outcome is a random variable following a Bernoulli distribution is called a Bernoulli trial (ベルヌーイ試行).
Example: (biased) coin tossing
head ($X = 1$)
tail ($X = 0$)
binomial dist. (2項分布) B(n; p)
$\Omega = \{0, 1, 2, \ldots, n\}$
$\Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$
Let $X_1, X_2, \ldots, X_n$ be i.i.d. outcomes of Bernoulli trials B(1; p).
Let $X = X_1 + X_2 + \cdots + X_n$, i.e., the total number of heads.
Then $X$ follows the binomial distribution B(n; p).
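The claim that a sum of i.i.d. Bernoulli outcomes is binomial can be checked exactly for small n. The sketch below builds the distribution of the sum one trial at a time and compares it with the closed-form probability function; the function names and the choice p = 3/10, n = 5 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Closed form: Pr(X = k) = C(n, k) p^k (1 - p)^(n - k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def sum_of_bernoullis_pmf(n, p):
    """Distribution of X_1 + ... + X_n, built by adding one Bernoulli trial at a time."""
    pmf = [Fraction(1)]  # a sum of zero trials equals 0 with probability 1
    for _ in range(n):
        nxt = [Fraction(0)] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)   # the new trial is a tail (adds 0)
            nxt[k + 1] += q * p     # the new trial is a head (adds 1)
        pmf = nxt
    return pmf

p = Fraction(3, 10)
closed_form = binomial_pmf(5, p)
built_up = sum_of_bernoullis_pmf(5, p)
print(closed_form == built_up, sum(closed_form))
```

Exact rational arithmetic (`Fraction`) is used so that the comparison is an equality rather than an approximate match.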
geometric dist. (幾何分布) Ge(p)
$\Omega = \{0, 1, 2, \ldots\}$
$\Pr(X = k) = (1 - p)^k p$
Repeat i.i.d. Bernoulli trials B(1; p) until the first head.
Let $K$ denote the number of tails before the first head;
then $K$ follows the geometric distribution Ge(p).
Remember the coupon collector.
Poisson dist. (ポアソン分布) Po($\lambda$) ($\lambda > 0$)
$\Omega = \{0, 1, 2, \ldots\}$
$\Pr(X = z) = \frac{e^{-\lambda} \lambda^z}{z!}$
Consider rare events whose expected number of occurrences per unit time is $\lambda$.
Let $X$ be the number of occurrences;
then $X$ is known to follow the Poisson distribution Po($\lambda$).
More precisely, repeat i.i.d. Bernoulli trials B(1; p) $n$ times with $p \ll 1$.
Let $\lambda = np$; then it is known that B(n; p) $\simeq$ Po($\lambda$).
Today’s Exercise 2. The Poisson distribution appears again later today.
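The approximation B(n; p) ≃ Po(λ) with λ = np can be observed numerically. The sketch below fixes λ = 2 (an arbitrary illustrative value) and measures the largest gap between the two probability functions over k = 0, ..., 10 as n grows.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(lam, z):
    return exp(-lam) * lam**z / factorial(z)

lam = 2.0   # illustrative choice of the expected number of occurrences
gaps = {}
for n in (10, 100, 1000):
    p = lam / n   # keep n * p = lam fixed while p shrinks
    gaps[n] = max(
        abs(binomial_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(11)
    )
    print(n, gaps[n])
```

The gap shrinks as n increases with np held fixed, which is the regime (p ≪ 1) described above.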
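Discrete distr. (distr. on a countable set $\subseteq \mathbb{R}$)
$\sum_{x \in \Omega} \Pr(X = x) = 1$ holds.
probability function (確率関数)
$f(x) = \Pr(X = x)$
(cumulative) distribution function ((累積)分布関数)
$F(x) = \Pr(X \le x)$
[Figure: plot of $F(x)$; horizontal axis $x$ = 1, 2, ..., 6; vertical axis P with ticks 1/6, 2/6, 3/6, 4/6, 5/6, 1]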
Discrete distribution function $F: \Omega \to \mathbb{R}_{\ge 0}$
1. $F(-\infty) = 0$, $F(+\infty) = 1$
2. Monotone non-decreasing (単調非減少)
3. Right continuous (右連続)
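For a fair die, the probability function and the distribution function can be written out directly. This is a small sketch with illustrative names (`pmf`, `cdf`); it evaluates F at non-integer points too, which shows the step shape and the right continuity at the jumps.

```python
from fractions import Fraction

# Probability function f(x) = Pr(X = x) of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """Distribution function F(x) = Pr(X <= x), defined for every real x."""
    return sum(prob for value, prob in pmf.items() if value <= x)

for x in (0, 1, 2.5, 3, 6, 10):
    print(x, cdf(x))
```

Because the condition is `value <= x`, the jump at each integer is included in F at that point, so F(3) already equals 3/6 and stays there until x reaches 4.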
Expectation of random variable
Today’s topic 2
Expectation of discrete random variable
The expectation (期待値) of a discrete random variable $X$ is defined by
$\mathrm{E}[X] = \sum_{x \in \Omega} x \cdot f(x)$
only when the right-hand side converges absolutely (絶対収束),
i.e., $\sum_{x \in \Omega} |x| \cdot f(x) < \infty$ holds.
Otherwise, we say “the expectation does not exist.”
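For a finite sample space the sum in the definition always converges, so the expectation can be computed directly. The sketch below applies the definition to the fair die and to the roulette example from the discrete uniform slide; the helper name `expectation` is an illustrative assumption.

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] = sum over x of x * f(x), for a finite probability function."""
    return sum(x * f for x, f in pmf.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}          # fair die
roulette = {x: Fraction(1, 37) for x in range(0, 37)}   # uniform on 0..36

print(expectation(die))
print(expectation(roulette))
```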
Compute expectations of distributions
*Ex 2.
Discrete
(*i) Bernoulli distribution B(1, p).
(*ii) Binomial distribution B(n, p).
(iii) Geometric distribution Ge(p).
(iv) Poisson distribution Po($\lambda$).
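Ex. Expectation of binomial distr.
Thm.
The expectation of $X \sim \mathrm{B}(n, p)$ is $np$.
Proof
$$
\begin{aligned}
\mathrm{E}[X] &= \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}
 = \sum_{k=0}^{n} k \frac{n!}{k!\,(n-k)!} p^k (1-p)^{n-k} \\
&= \sum_{k=1}^{n} k \frac{n!}{k!\,(n-k)!} p^k (1-p)^{n-k} \\
&= \sum_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!} p^k (1-p)^{n-k} \\
&= \sum_{k=1}^{n} np \frac{(n-1)!}{(k-1)!\,(n-k)!} p^{k-1} (1-p)^{n-k} \\
&= np \sum_{k'=0}^{n-1} \binom{n-1}{k'} p^{k'} (1-p)^{n-1-k'} \\
&= np
\end{aligned}
$$
The $k = 0$ term vanishes in the second line, $k' = k - 1$ in the fifth, and the last sum equals $(p + (1-p))^{n-1} = 1$ by the binomial theorem.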
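The identity E[X] = np can be sanity-checked by evaluating the defining sum exactly for a few small n. This sketch is only a numerical check, not a substitute for the proof; the function name and the value p = 2/7 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_expectation(n, p):
    """Evaluate sum_{k=0}^{n} k * C(n, k) p^k (1 - p)^(n - k) exactly."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

p = Fraction(2, 7)
for n in range(1, 8):
    print(n, binomial_expectation(n, p), n * p)
```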
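Ex. Expectation of Geom. distr.
Thm.
The expectation of $X \sim \mathrm{Ge}(p)$ is $\frac{1-p}{p}$.
Proof
$$
\begin{aligned}
\mathrm{E}[X] &= 0 \cdot p + 1 (1-p) p + 2 (1-p)^2 p + 3 (1-p)^3 p + \cdots \\
(1-p)\,\mathrm{E}[X] &= 0 (1-p) p + 1 (1-p)^2 p + 2 (1-p)^3 p + \cdots
\end{aligned}
$$
Subtracting the second line from the first gives
$$
p\,\mathrm{E}[X] = (1-p) p + (1-p)^2 p + (1-p)^3 p + \cdots = \frac{(1-p) p}{1 - (1-p)} = 1 - p
$$
Thus $\mathrm{E}[X] = \frac{1-p}{p}$.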
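The closed form (1 − p)/p can be compared against a truncated version of the defining series. This is a rough numerical sketch; the function name, the truncation length of 500 terms, and the test values of p are illustrative assumptions.

```python
def geometric_expectation_partial(p, terms):
    """Partial sum of sum_{k>=0} k (1 - p)^k p, truncated after `terms` terms."""
    return sum(k * (1 - p) ** k * p for k in range(terms))

for p in (0.25, 0.5, 0.9):
    approx = geometric_expectation_partial(p, 500)
    exact = (1 - p) / p
    print(p, approx, exact)
```

The terms decay geometrically, so a few hundred terms already agree with the closed form to floating-point accuracy for these values of p.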