CHAPTER 7: Subjective Probability and Bayesian Inference
1
CHAPTER 7
Subjective Probability and Bayesian Inference
2
7.1. Subjective Probability
Personal evaluation of probability by an individual decision maker
Uncertainty exists for the decision maker: probability is just a way of measuring it
In dealing with uncertainty, a coherent decision maker effectively uses subjective probability
3
7.2. Assessment of Subjective Probabilities
Simplest procedure:
Specify the set of all possible events and ask the decision maker to directly estimate the probability of each event
Not a good approach from a psychological point of view
Not easy to conceptualize, especially for a DM not familiar with probability
4
Standard Device
A physical instrument or conceptual model
A good tool for obtaining subjective probabilities
Example:
• A box containing 1000 balls
• Balls numbered 1 to 1000
• Balls have 2 colors: red and blue
5
Standard Device Example
To estimate a student's subjective probability of getting an "A" in SE 447, we ask him to choose between 2 bets:
Bet X: If he gets an A, he wins SR 100; if he doesn't get an A, he wins nothing
Bet Y: If he picks a red ball, he wins SR 100; if he picks a blue ball, he wins nothing
We start with a proportion of red balls P = 50%, then adjust successively until the 2 bets are equally attractive
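The successive-adjustment procedure is essentially a bisection search on the proportion of red balls. A minimal sketch, assuming a hypothetical prefers_bet_x(p) oracle that reports the decision maker's current preference (here simulated by a DM whose true subjective probability is 0.7):

```python
def assess_probability(prefers_bet_x, tol=0.001):
    """Bisect on the red-ball proportion until the two bets are equally attractive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2        # current proportion of red balls in the box
        if prefers_bet_x(p):
            lo = p               # Bet X still preferred: raise the proportion
        else:
            hi = p               # Bet Y preferred: lower the proportion
    return (lo + hi) / 2

# Simulated decision maker whose true subjective probability of an "A" is 0.7:
print(round(assess_probability(lambda p: p < 0.7), 2))   # → 0.7
```

In practice the oracle is the human decision maker, so far fewer (and coarser) steps are used than a tolerance of 0.001 suggests.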
6
Other standard devices
Pie diagram (spinner)
Circle divided into 2 sectors: 1 red, 1 blue
Bet Y: If he spins to the red section, he wins SR 100; if he spins to the blue section, he wins nothing
The size of the red section is adjusted until the 2 bets are equal
7
Subjective Probability Bias
Standard device must be easy to perceive, to avoid introducing bias
2 kinds of bias:
Task bias: resulting from assessment method (standard device)
Conceptual bias: resulting from mental procedures (heuristics) used by individuals to process information
8
Mental Heuristics Causing Bias
1. Representativeness
• If x highly represents set A, a high probability is given that x ∈ A
• Frequency (proportion) ignored
• Sample size ignored
2. Availability
• Limits of memory and imagination
3. Adjustment & anchoring
• Starting from an obvious reference point, then adjusting for new values
• Anchoring: the adjustment is typically not enough
4. Overconfidence
• Underestimating variance
9
Fractile Probability Assessment
Quartile Assessment: Determine 3 values:
x1, for which p(x > x1) = 0.5
x2, for which p(x < x2) = p(x2 < x < x1)
x3, for which p(x > x3) = p(x1 < x < x3)
On the cumulative distribution: F(x2) = 0.25, F(x1) = 0.5, F(x3) = 0.75
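The three quartile values are just the 0.25, 0.5, and 0.75 points of the cumulative distribution F(x). As an illustration only (the method itself assumes no particular distribution), here they are for a hypothetical Normal(100, 15) belief, using Python's standard library:

```python
from statistics import NormalDist

# Hypothetical subjective belief: Normal with mean 100, std. dev. 15.
dist = NormalDist(mu=100, sigma=15)

# Quartile values via the inverse CDF: F(x2)=0.25, F(x1)=0.50, F(x3)=0.75.
x2, x1, x3 = (dist.inv_cdf(q) for q in (0.25, 0.50, 0.75))
print(round(x2, 1), round(x1, 1), round(x3, 1))
# Each of the 4 intervals (-inf,x2), (x2,x1), (x1,x3), (x3,inf)
# then carries probability 0.25.
```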
10
Fractile Probability Assessment
Quartile Assessment: 4 intervals
Octile Assessment: 8 intervals
Tertile Assessment: 3 intervals; avoids anchoring at the median
11
Histogram Probability Assessment
Fix the points x1, x2 , …, xm.
Ask the decision maker to assess probabilities
p(x1 < x < x2)
p(x2 < x < x3)
…
Gives a probability distribution
(not a cumulative distribution, as in the fractile method)
12
Assessment Methods & Bias
No evidence to favor either fractile or histogram methods
One factor that reinforces anchoring bias is self-consistency
Bias can be reduced by “pre-assessment conditioning”: training, for/against arguments
The act of probability assessment causes a re-evaluation of uncertainty
13
7.3. Impact of New Information (Bayes' Theorem)
After developing a subjective probability distribution,
assume new information becomes available. Example: new data is collected.
According to the coherence principle, the DM must take the new information into consideration, thus
the subjective probability must be revised.
How? Using Bayes' theorem.
14
Bayes’ Theorem Example
Suppose your subjective probability distribution for weather tomorrow is:
• chances of being sunny P(S) = 0.6• chances of being not sunny P(N) = 0.4
Suppose the TV weather forecast predicts a cloudy day tomorrow. How should you change P(S)?
Assume we are dealing with mutually exclusive and collectively exhaustive events such as sunny or not sunny.
15
Impact of Information
We assume the weather forecaster predicts either
• a cloudy day C, or
• a bright day B
To change P(S), we use: joint probability = conditional probability × marginal probability
• P(C,S) = P(C|S)P(S)
• P(B,S) = P(B|S)P(S)
• P(C,N) = P(C|N)P(N)
• P(B,N) = P(B|N)P(N)
16
Impact of Information
To obtain the joint probability mass function (j.p.m.f.), we need the conditional probabilities P(C|S) and P(C|N).
These can be obtained from historical data. How?
In the past 100 sunny days, a cloudy forecast was given on 20 days: P(C|S) = 0.2, P(B|S) = 0.8
In the past 100 not-sunny days, a cloudy forecast was given on 90 days: P(C|N) = 0.9, P(B|N) = 0.1
17
Joint probability Calculations
Joint probability P(A,B) = conditional probability (likelihood) P(A|B) * marginal probability P(B)
Cloudy forecast:
• P(C,S) = P(C|S)P(S) = 0.2(0.6) = 0.12
• P(C,N) = P(C|N)P(N) = 0.9(0.4) = 0.36
Sunny forecast:
• P(B,S) = P(B|S)P(S) = 0.8(0.6) = 0.48
• P(B,N) = P(B|N)P(N) = 0.1(0.4) = 0.04
18
Bayes’ Theorem
        S         N
C    P(C,S)    P(C,N)    P(C)
B    P(B,S)    P(B,N)    P(B)
     P(S)      P(N)

P(S|C) = P(C,S) / P(C) = P(C|S)P(S) / P(C)
P(S|C) ∝ P(C|S)P(S)
19
Joint probability Table
       S       N
C    0.12    0.36    0.48
B    0.48    0.04    0.52
     0.60    0.40    1.00
P(S|C) = 0.12/0.48 = 0.25 posterior (conditional) probability
Compare to P(S) = 0.6 prior (marginal) prob.
P(S) decreased because of C forecast
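The forecast example can be sketched in a few lines of Python; the names prior and lik_cloudy are illustrative, not from the text:

```python
# Weather example: prior, likelihoods, joint table, posterior.
prior = {"S": 0.6, "N": 0.4}          # sunny / not sunny
lik_cloudy = {"S": 0.2, "N": 0.9}     # P(C | state) from historical data

joint = {s: lik_cloudy[s] * prior[s] for s in prior}   # P(C, state)
p_c = sum(joint.values())                              # marginal P(C) = 0.48
posterior = {s: joint[s] / p_c for s in prior}         # P(state | C)
print(round(posterior["S"], 2))   # → 0.25
```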
20
Prior and Posterior Probabilities
Prior means before.
Prior probability is the probability P(S) before the information was heard.
Posterior means after.
It is the probability obtained after incorporating the new forecast information: P(S|C).
It is obtained using Bayes' theorem.
21
Example with 3 states
3 demand possibilities for the new product:
High P(H) = 0.6
Medium P(M) = 0.1
Low P(L) = 0.3
Market research gives an Average result:
30% of the time if true demand is High
50% of the time if true demand is Medium
90% of the time if true demand is Low
22
Example : Probability Table for Average result
State   Prior   Likelihood   Joint    Posterior
S       P(S)    P(A|S)       P(S,A)   P(S|A)
H       0.6     0.3          0.18     0.36
M       0.1     0.5          0.05     0.10
L       0.3     0.9          0.27     0.54
        1.0                  0.50     1.00
If market research gives an Average result: P(H) decreases, P(L) increases, P(M) is unchanged
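The same table arithmetic, as a small reusable sketch (the bayes_update helper is illustrative, not from the text):

```python
def bayes_update(prior, likelihood):
    """Posterior over states given one observation's likelihoods."""
    joint = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(joint.values())          # marginal probability of the observation
    return {s: joint[s] / total for s in joint}

prior = {"H": 0.6, "M": 0.1, "L": 0.3}
lik_average = {"H": 0.3, "M": 0.5, "L": 0.9}   # P(Average result | state)
post = bayes_update(prior, lik_average)
print({s: round(p, 2) for s, p in post.items()})   # H: 0.36, M: 0.10, L: 0.54
```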
23
Ex: Sequential Bayesian Analysis
An oil company has 3 drilling sites: X, Y, Z
3 possible reserve states:
No reserves P(N) = 0.5
Small reserves P(S) = 0.3
Large reserves P(L) = 0.2
3 possible drilling outcomes:
Dry D
Wet W
Gushing G
24
Ex: Sequential Bayesian Analysis
If reserves are:
None (N): all wells will be Dry (D)
Large (L): all wells will be Gushing (G)
Small (S): some Dry (D) and some Wet (W) wells:
P(1 Dry out of 1) = 0.8, P(2 Dry out of 2) = 0.2, P(3 Dry out of 3) = 0
All sites are equally favorable. Assume the order of drilling is X, Y, Z.
Notation: DX = a Dry well at site X
25
Ex: Probability Table for Site X
State   Prior   Conditional                 Joint
K       P(K)    P(D|K)  P(W|K)  P(G|K)      DX     WX     GX
N       0.5     1       0       0           0.5    0      0
S       0.3     0.8     0.2     0           0.24   0.06   0
L       0.2     0       0       1           0      0      0.2
        1.0                                 0.74   0.06   0.2
Stop exploratory drilling at X if you get:
W: Reserves are S, or
G: Reserves are L
If you get D, drill at Y. (Posteriors: N: P = 0.5/0.74 = 0.68, S: P = 0.24/0.74 = 0.32)
26
Ex: Probability Table for Site Y
State   Prior   Conditional         Joint
K       P(K)    P(D|K)  P(W|K)      DY     WY
N       0.68    1       0           0.68   0
S       0.32    0.5     0.5         0.16   0.16
        1.0                         0.84   0.16
(conditional 0.5 = 0.4/0.8)
Stop exploratory drilling at Y if you get W: Reserves are S
If you get D, drill at Z. (Posteriors: N: P = 0.68/0.84 = 0.81, S: P = 0.16/0.84 = 0.19)
27
Ex: Probability Table for Site Z
State   Prior   Conditional         Joint
K       P(K)    P(D|K)  P(W|K)      DZ     WZ
N       0.81    1       0           0.81   0
S       0.19    0       1           0      0.19
        1.0                         0.81   0.19
If you get W: Reserves are S
If you get D:Reserves are N
28
Ex: Change in P(N)
Prior Probability: Before drilling, P(N) = 0.5
Posterior Probabilities:
After 1 Dry: P(N|DX) = 0.68
After 2 Dry: P(N|DX, DY) = 0.81
After 3 Dry: P(N|DX, DY, DZ) = 1
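The three sequential updates can be chained in code. A sketch using an illustrative bayes_update helper and the conditional dry-well probabilities from the three site tables:

```python
def bayes_update(prior, likelihood):
    """One Bayesian revision: posterior over reserve states given a dry well."""
    joint = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

# Reserve-state priors and P(Dry | state) at each successive site,
# taken from the probability tables for sites X, Y, and Z.
p = {"N": 0.5, "S": 0.3, "L": 0.2}
for lik_dry in ({"N": 1.0, "S": 0.8, "L": 0.0},    # site X
                {"N": 1.0, "S": 0.5, "L": 0.0},    # site Y
                {"N": 1.0, "S": 0.0, "L": 0.0}):   # site Z
    p = bayes_update(p, lik_dry)
    print({k: round(v, 2) for k, v in p.items()})
# P(N) climbs 0.5 -> 0.68 -> 0.81 -> 1.0 as each dry well is observed.
```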
29
7.4. Conditional Independence
Two events, A and B, are independent iff P(A, B) = P(A)·P(B), implying P(A|B) = P(A) and P(B|A) = P(B)
The posterior probability is the same as the prior
New information about 1 event does not affect the probability of the other
30
Conditional Independence
Two events, A and B, are conditionally independent iff:
P(A, B) ≠ P(A)·P(B)
but their probabilities conditional on a 3rd event, C, are independent:
P(A, B|C) = P(A|C)·P(B|C)
Useful property in Bayesian analysis
31
Ex: Horse Race Probability
Horse named WR will race at 3:00 p.m.
Probability of WR winning P(WR) depends on track condition
Firm (F) P(F) = 0.3 P(WR|F) = 0.9
Soft (S) P(S) = 0.7 P(WR|S) = 0.2
32
Ex: Horse Race Probability
Given the results of 2 previous races
At 1:30, horse named MW won the race P(MW|F) = 0.8 P(MW|S) = 0.4
At 2:00, horse named AJ won the race P(AJ|F) = 0.9 P(AJ|S) = 0.5
What is the new WR win probability P(WR|MW, AJ)?
33
Ex: Horse Race Probability
The probability that WR wins given that MW and AJ have won must sum the conditional probabilities over both possible track conditions, F and S:
P(WR|MW, AJ) = P(WR|F)·P(F|MW, AJ) + P(WR|S)·P(S|MW, AJ)
34
Ex: Horse Race Probability
The 2 events MW and AJ are conditionally independent with respect to a 3rd event: track condition F or S
Recall: P(A|B) = P(B|A)P(A)/P(B), so P(A|B) ∝ P(B|A)P(A)
P(F|MW, AJ) ∝ P(MW, AJ|F) P(F) = P(MW|F) P(AJ|F) P(F)
P(S|MW, AJ) ∝ P(MW, AJ|S) P(S) = P(MW|S) P(AJ|S) P(S)
35
Ex: Horse Race Probability
P(F|MW, AJ) ∝ P(MW|F) P(AJ|F) P(F) = 0.8 × 0.9 × 0.3 = 0.216
P(S|MW, AJ) ∝ P(MW|S) P(AJ|S) P(S) = 0.4 × 0.5 × 0.7 = 0.14
Normalizing:
P(F|MW, AJ) = 0.216/(0.216 + 0.14) = 0.61
P(S|MW, AJ) = 0.14/(0.216 + 0.14) = 0.39
36
Ex: Horse Race Probability
Substituting into
P(WR|MW, AJ) = P(WR|F)·P(F|MW, AJ) + P(WR|S)·P(S|MW, AJ)
= 0.9(0.61) + 0.2(0.39) = 0.63
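The whole horse-race calculation can be checked in a few lines. Note that keeping full precision gives about 0.625, versus the 0.63 obtained above by first rounding the posteriors to 0.61 and 0.39:

```python
# Conditional independence in the horse-race example: the likelihood of
# both earlier results factorizes given the track condition.
prior = {"F": 0.3, "S": 0.7}
p_mw = {"F": 0.8, "S": 0.4}          # P(MW wins | track)
p_aj = {"F": 0.9, "S": 0.5}          # P(AJ wins | track)
p_wr = {"F": 0.9, "S": 0.2}          # P(WR wins | track)

joint = {t: p_mw[t] * p_aj[t] * prior[t] for t in prior}   # unnormalized
total = sum(joint.values())                                # 0.216 + 0.14
track_post = {t: joint[t] / total for t in prior}          # ~0.61 / ~0.39
p_wr_wins = sum(p_wr[t] * track_post[t] for t in prior)
print(round(p_wr_wins, 3))   # → 0.625
```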
37
7.5. Bayesian Updating with Functional Likelihoods
Posterior probability:
P(A|B) = P(A,B)/P(B) = P(B|A)P(A)/P(B)
The conditional probability (likelihood) P(B|A) can be described by a particular probability distribution:
(1) Binomial
(2) Normal
38
Binomial likelihood distribution
Dichotomous (2-value) data: defective or not
Sequence of dichotomous outcomes: a series of quality tests
In each test, constant probability (p) of one of the 2 outcomes (defective)
The outcome of each test is independent of the others
The probability of a total number (r) of one kind of outcome (defective) out of (n) tests is

f(r|n, p) = C(n, r) p^r (1 − p)^(n−r)

Given in tables.
39
Binomial likelihood example
Demand for the new product can be:
High (H) P(H) = 0.2
Medium (M) P(M) = 0.3
Low (L) P(L) = 0.5
For each case, the probability (p) that an individual customer buys the product is:
H: p = 0.25
M: p = 0.1
L: p = 0.05
In a random sample of 5 customers, 1 buys.
40
Binomial likelihood example
For (n = 5, r = 1), likelihoods are obtained from a table, or calculated by f(1|5, p) = C(5, 1) p (1 − p)^4:
H: [5!/(4!·1!)] 0.25 (0.75)^4 = 0.3955
M: [5!/(4!·1!)] 0.1 (0.9)^4 = 0.3281
L: [5!/(4!·1!)] 0.05 (0.95)^4 = 0.2036
The joint probability table can now be constructed.
41
Binomial Likelihood Example
State Prior Likelihood Joint Posterior
State     Prior   Likelihood   Joint       Posterior
p         P(p)    P(1|5, p)    P(p, 1/5)   P(p|1/5)
H: 0.25   0.2     0.3955       0.0791      0.2832
M: 0.1    0.3     0.3281       0.0984      0.3523
L: 0.05   0.5     0.2036       0.1018      0.3645
          1.0                  0.2793      1.00
If 1 in a sample of 5 customers buys: P(H) increases, P(M) increases, P(L) decreases
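The likelihoods and posteriors can be reproduced with math.comb from Python's standard library:

```python
from math import comb

def binom_pmf(r, n, p):
    """Binomial likelihood f(r | n, p) = C(n, r) p^r (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

prior = {0.25: 0.2, 0.1: 0.3, 0.05: 0.5}       # p -> P(p)  (H, M, L)
lik = {p: binom_pmf(1, 5, p) for p in prior}   # 1 buyer out of 5
joint = {p: prior[p] * lik[p] for p in prior}
total = sum(joint.values())
post = {p: joint[p] / total for p in prior}
print({p: round(v, 4) for p, v in post.items()})
# Matches the table: 0.2832, 0.3523, 0.3645.
```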
42
Normal likelihood distribution
Most common, symmetric
Continuous data; can approximate discrete
Two parameters: mean (μ) and standard deviation (σ)

f(y|μ, σ) = [2πσ²]^(−1/2) exp{−0.5[(y − μ)/σ]²}

Given in tables; the formula is usually not used.
43
Normal likelihood Updating
The mean (μ) and standard deviation (σ) can be updated individually (assuming the other is known) or together:
P(μ|y, σ) ∝ f(y|μ, σ) P(μ|σ)
P(σ|y, μ) ∝ f(y|μ, σ) P(σ|μ)
P(μ, σ|y) ∝ f(y|μ, σ) P(μ, σ)
44
Normal likelihood example
Updating the Mean (μ)
The average weight setting has 2 possibilities:
High (H) P(H) = 0.5, μ = 8.2, σ = 0.1
Low (L) P(L) = 0.5, μ = 7.9, σ = 0.1
A sample of 1 bottle has weight y = 8.0 oz. What is the posterior probability of H and L?
45
Normal likelihood example
Likelihood values are obtained from a table, or calculated by
f(8|μ, σ) = [2π(0.01)]^(−1/2) exp{−0.5[(8 − μ)/0.1]²}
If μ = 8.2: Z = (8.0 − 8.2)/0.1 = −2, f(8|μ, σ) = 0.054
If μ = 7.9: Z = (8.0 − 7.9)/0.1 = 1, f(8|μ, σ) = 0.242
46
Normal Likelihood Example
State    Prior   Likelihood   Joint     Posterior
μ        P(μ)    P(8|μ, σ)    P(8, μ)   P(μ|8)
H: 8.2   0.5     0.054        0.027     0.18
L: 7.9   0.5     0.242        0.121     0.82
         1.0                  0.148     1.00
After a sample of 1 bottle with weight 8.0: P(H) decreases, P(L) increases
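The update can be checked with statistics.NormalDist. Its pdf returns φ(z)/σ rather than the table ordinate φ(z) used above, but since σ is the same in both states the posterior is identical:

```python
from statistics import NormalDist

prior = {8.2: 0.5, 7.9: 0.5}   # candidate mean -> P(mean)  (H, L)
y, sigma = 8.0, 0.1

# Normal likelihood of the observed weight under each candidate mean.
lik = {mu: NormalDist(mu, sigma).pdf(y) for mu in prior}
joint = {mu: prior[mu] * lik[mu] for mu in prior}
total = sum(joint.values())
post = {mu: joint[mu] / total for mu in prior}
print({mu: round(v, 2) for mu, v in post.items()})   # 8.2: 0.18, 7.9: 0.82
```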