CHAPTER 7: Subjective Probability and Bayesian Inference
1
CHAPTER 7
Subjective Probability and Bayesian Inference
2
7.1. Subjective Probability
Personal evaluation of probability by an individual decision maker
Uncertainty exists for the decision maker: probability is just a way of measuring it
In dealing with uncertainty, a coherent decision maker effectively uses subjective probability
3
7.2. Assessment of Subjective Probabilities
Simplest procedure:
Specify the set of all possible events and ask the decision maker to directly estimate the probability of each event
Not a good approach from a psychological point of view
Not easy to conceptualize, especially for a DM not familiar with probability
4
Standard Device
A physical instrument or conceptual model
A good tool for obtaining subjective probabilities
Example:
• A box containing 1000 balls
• Balls numbered 1 to 1000
• Balls have 2 colors: red and blue
5
Standard Device Example
To estimate a student's subjective probability of getting an "A" in SE 447, we ask him to choose between 2 bets:
Bet X: If he gets an A, he wins SR 100; if he doesn't get an A, he wins nothing
Bet Y: If he picks a red ball, he wins SR 100; if he picks a blue ball, he wins nothing
We start with a proportion of red balls P = 50%, then adjust successively until the 2 bets are equally attractive
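The successive-adjustment procedure is essentially a bisection search on the proportion of red balls. A minimal sketch, assuming a hypothetical prefers_bet_x(p) oracle that reports the decision maker's current preference (here simulated by a DM whose true subjective probability is 0.7):

```python
def assess_probability(prefers_bet_x, tol=0.001):
    """Bisect on the red-ball proportion until the two bets are equally attractive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2        # current proportion of red balls in the box
        if prefers_bet_x(p):
            lo = p               # Bet X still preferred: raise the proportion
        else:
            hi = p               # Bet Y preferred: lower the proportion
    return (lo + hi) / 2

# Simulated decision maker whose true subjective probability of an "A" is 0.7:
print(round(assess_probability(lambda p: p < 0.7), 2))   # → 0.7
```

In practice the oracle is the human decision maker, so far fewer (and coarser) steps are used than a tolerance of 0.001 suggests.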
6
Other standard devices
Pie diagram (spinner)
Circle divided into 2 sectors: 1 red, 1 blue
Bet Y: If he spins to the red section, he wins SR 100; if he spins to the blue section, he wins nothing
The size of the red section is adjusted until the 2 bets are equal
7
Subjective Probability Bias
Standard device must be easy to perceive, to avoid introducing bias
2 kinds of bias:
Task bias: resulting from assessment method (standard device)
Conceptual bias: resulting from mental procedures (heuristics) used by individuals to process information
8
Mental Heuristics Causing Bias
1. Representativeness
• If x highly represents set A, a high probability is given that x ∈ A
• Frequency (proportion) ignored
• Sample size ignored
2. Availability
• Limits of memory and imagination
3. Adjustment & anchoring
• Starting from an obvious reference point, then adjusting for new values
• Anchoring: the adjustment is typically not enough
4. Overconfidence
• Underestimating variance
9
Fractile Probability Assessment
Quartile Assessment: Determine 3 values:
x1, for which p(x > x1) = 0.5
x2, for which p(x < x2) = p(x2 < x < x1)
x3, for which p(x > x3) = p(x1 < x < x3)
On the cumulative distribution: F(x2) = 0.25, F(x1) = 0.5, F(x3) = 0.75
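The three quartile values are just the 0.25, 0.5, and 0.75 points of the cumulative distribution F(x). As an illustration only (the method itself assumes no particular distribution), here they are for a hypothetical Normal(100, 15) belief, using Python's standard library:

```python
from statistics import NormalDist

# Hypothetical subjective belief: Normal with mean 100, std. dev. 15.
dist = NormalDist(mu=100, sigma=15)

# Quartile values via the inverse CDF: F(x2)=0.25, F(x1)=0.50, F(x3)=0.75.
x2, x1, x3 = (dist.inv_cdf(q) for q in (0.25, 0.50, 0.75))
print(round(x2, 1), round(x1, 1), round(x3, 1))
# Each of the 4 intervals (-inf,x2), (x2,x1), (x1,x3), (x3,inf)
# then carries probability 0.25.
```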
10
Fractile Probability Assessment
Quartile Assessment: 4 intervals
Octile Assessment: 8 intervals
Tertile Assessment: 3 intervals; avoids anchoring at the median
11
Histogram Probability Assessment
Fix the points x1, x2 , …, xm.
Ask the decision maker to assess probabilities
p(x1 < x < x2)
p(x2 < x < x3)
…
Gives a probability distribution
(not a cumulative distribution, as in the fractile method)
12
Assessment Methods & Bias
No evidence to favor either fractile or histogram methods
One factor that reinforces anchoring bias is self-consistency
Bias can be reduced by “pre-assessment conditioning”: training, for/against arguments
The act of probability assessment causes a re-evaluation of uncertainty
13
7.3. Impact of New Information (Bayes' Theorem)
After developing a subjective probability distribution,
assume new information becomes available. Example: new data is collected.
According to the coherence principle, the DM must take the new information into consideration, thus
the subjective probability must be revised.
How? Using Bayes' theorem.
14
Bayes’ Theorem Example
Suppose your subjective probability distribution for weather tomorrow is:
• chances of being sunny P(S) = 0.6• chances of being not sunny P(N) = 0.4
Suppose the TV weather forecast predicts a cloudy day tomorrow. How should you change P(S)?
Assume we are dealing with mutually exclusive and collectively exhaustive events such as sunny or not sunny.
15
Impact of Information
We assume the weather forecaster predicts either
• a cloudy day C, or
• a bright day B
To change P(S), we use: joint probability = conditional probability × marginal probability
• P(C,S) = P(C|S)P(S)
• P(B,S) = P(B|S)P(S)
• P(C,N) = P(C|N)P(N)
• P(B,N) = P(B|N)P(N)
16
Impact of Information
To obtain the joint probability mass function (j.p.m.f.), we need the conditional probabilities P(C|S) and P(C|N).
These can be obtained from historical data. How?
In the past 100 sunny days, a cloudy forecast was given on 20 days: P(C|S) = 0.2, P(B|S) = 0.8
In the past 100 not-sunny days, a cloudy forecast was given on 90 days: P(C|N) = 0.9, P(B|N) = 0.1
17
Joint probability Calculations
Joint probability P(A,B) = conditional probability (likelihood) P(A|B) * marginal probability P(B)
Cloudy forecast:
• P(C,S) = P(C|S)P(S) = 0.2(0.6) = 0.12
• P(C,N) = P(C|N)P(N) = 0.9(0.4) = 0.36
Sunny forecast:
• P(B,S) = P(B|S)P(S) = 0.8(0.6) = 0.48
• P(B,N) = P(B|N)P(N) = 0.1(0.4) = 0.04
18
Bayes’ Theorem
        S         N
C    P(C,S)    P(C,N)    P(C)
B    P(B,S)    P(B,N)    P(B)
     P(S)      P(N)

P(S|C) = P(C,S) / P(C) = P(C|S)P(S) / P(C)
P(S|C) ∝ P(C|S)P(S)
19
Joint probability Table
       S       N
C    0.12    0.36    0.48
B    0.48    0.04    0.52
     0.60    0.40    1.00
P(S|C) = 0.12/0.48 = 0.25 posterior (conditional) probability
Compare to P(S) = 0.6 prior (marginal) prob.
P(S) decreased because of C forecast
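The forecast example can be sketched in a few lines of Python; the names prior and lik_cloudy are illustrative, not from the text:

```python
# Weather example: prior, likelihoods, joint table, posterior.
prior = {"S": 0.6, "N": 0.4}          # sunny / not sunny
lik_cloudy = {"S": 0.2, "N": 0.9}     # P(C | state) from historical data

joint = {s: lik_cloudy[s] * prior[s] for s in prior}   # P(C, state)
p_c = sum(joint.values())                              # marginal P(C) = 0.48
posterior = {s: joint[s] / p_c for s in prior}         # P(state | C)
print(round(posterior["S"], 2))   # → 0.25
```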
20
Prior and Posterior Probabilities
Prior means before.
Prior probability is the probability P(S) before the information was heard.
Posterior means after.
It is the probability obtained after incorporating the new forecast information: P(S|C).
It is obtained using Bayes' theorem.
21
Example with 3 states
3 demand possibilities for the new product:
High P(H) = 0.6
Medium P(M) = 0.1
Low P(L) = 0.3
Market research gives an Average result:
30% of the time if true demand is High
50% of the time if true demand is Medium
90% of the time if true demand is Low
22
Example : Probability Table for Average result
State   Prior   Likelihood   Joint    Posterior
S       P(S)    P(A|S)       P(S,A)   P(S|A)
H       0.6     0.3          0.18     0.36
M       0.1     0.5          0.05     0.10
L       0.3     0.9          0.27     0.54
        1.0                  0.50     1.00
If market research gives an Average result: P(H) decreases, P(L) increases, P(M) is unchanged
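The same table arithmetic, as a small reusable sketch (the bayes_update helper is illustrative, not from the text):

```python
def bayes_update(prior, likelihood):
    """Posterior over states given one observation's likelihoods."""
    joint = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(joint.values())          # marginal probability of the observation
    return {s: joint[s] / total for s in joint}

prior = {"H": 0.6, "M": 0.1, "L": 0.3}
lik_average = {"H": 0.3, "M": 0.5, "L": 0.9}   # P(Average result | state)
post = bayes_update(prior, lik_average)
print({s: round(p, 2) for s, p in post.items()})   # H: 0.36, M: 0.10, L: 0.54
```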
23
Ex: Sequential Bayesian Analysis
An oil company has 3 drilling sites: X, Y, Z
3 possible reserve states:
No reserves P(N) = 0.5
Small reserves P(S) = 0.3
Large reserves P(L) = 0.2
3 possible drilling outcomes:
Dry D
Wet W
Gushing G
24
Ex: Sequential Bayesian Analysis
If reserves are:
None (N): all wells will be Dry (D)
Large (L): all wells will be Gushing (G)
Small (S): some Dry (D) and some Wet (W) wells:
P(1 Dry out of 1) = 0.8, P(2 Dry out of 2) = 0.2, P(3 Dry out of 3) = 0
All sites are equally favorable. Assume the order of drilling is X, Y, Z.
Notation: DX = a Dry well at site X
25
Ex: Probability Table for Site X
State   Prior   Conditional                 Joint
K       P(K)    P(D|K)  P(W|K)  P(G|K)      DX     WX     GX
N       0.5     1       0       0           0.5    0      0
S       0.3     0.8     0.2     0           0.24   0.06   0
L       0.2     0       0       1           0      0      0.2
        1.0                                 0.74   0.06   0.2
Stop exploratory drilling at X if you get:
W: Reserves are S, or
G: Reserves are L
If you get D, drill at Y. (Posteriors: N: P = 0.5/0.74 = 0.68, S: P = 0.24/0.74 = 0.32)
26
Ex: Probability Table for Site Y
State   Prior   Conditional         Joint
K       P(K)    P(D|K)  P(W|K)      DY     WY
N       0.68    1       0           0.68   0
S       0.32    0.5     0.5         0.16   0.16
        1.0                         0.84   0.16
(conditional 0.5 = 0.4/0.8)
Stop exploratory drilling at Y if you get W: Reserves are S
If you get D, drill at Z. (Posteriors: N: P = 0.68/0.84 = 0.81, S: P = 0.16/0.84 = 0.19)
27
Ex: Probability Table for Site Z
State   Prior   Conditional         Joint
K       P(K)    P(D|K)  P(W|K)      DZ     WZ
N       0.81    1       0           0.81   0
S       0.19    0       1           0      0.19
        1.0                         0.81   0.19
If you get W: Reserves are S
If you get D:Reserves are N
28
Ex: Change in P(N)
Prior Probability: Before drilling, P(N) = 0.5
Posterior Probabilities:
After 1 Dry: P(N|DX) = 0.68
After 2 Dry: P(N|DX, DY) = 0.81
After 3 Dry: P(N|DX, DY, DZ) = 1
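The three sequential updates can be chained in code. A sketch using an illustrative bayes_update helper and the conditional dry-well probabilities from the three site tables:

```python
def bayes_update(prior, likelihood):
    """One Bayesian revision: posterior over reserve states given a dry well."""
    joint = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

# Reserve-state priors and P(Dry | state) at each successive site,
# taken from the probability tables for sites X, Y, and Z.
p = {"N": 0.5, "S": 0.3, "L": 0.2}
for lik_dry in ({"N": 1.0, "S": 0.8, "L": 0.0},    # site X
                {"N": 1.0, "S": 0.5, "L": 0.0},    # site Y
                {"N": 1.0, "S": 0.0, "L": 0.0}):   # site Z
    p = bayes_update(p, lik_dry)
    print({k: round(v, 2) for k, v in p.items()})
# P(N) climbs 0.5 -> 0.68 -> 0.81 -> 1.0 as each dry well is observed.
```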
29
7.4. Conditional Independence
Two events, A and B, are independent iff P(A, B) = P(A)·P(B), implying P(A|B) = P(A) and P(B|A) = P(B)
The posterior probability is the same as the prior
New information about 1 event does not affect the probability of the other
30
Conditional Independence
Two events, A and B, are conditionally independent iff:
P(A, B) ≠ P(A)·P(B)
but their probabilities conditional on a 3rd event, C, are independent:
P(A, B|C) = P(A|C)·P(B|C)
Useful property in Bayesian analysis
31
Ex: Horse Race Probability
Horse named WR will race at 3:00 p.m.
Probability of WR winning P(WR) depends on track condition
Firm (F) P(F) = 0.3 P(WR|F) = 0.9
Soft (S) P(S) = 0.7 P(WR|S) = 0.2
32
Ex: Horse Race Probability
Given the results of 2 previous races
At 1:30, horse named MW won the race P(MW|F) = 0.8 P(MW|S) = 0.4
At 2:00, horse named AJ won the race P(AJ|F) = 0.9 P(AJ|S) = 0.5
What is the new WR win probability P(WR|MW, AJ)?
33
Ex: Horse Race Probability
The probability that WR wins given that MW and AJ have won must sum the conditional probabilities over both possible track conditions, F and S:
P(WR|MW, AJ) = P(WR|F)·P(F|MW, AJ) + P(WR|S)·P(S|MW, AJ)
34
Ex: Horse Race Probability
The 2 events MW and AJ are conditionally independent with respect to a 3rd event: track condition F or S
Recall: P(A|B) = P(B|A)P(A)/P(B), so P(A|B) ∝ P(B|A)P(A)
P(F|MW, AJ) ∝ P(MW, AJ|F) P(F) = P(MW|F) P(AJ|F) P(F)
P(S|MW, AJ) ∝ P(MW, AJ|S) P(S) = P(MW|S) P(AJ|S) P(S)
35
Ex: Horse Race Probability
P(F|MW, AJ) ∝ P(MW|F) P(AJ|F) P(F) = 0.8 × 0.9 × 0.3 = 0.216
P(S|MW, AJ) ∝ P(MW|S) P(AJ|S) P(S) = 0.4 × 0.5 × 0.7 = 0.14
Normalizing:
P(F|MW, AJ) = 0.216/(0.216 + 0.14) = 0.61
P(S|MW, AJ) = 0.14/(0.216 + 0.14) = 0.39
36
Ex: Horse Race Probability
Substituting into
P(WR|MW, AJ) = P(WR|F)·P(F|MW, AJ) + P(WR|S)·P(S|MW, AJ)
= 0.9(0.61) + 0.2(0.39) = 0.63
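The whole horse-race calculation can be checked in a few lines. Note that keeping full precision gives about 0.625, versus the 0.63 obtained above by first rounding the posteriors to 0.61 and 0.39:

```python
# Conditional independence in the horse-race example: the likelihood of
# both earlier results factorizes given the track condition.
prior = {"F": 0.3, "S": 0.7}
p_mw = {"F": 0.8, "S": 0.4}          # P(MW wins | track)
p_aj = {"F": 0.9, "S": 0.5}          # P(AJ wins | track)
p_wr = {"F": 0.9, "S": 0.2}          # P(WR wins | track)

joint = {t: p_mw[t] * p_aj[t] * prior[t] for t in prior}   # unnormalized
total = sum(joint.values())                                # 0.216 + 0.14
track_post = {t: joint[t] / total for t in prior}          # ~0.61 / ~0.39
p_wr_wins = sum(p_wr[t] * track_post[t] for t in prior)
print(round(p_wr_wins, 3))   # → 0.625
```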
37
7.5. Bayesian Updating with Functional Likelihoods
Posterior probability:
P(A|B) = P(A,B)/P(B) = P(B|A)P(A)/P(B)
The conditional probability (likelihood) P(B|A) can be described by a particular probability distribution:
(1) Binomial
(2) Normal
38
Binomial likelihood distribution
Dichotomous (2-value) data: defective or not
Sequence of dichotomous outcomes: a series of quality tests
In each test, constant probability (p) of one of the 2 outcomes (defective)
The outcome of each test is independent of the others
The probability of a total number (r) of one kind of outcome (defective) out of (n) tests is

f(r|n, p) = C(n, r) p^r (1 − p)^(n−r)

Given in tables.
39
Binomial likelihood example
Demand for the new product can be:
High (H) P(H) = 0.2
Medium (M) P(M) = 0.3
Low (L) P(L) = 0.5
For each case, the probability (p) that an individual customer buys the product is:
H: p = 0.25
M: p = 0.1
L: p = 0.05
In a random sample of 5 customers, 1 buys.
40
Binomial likelihood example
For (n = 5, r = 1), likelihoods are obtained from a table, or calculated by f(1|5, p) = C(5, 1) p (1 − p)^4:
H: [5!/(4!·1!)] 0.25 (0.75)^4 = 0.3955
M: [5!/(4!·1!)] 0.1 (0.9)^4 = 0.3281
L: [5!/(4!·1!)] 0.05 (0.95)^4 = 0.2036
The joint probability table can now be constructed.
41
Binomial Likelihood Example
State Prior Likelihood Joint Posterior
State     Prior   Likelihood   Joint       Posterior
p         P(p)    P(1|5, p)    P(p, 1/5)   P(p|1/5)
H: 0.25   0.2     0.3955       0.0791      0.2832
M: 0.1    0.3     0.3281       0.0984      0.3523
L: 0.05   0.5     0.2036       0.1018      0.3645
          1.0                  0.2793      1.00
If 1 in a sample of 5 customers buys: P(H) increases, P(M) increases, P(L) decreases
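The likelihoods and posteriors can be reproduced with math.comb from Python's standard library:

```python
from math import comb

def binom_pmf(r, n, p):
    """Binomial likelihood f(r | n, p) = C(n, r) p^r (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

prior = {0.25: 0.2, 0.1: 0.3, 0.05: 0.5}       # p -> P(p)  (H, M, L)
lik = {p: binom_pmf(1, 5, p) for p in prior}   # 1 buyer out of 5
joint = {p: prior[p] * lik[p] for p in prior}
total = sum(joint.values())
post = {p: joint[p] / total for p in prior}
print({p: round(v, 4) for p, v in post.items()})
# Matches the table: 0.2832, 0.3523, 0.3645.
```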
42
Normal likelihood distribution
Most common, symmetric
Continuous data; can approximate discrete
Two parameters: mean (μ) and standard deviation (σ)

f(y|μ, σ) = [2πσ²]^(−1/2) exp{−0.5[(y − μ)/σ]²}

Given in tables; the formula is usually not used.
43
Normal likelihood Updating
The mean (μ) and standard deviation (σ) can be updated individually (assuming the other is known) or together:
P(μ|y, σ) ∝ f(y|μ, σ) P(μ|σ)
P(σ|y, μ) ∝ f(y|μ, σ) P(σ|μ)
P(μ, σ|y) ∝ f(y|μ, σ) P(μ, σ)
44
Normal likelihood example
Updating the Mean (μ)
The average weight setting has 2 possibilities:
High (H) P(H) = 0.5, μ = 8.2, σ = 0.1
Low (L) P(L) = 0.5, μ = 7.9, σ = 0.1
A sample of 1 bottle has weight y = 8.0 oz. What is the posterior probability of H and L?
45
Normal likelihood example
Likelihood values are obtained from a table, or calculated by
f(8|μ, σ) = [2π(0.01)]^(−1/2) exp{−0.5[(8 − μ)/0.1]²}
If μ = 8.2: Z = (8.0 − 8.2)/0.1 = −2, f(8|μ, σ) = 0.054
If μ = 7.9: Z = (8.0 − 7.9)/0.1 = 1, f(8|μ, σ) = 0.242
46
Normal Likelihood Example
State    Prior   Likelihood   Joint     Posterior
μ        P(μ)    P(8|μ, σ)    P(8, μ)   P(μ|8)
H: 8.2   0.5     0.054        0.027     0.18
L: 7.9   0.5     0.242        0.121     0.82
         1.0                  0.148     1.00
After a sample of 1 bottle with weight 8.0: P(H) decreases, P(L) increases
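The update can be checked with statistics.NormalDist. Its pdf returns φ(z)/σ rather than the table ordinate φ(z) used above, but since σ is the same in both states the posterior is identical:

```python
from statistics import NormalDist

prior = {8.2: 0.5, 7.9: 0.5}   # candidate mean -> P(mean)  (H, L)
y, sigma = 8.0, 0.1

# Normal likelihood of the observed weight under each candidate mean.
lik = {mu: NormalDist(mu, sigma).pdf(y) for mu in prior}
joint = {mu: prior[mu] * lik[mu] for mu in prior}
total = sum(joint.values())
post = {mu: joint[mu] / total for mu in prior}
print({mu: round(v, 2) for mu, v in post.items()})   # 8.2: 0.18, 7.9: 0.82
```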