Introduction to Probability and Bayesian Decision Making Soo-Hyung Kim Department of Computer Science Chonnam National University




Bayesian Decision Making
Definition of Probability
Conditional Probability
Bayes' Theorem
Probability Distribution
Gaussian Random Variable
Naïve Bayesian Decision
References


Bayesian Decision Making (1/2)

Male/Female Classification

Given a priori data of <height, sex> pairs:

(168, m) (146, f)
(173, m) (160, f)
(157, m) (156, f)
(163, m) (159, f)
(162, m) (149, f)

What is the sex of a person whose height is 160?

$$P(m \mid \text{height} = 160) = ? \qquad P(f \mid \text{height} = 160) = ?$$


Bayesian Decision Making (2/2)

UCI-Iris Data Classification

Given a dataset of 150 tuples of <l1, w1, l2, w2, class>

4 numeric attributes (cm):

                Min   Max   Mean   SD     Class correlation
sepal length:   4.3   7.9   5.84   0.83    0.7826
sepal width:    2.0   4.4   3.05   0.43   -0.4194
petal length:   1.0   6.9   3.76   1.76    0.9490
petal width:    0.1   2.5   1.20   0.76    0.9565

3 types of class: Iris Setosa, Iris Versicolour, Iris Virginica

What is the class of the data <5.1, 3.0, 4.9, 0.5>?

$$P(\text{Setosa} \mid 5.1, 3.0, 4.9, 0.5) = ? \qquad P(\text{Versicolour} \mid 5.1, 3.0, 4.9, 0.5) = ? \qquad P(\text{Virginica} \mid 5.1, 3.0, 4.9, 0.5) = ?$$


UCI-Iris Data


Definition of Probability (1/4)

Experiment & Probability
  Experiment = procedure + observation
  Sample: a possible outcome of an experiment
  Sample space: set of all samples, $\Omega = \{s_1, s_2, \ldots, s_N\}$
  Event: a set of samples (a subset of $\Omega$, $A \subseteq \Omega$)
  Probability: a value associated with an event, $P(A)$


Definition of Probability (2/4)

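Various Definitions of Probability
  Relative frequency: $P(A) = \lim_{n \to \infty} \frac{N(A, n)}{n}$, where $N(A, n)$ is the number of times $A$ occurs in $n$ trials
  Classical: $P(A) = \frac{|A|}{|\Omega|}$, where samples are all equally likely
  Axiomatic Model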
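The relative-frequency and classical definitions can be compared numerically. The following is a minimal Python sketch, assuming a fair six-sided die as the experiment; the die, the event "roll is even", and the function names are illustrative choices that do not appear in the slides.

```python
import random

def classical_probability(event, sample_space):
    """P(A) = |A| / |Omega|, valid only when all samples are equally likely."""
    return len(event & sample_space) / len(sample_space)

def relative_frequency(event, sample_space, n, seed=0):
    """Estimate P(A) as N(A, n) / n from n simulated trials."""
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    outcomes = sorted(sample_space)
    hits = sum(rng.choice(outcomes) in event for _ in range(n))
    return hits / n

omega = {1, 2, 3, 4, 5, 6}               # fair die (illustrative assumption)
even = {2, 4, 6}                         # event A: the roll is even

print(classical_probability(even, omega))            # 0.5
for n in (100, 10_000, 100_000):
    # The estimate should drift toward 0.5 as n grows.
    print(n, relative_frequency(even, omega, n))
```

The simulated ratio only approximates the limit; the classical value is exact because the six outcomes are assumed equally likely.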


Definition of Probability (3/4)

Probability Axioms (A. N. Kolmogorov)
  1. For any event $A$, $P(A) \ge 0$
  2. $P(\Omega) = 1$
  3. For any countable collection $A_1, A_2, \ldots$ of mutually exclusive events, $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$


Definition of Probability (4/4)

Properties of Probability
  $P(\emptyset) = 0$
  $P(A^c) = 1 - P(A)$
  $P(A \cup B) = P(A) + P(B) - P(A, B)$
  If $A$ and $B$ are mutually exclusive, $P(A \cup B) = P(A) + P(B)$
  If $A \subseteq B$, $P(A) \le P(B)$
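These properties can be checked on a small finite sample space. The sketch below assumes a fair die with equally likely samples and two hypothetical events A and B chosen so that A is a subset of B; exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))           # fair die, equally likely samples (assumption)

def prob(event):
    """Classical probability |A| / |Omega| as an exact fraction."""
    return Fraction(len(event & omega), len(omega))

A = frozenset({1, 2})                    # hypothetical events with A a subset of B
B = frozenset({1, 2, 3, 4})

assert prob(frozenset()) == 0                            # P(empty set) = 0
assert prob(omega - A) == 1 - prob(A)                    # complement rule
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)    # union rule
assert A <= B and prob(A) <= prob(B)                     # subset implies smaller probability
print("all properties hold for this example")
```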


Conditional Probability (1/3)

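Probability of event A given the occurrence of event B

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A, B)}{P(B)}$$

$$P(A, B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$$

$$P(A, B, C) = P(A)\,P(B \mid A)\,P(C \mid A, B)$$

Independence: if A & B are independent events,

$$P(A \mid B) = P(A), \qquad P(B \mid A) = P(B)$$

$$P(A, B) = P(A)\,P(B)$$

and, for mutually independent A, B, and C,

$$P(A, B, C) = P(A)\,P(B)\,P(C)$$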
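As a concrete illustration of conditional probability and independence, the following sketch enumerates the 36 outcomes of two fair dice. The dice experiment and the three events are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))    # two fair dice, 36 equally likely samples

def prob(event):
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A, B) / P(B); only defined when P(B) > 0."""
    return prob(a & b) / prob(b)

first_is_six = {s for s in omega if s[0] == 6}
second_is_six = {s for s in omega if s[1] == 6}
sum_is_twelve = {s for s in omega if sum(s) == 12}

# Independent events: conditioning does not change the probability,
# and the joint probability factorizes.
print(cond_prob(first_is_six, second_is_six) == prob(first_is_six))                       # True
print(prob(first_is_six & second_is_six) == prob(first_is_six) * prob(second_is_six))    # True

# Dependent events: knowing the first die is six raises P(sum = 12) from 1/36 to 1/6.
print(prob(sum_is_twelve), cond_prob(sum_is_twelve, first_is_six))
```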


Conditional Probability (2/3)

Properties of P(A|B)
  1. For any events $A$ and $B$, $P(A \mid B) \ge 0$
  2. $P(B \mid B) = 1$
  3. If $A = A_1 \cup A_2 \cup \cdots$ where $A_1, A_2, \ldots$ are mutually exclusive, $P(A \mid B) = P(A_1 \mid B) + P(A_2 \mid B) + \cdots$


Conditional Probability (3/3)

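Total Probability Law
  Event space: a set $\{B_1, B_2, \ldots, B_m\}$ of events which are
    mutually exclusive: $B_i \cap B_j = \emptyset$, $i \ne j$
    collectively exhaustive: $B_1 \cup B_2 \cup \cdots \cup B_m = \Omega$
  For an event space $\{B_1, B_2, \ldots, B_m\}$ with $P(B_i) > 0$, and $C_i = A \cap B_i$,

$$P(A) = \sum_{i=1}^{m} P(A \cap B_i) = \sum_{i=1}^{m} P(B_i)\,P(A \mid B_i)$$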
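The total probability law is a weighted sum, which the sketch below computes directly. The three-machine scenario and all of its numbers are hypothetical and serve only to exercise the formula.

```python
import math

def total_probability(priors, likelihoods):
    """Return P(A) = sum_i P(B_i) * P(A | B_i) for an event space {B_i}."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one P(A | B_i) per event B_i")
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("the P(B_i) must sum to 1 (collectively exhaustive)")
    return sum(p_b * p_a_b for p_b, p_a_b in zip(priors, likelihoods))

# Hypothetical numbers: machines B_1..B_3 make all parts; A = "part is defective".
p_b = [0.5, 0.3, 0.2]               # P(B_i): share of parts from each machine
p_a_given_b = [0.01, 0.02, 0.03]    # P(A | B_i): defect rate of each machine

print(round(total_probability(p_b, p_a_given_b), 4))   # 0.017
```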


Bayes’ Theorem (1/2)

From the definition of conditional probability,

$$P(C_i, A) = P(C_i)\,P(A \mid C_i) = P(A)\,P(C_i \mid A) \;\Rightarrow\; P(C_i \mid A) = \frac{P(C_i)\,P(A \mid C_i)}{P(A)}$$

If the set $\{C_1, C_2, \ldots, C_m\}$ is an event space then, from the total probability law,

$$P(C_i \mid A) = \frac{P(A \mid C_i)\,P(C_i)}{P(A)} = \frac{P(A \mid C_i)\,P(C_i)}{\sum_{j=1}^{m} P(A \mid C_j)\,P(C_j)}$$
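Bayes' theorem over an event space amounts to normalizing the products of prior and likelihood. A minimal sketch follows; the function name `posteriors` and the three-class numbers are illustrative assumptions, not values from the slides.

```python
def posteriors(priors, likelihoods):
    """Return [P(C_i | A)] given [P(C_i)] and [P(A | C_i)] for an event space {C_i}."""
    joint = [p_c * p_a_c for p_c, p_a_c in zip(priors, likelihoods)]   # P(C_i) P(A | C_i)
    evidence = sum(joint)                                              # P(A), total probability law
    if evidence == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / evidence for j in joint]

# Hypothetical three-class example.
p_c = [0.6, 0.3, 0.1]           # priors P(C_i)
p_a_given_c = [0.1, 0.4, 0.8]   # likelihoods P(A | C_i)

post = posteriors(p_c, p_a_given_c)
best = max(range(len(post)), key=post.__getitem__)
print([round(p, 3) for p in post], "-> most probable class index:", best)
```

Note that the class with the largest prior (index 0) is not the one with the largest posterior here, because the evidence A is far more likely under the other classes.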


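Bayes' Theorem (2/2)

Posterior Probability

$$P(C_i \mid A) = \frac{P(A \mid C_i)\,P(C_i)}{P(A)}, \qquad \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$

Example Application
  A: cough
  Set of diseases: C = {C1 (influenza), C2 (hyperlipidemia), …, Cm (lung cancer)}
  Determine which disease a coughing patient has:
  P(influenza | cough), P(hyperlipidemia | cough), …, P(lung cancer | cough)

Generalization

$$P(C_i \mid A_1, A_2, \ldots, A_n) \propto P(A_1, A_2, \ldots, A_n \mid C_i)\,P(C_i)$$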


Probability Distribution

Probability Model, $P(\cdot)$
  A function that assigns a probability to each sample
    Histogram
    Table
    Mathematical formula


Random Variable

A function that assigns a real value to each element in the sample space ($\Omega$)
  $X: s_i \mapsto x$, where $s_i \in \Omega$, $x \in \mathbb{R}$
  If $s_i$ = aaraaa, $X(s_i) = 5$ (number of a's)
Probability model for a discrete random variable
  $P_K(k)$: probability mass function (PMF)
Probability model for a continuous random variable
  $f_X(x)$: probability density function (PDF)
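The "number of a's" example can be written as an ordinary function on samples, and a PMF follows by counting. In the sketch below, the sample space of all length-3 strings over {a, r} with equally likely samples is an assumption added for illustration; only the string aaraaa comes from the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def X(sample):
    """Random variable: maps a sample (a string) to the number of 'a' characters."""
    return sample.count("a")

print(X("aaraaa"))    # 5

# Assumed sample space: all strings of length 3 over {a, r}, equally likely.
omega = ["".join(chars) for chars in product("ar", repeat=3)]
counts = Counter(X(s) for s in omega)

# PMF P_K(k) = P(K = k), obtained by counting the samples that map to each k.
pmf = {k: Fraction(c, len(omega)) for k, c in sorted(counts.items())}
for k, p in pmf.items():
    print(k, p)
```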


Cumulative Probability Distribution (CDF)

$$F_R(r) = P(R \le r)$$


PMF vs PDF

PMF: $P_K(k) = P(K = k)$

PDF: $f_X(x) = \frac{dF_X(x)}{dx}$

$$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f_X(x)\,dx = F_X(x_2) - F_X(x_1)$$

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$
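The relation between a PDF and its CDF can be verified numerically. The sketch below assumes a simple hypothetical density, f(x) = 2x on [0, 1] with CDF F(x) = x^2, and a hand-written midpoint-rule integrator; none of these appear in the slides.

```python
def pdf(x):
    """Hypothetical density f_X(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x):
    """Matching CDF F_X(x) = x^2 on [0, 1]."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

def integrate(f, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

x1, x2 = 0.2, 0.7
by_integration = integrate(pdf, x1, x2)    # integral of f_X from x1 to x2
by_cdf = cdf(x2) - cdf(x1)                 # F_X(x2) - F_X(x1)
print(by_integration, by_cdf)

# The PDF is the derivative of the CDF (central difference at x = 0.5).
h = 1e-6
print((cdf(0.5 + h) - cdf(0.5 - h)) / (2 * h), pdf(0.5))
```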


Gaussian Random Variable (1/6)

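The PDF of a Gaussian random variable $X$ has the form

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$

$\mu$ is the average; $\sigma$ is the standard deviation ($\sigma > 0$)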


Gaussian Random Variable (2/6)

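Example #1: 10 pairs of <height, sex>

(168, m) (146, f)
(173, m) (160, f)
(157, m) (156, f)
(163, m) (159, f)
(162, m) (149, f)

MLE for the PDF of H (Height) for the class m

$$\mu_H = (168 + 173 + 157 + 163 + 162)/5 = 164.6$$

$$\sigma_H^2 = \frac{1}{5}\sum_{i=1}^{5} (\text{height}_i - \mu_H)^2 \approx (5.5)^2$$

$$f_H(\text{height} \mid m) = \frac{1}{\sqrt{2\pi(5.5)^2}}\, e^{-(\text{height} - 164.6)^2 / (2(5.5)^2)}$$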
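The maximum-likelihood estimates for class m can be reproduced in a few lines. This is a minimal sketch using the ten pairs from the example; the helper name `gaussian_mle` is an assumption. Note that the MLE of the variance divides by n rather than n - 1.

```python
import math

# The ten <height, sex> pairs from Example #1.
data = [(168, "m"), (146, "f"), (173, "m"), (160, "f"), (157, "m"),
        (156, "f"), (163, "m"), (159, "f"), (162, "m"), (149, "f")]

def gaussian_mle(values):
    """Return the maximum-likelihood (mu, sigma) of a 1-D Gaussian."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n    # MLE divides by n, not n - 1
    return mu, math.sqrt(var)

male_heights = [h for h, sex in data if sex == "m"]
mu_m, sigma_m = gaussian_mle(male_heights)
print(round(mu_m, 1), round(sigma_m, 2))    # 164.6 5.46 (the slide rounds sigma to 5.5)
```

The same function applied to the heights labelled f gives the parameters for the other class.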


Gaussian Random Variable (3/6)

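MLE for the PDF of H (Height) for the class f

$$f_H(\text{height} \mid f) = \frac{1}{\sqrt{2\pi(5.6)^2}}\, e^{-(\text{height} - 154.0)^2 / (2(5.6)^2)}$$

Classification of a person whose height is 160, with priors $P(m) = P(f) = 5/10$ taken from the sample:

$$P(m \mid \text{height} = 160) = \frac{f_H(160 \mid m)\,P(m)}{f_H(160)} = \frac{f_H(160 \mid m)\,P(m)}{f_H(160 \mid m)\,P(m) + f_H(160 \mid f)\,P(f)} \approx 0.56$$

$$P(f \mid \text{height} = 160) = 1 - P(m \mid \text{height} = 160) \approx 0.44$$

Classify the data into male (with a probability of about 0.56, using the rounded parameters above)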
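The posterior for a height of 160 can be evaluated directly from the two fitted densities. The sketch below uses the rounded parameters quoted in the example (164.6 and 5.5 for m, 154.0 and 5.6 for f) and assumes equal priors of 0.5, matching the five samples per class; the function names are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """1-D Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

params = {"m": (164.6, 5.5), "f": (154.0, 5.6)}    # rounded (mu, sigma) from the example
priors = {"m": 0.5, "f": 0.5}                      # assumption: 5 of the 10 samples per class

def posterior(height):
    """Return {class: P(class | height)} via Bayes' theorem."""
    joint = {c: gaussian_pdf(height, *params[c]) * priors[c] for c in params}
    evidence = sum(joint.values())                 # f_H(height), total probability law
    return {c: j / evidence for c, j in joint.items()}

post = posterior(160)
decision = max(post, key=post.get)
print(decision, {c: round(p, 2) for c, p in post.items()})
```

Because the priors are equal, the decision reduces to comparing the two class-conditional densities at 160.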


[Figure: height PDFs for classes f and m; axis marks at 154, 160, and 164]


Gaussian Random Variable (4/6)

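The PDF of an n-dimensional Gaussian random vector $\mathbf{X}$ has the form

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\,|C_{\mathbf{X}}|^{1/2}} \exp\!\left(-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})^T C_{\mathbf{X}}^{-1} (\mathbf{x} - \boldsymbol{\mu}_{\mathbf{X}})\right)$$

$\boldsymbol{\mu}_{\mathbf{X}}$ is the average vector; $C_{\mathbf{X}}$ is the covariance matrix

$$\mathbf{x} = [x_1, \ldots, x_n]^T, \qquad \boldsymbol{\mu}_{\mathbf{X}} = [\mu_1, \ldots, \mu_n]^T, \qquad C_{\mathbf{X}} = [c_{ij}]$$

where $c_{ij} = \mathrm{Cov}(x_i, x_j) = E(x_i x_j) - \mu_i \mu_j$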
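To make the multivariate formula concrete, here is a minimal pure-Python sketch restricted to the 2-D case, where the determinant and inverse of the covariance matrix have closed forms. The restriction to n = 2 and the function name are assumptions made to keep the example dependency-free; a general implementation would use a linear algebra library.

```python
import math

def gaussian_pdf_2d(x, mu, cov):
    """Density of a 2-D Gaussian with mean vector mu and 2x2 covariance matrix cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c                                  # |C_X|
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    inv = [[d / det, -b / det], [-c / det, a / det]]     # C_X^{-1} for a 2x2 matrix
    dx = [x[0] - mu[0], x[1] - mu[1]]                    # x - mu_X
    # Quadratic form (x - mu)^T C^{-1} (x - mu).
    q = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    n = 2
    return math.exp(-0.5 * q) / ((2 * math.pi) ** (n / 2) * math.sqrt(det))

# At the mean of a standard 2-D Gaussian the density is 1 / (2 * pi).
print(gaussian_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

With a diagonal covariance matrix the result equals the product of two 1-D Gaussian densities, which is the situation the naïve Bayesian decision later assumes.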


Gaussian Random Variable (5/6)

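Example #2: UCI-Iris data

Learning Phase
  MLE of the PDF for a 4-D random vector X for the individual classes:
  $f_{\mathbf{X}}(\mathbf{x} \mid \text{Setosa})$, $f_{\mathbf{X}}(\mathbf{x} \mid \text{Versicolour})$, $f_{\mathbf{X}}(\mathbf{x} \mid \text{Virginica})$
  Using a part of the data (e.g., 30 out of the 50 samples of each class)

Generalization (Testing) Phase
  Using samples which were not used in learning
  If $\mathbf{x} = \langle 5.1, 3.0, 4.9, 0.5 \rangle$, classify $\mathbf{x}$ into the class having the maximum posterior probability among
  $P(\text{Setosa} \mid \mathbf{x})$, $P(\text{Versicolour} \mid \mathbf{x})$, $P(\text{Virginica} \mid \mathbf{x})$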


Gaussian Random Variable (6/6)

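Let $D(\mathbf{x}) = f_{\mathbf{X}}(\mathbf{x} \mid \text{Setosa})\,P(\text{Setosa}) + f_{\mathbf{X}}(\mathbf{x} \mid \text{Versicolour})\,P(\text{Versicolour}) + f_{\mathbf{X}}(\mathbf{x} \mid \text{Virginica})\,P(\text{Virginica})$. Then

$$P(\text{Setosa} \mid \mathbf{x}) = \frac{f_{\mathbf{X}}(\mathbf{x} \mid \text{Setosa})\,P(\text{Setosa})}{D(\mathbf{x})}$$

$$P(\text{Versicolour} \mid \mathbf{x}) = \frac{f_{\mathbf{X}}(\mathbf{x} \mid \text{Versicolour})\,P(\text{Versicolour})}{D(\mathbf{x})}$$

$$P(\text{Virginica} \mid \mathbf{x}) = \frac{f_{\mathbf{X}}(\mathbf{x} \mid \text{Virginica})\,P(\text{Virginica})}{D(\mathbf{x})}$$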


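Naïve Bayesian Decision

Accuracy of Bayesian decision depends on $P(\mathbf{x} \mid C_i)$:

$$P(\mathbf{x} \mid C_i) = P(x_1, \ldots, x_n \mid C_i) = P(x_1 \mid C_i)\,P(x_2 \mid C_i, x_1)\,P(x_3 \mid C_i, x_1, x_2) \cdots$$

Assuming the features are independent given the class simplifies this to

$$P(\mathbf{x} \mid C_i) = P(x_1 \mid C_i)\,P(x_2 \mid C_i) \cdots P(x_n \mid C_i)$$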
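Under the independence assumption, each feature gets its own 1-D density and the class-conditional likelihood is their product. The following is a minimal sketch of a Gaussian naïve Bayesian classifier; the choice of Gaussian per-feature densities, the function names `fit` and `predict`, and the tiny two-feature training set are all assumptions made up for illustration, not data from the slides.

```python
import math
from collections import defaultdict

def gaussian_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def fit(samples):
    """samples: list of (feature_tuple, label). Returns {label: (prior, [(mu, sigma), ...])}."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):                    # one column per feature
            mu = sum(column) / len(column)
            var = sum((v - mu) ** 2 for v in column) / len(column)   # MLE variance
            stats.append((mu, math.sqrt(var)))
        model[label] = (len(rows) / len(samples), stats)
    return model

def predict(model, x):
    """Return {label: P(label | x)} assuming features are independent given the class."""
    scores = {}
    for label, (prior, stats) in model.items():
        # P(x | C_i) = product over features of P(x_k | C_i)
        likelihood = math.prod(gaussian_pdf(v, mu, s) for v, (mu, s) in zip(x, stats))
        scores[label] = prior * likelihood
    evidence = sum(scores.values())
    return {label: s / evidence for label, s in scores.items()}

# Made-up training data: two features, two classes.
train = [((1.0, 2.0), "a"), ((1.2, 1.8), "a"), ((0.8, 2.2), "a"),
         ((3.0, 0.5), "b"), ((3.2, 0.7), "b"), ((2.8, 0.3), "b")]

post = predict(fit(train), (1.1, 1.9))
print(max(post, key=post.get))
```

Only n means and n standard deviations per class are estimated here, instead of a full n-by-n covariance matrix, which is the practical benefit of the assumption.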


UCI Data


References

Textbooks
  R.D. Yates and D.J. Goodman, Probability and Stochastic Processes, 2nd ed., Wiley, 2005.
  Hong-Yeop Song (송홍엽) and Ha-Bong Chung (정하봉), Probability, Random Variables, and Random Processes (확률과 랜덤변수 및 랜덤과정), Kyobo Book Centre (교보문고), 2006. In Korean.
  R.E. Walpole et al., Probability and Statistics for Engineers and Scientists, 7th ed., Prentice Hall, 2002.
  W. Mendenhall, Probability and Statistics, 12th ed., Thomson Brooks/Cole, 2006.
  Yang-Woo Shin (신양우), Basic Probability Theory (기초확률론), Kyungmoon (경문사), 2000. In Korean.

http://www.ics.uci.edu/~mlearn/MLRepository.html