Introduction to Machine Learning - CmpE WEBethem/i2ml3e/3e_v1-0/i2ml3... · 2016-08-08 ·...


Page 1: Introduction to Machine Learning - CmpE WEBethem/i2ml3e/3e_v1-0/i2ml3... · 2016-08-08 · Introduction to Machine Learning Author: ethem Created Date: 7/8/2014 1:29:59 PM ...

Lecture Slides for

INTRODUCTION TO MACHINE LEARNING, 3RD EDITION

ETHEM ALPAYDIN

© The MIT Press, 2014

http://www.cmpe.boun.edu.tr/~ethem/i2ml3e


CHAPTER 3: BAYESIAN DECISION THEORY


Probability and Inference

Result of tossing a coin is ∈ {Heads, Tails}

Random variable X ∈ {1, 0}

Bernoulli: $P\{X\} = p_o^{X} (1 - p_o)^{1 - X}$, where $p_o = P\{X = 1\}$

Sample: $\mathcal{X} = \{x^t\}_{t=1}^{N}$

Estimation: $\hat{p}_o = \#\{\text{Heads}\} / \#\{\text{Tosses}\} = \sum_t x^t / N$

Prediction of next toss: Heads if $p_o > 1/2$, Tails otherwise
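The estimation and prediction steps for the coin example can be sketched in a few lines of Python. The function names and the sample of tosses are made up for illustration; only the formulas (fraction of heads, threshold at 1/2) come from the slide.

```python
def estimate_p(sample):
    """Estimate P(X = 1) as the fraction of ones in the sample."""
    if not sample:
        raise ValueError("sample must contain at least one toss")
    return sum(sample) / len(sample)


def predict_next(p_hat):
    """Predict Heads if the estimate exceeds 1/2, Tails otherwise."""
    return "Heads" if p_hat > 0.5 else "Tails"


# Hypothetical sample of N = 8 tosses (1 = Heads, 0 = Tails).
tosses = [1, 0, 1, 1, 0, 1, 1, 0]
p_hat = estimate_p(tosses)
print(p_hat, predict_next(p_hat))
```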


Classification

Credit scoring: Inputs are income and savings.

Output is low-risk vs high-risk

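Input: $\mathbf{x} = [x_1, x_2]^T$, Output: $C \in \{0, 1\}$

Prediction:

$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > 0.5 \\ C = 0 & \text{otherwise} \end{cases}$$

or

$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > P(C = 0 \mid x_1, x_2) \\ C = 0 & \text{otherwise} \end{cases}$$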


Bayes’ Rule

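$$P(C \mid \mathbf{x}) = \frac{P(C)\, p(\mathbf{x} \mid C)}{p(\mathbf{x})}$$

posterior = prior × likelihood / evidence

$$P(C = 0) + P(C = 1) = 1$$

$$p(\mathbf{x}) = p(\mathbf{x} \mid C = 1)\, P(C = 1) + p(\mathbf{x} \mid C = 0)\, P(C = 0)$$

$$P(C = 0 \mid \mathbf{x}) + P(C = 1 \mid \mathbf{x}) = 1$$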
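As a rough numerical illustration of Bayes' rule for two classes, the sketch below computes the posterior of C = 1 from a prior and two class likelihoods. The function name and the numbers are assumptions chosen for the example, not values from the slides.

```python
def posterior_c1(prior_c1, lik_c1, lik_c0):
    """Return P(C=1 | x) given P(C=1), p(x | C=1) and p(x | C=0)."""
    prior_c0 = 1.0 - prior_c1  # priors of the two classes sum to 1
    evidence = lik_c1 * prior_c1 + lik_c0 * prior_c0  # p(x)
    if evidence == 0:
        raise ValueError("x has zero probability under both classes")
    return lik_c1 * prior_c1 / evidence


# Hypothetical values: a rare class (prior 0.2) that explains x well.
p1 = posterior_c1(prior_c1=0.2, lik_c1=0.6, lik_c0=0.1)
print("P(C=1|x) =", p1, "-> choose", 1 if p1 > 0.5 else 0)
```

Here the likelihood outweighs the small prior, so the posterior of C = 1 ends up above 0.5.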


Bayes’ Rule: K>2 Classes

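$$P(C_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_i)\, P(C_i)}{p(\mathbf{x})} = \frac{p(\mathbf{x} \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(\mathbf{x} \mid C_k)\, P(C_k)}$$

$$P(C_i) \ge 0 \quad \text{and} \quad \sum_{i=1}^{K} P(C_i) = 1$$

$$\text{choose } C_i \text{ if } P(C_i \mid \mathbf{x}) = \max_k P(C_k \mid \mathbf{x})$$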
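The K-class case can be sketched the same way: normalize the products of likelihood and prior, then pick the class with the largest posterior. The helper names and the three-class numbers are hypothetical.

```python
def posteriors(priors, likelihoods):
    """Return [P(C_i | x)] from priors P(C_i) and likelihoods p(x | C_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # p(x) = sum_k p(x | C_k) P(C_k)
    if evidence == 0:
        raise ValueError("x has zero probability under every class")
    return [j / evidence for j in joint]


def choose_class(post):
    """Return the index of the class with the maximum posterior."""
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical three-class problem.
post = posteriors(priors=[0.5, 0.3, 0.2], likelihoods=[0.1, 0.4, 0.5])
print(post, "-> choose class index", choose_class(post))
```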


Losses and Risks

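Actions: $\alpha_i$

Loss of $\alpha_i$ when the state is $C_k$: $\lambda_{ik}$

Expected risk (Duda and Hart, 1973)

$$R(\alpha_i \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid \mathbf{x})$$

$$\text{choose } \alpha_i \text{ if } R(\alpha_i \mid \mathbf{x}) = \min_k R(\alpha_k \mid \mathbf{x})$$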
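To make the expected-risk rule concrete, the following sketch evaluates the risk of each action from a loss matrix and a posterior vector, then takes the minimum. The loss values are invented: they assume that wrongly accepting a high-risk case costs ten times more than wrongly refusing a low-risk one.

```python
def expected_risks(loss, post):
    """Return [R(alpha_i | x)], where loss[i][k] is the loss of action i in state k."""
    return [sum(l_ik * p_k for l_ik, p_k in zip(row, post)) for row in loss]


def best_action(risks):
    """Return the index of the action with minimum expected risk."""
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical two-class example: state 0 = low-risk, state 1 = high-risk.
loss = [[0, 10],   # action 0 (accept): costly if the true state is high-risk
        [1, 0]]    # action 1 (refuse): mildly costly if the true state is low-risk
post = [0.7, 0.3]  # P(C_0 | x), P(C_1 | x)
risks = expected_risks(loss, post)
print(risks, "-> choose action", best_action(risks))
```

With these numbers the minimum-risk action is to refuse, even though the low-risk class is the more probable one, which is the point of using unequal losses.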


Losses and Risks: 0/1 Loss

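$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \ne k \end{cases}$$

$$R(\alpha_i \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid \mathbf{x}) = \sum_{k \ne i} P(C_k \mid \mathbf{x}) = 1 - P(C_i \mid \mathbf{x})$$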

For minimum risk, choose the most probable class


Losses and Risks: Reject

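$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ \lambda & \text{if } i = K + 1,\ 0 < \lambda < 1 \\ 1 & \text{otherwise} \end{cases}$$

$$R(\alpha_{K+1} \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda\, P(C_k \mid \mathbf{x}) = \lambda$$

$$R(\alpha_i \mid \mathbf{x}) = \sum_{k \ne i} P(C_k \mid \mathbf{x}) = 1 - P(C_i \mid \mathbf{x})$$

$$\text{choose } C_i \text{ if } P(C_i \mid \mathbf{x}) > P(C_k \mid \mathbf{x})\ \forall k \ne i \text{ and } P(C_i \mid \mathbf{x}) > 1 - \lambda; \text{ reject otherwise}$$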
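A minimal sketch of the reject rule follows, assuming the posteriors are already available as a list. The function name is hypothetical, and returning `None` to signal a reject is a choice made for this example.

```python
def classify_with_reject(post, reject_loss):
    """Return the chosen class index, or None to reject.

    post        -- posteriors P(C_k | x) for k = 0..K-1
    reject_loss -- the loss lambda of rejecting, with 0 < lambda < 1
    """
    if not 0 < reject_loss < 1:
        raise ValueError("reject_loss must lie strictly between 0 and 1")
    top = max(post)
    winners = [i for i, p in enumerate(post) if p == top]
    # Choose only when one class is strictly most probable and its
    # posterior exceeds 1 - lambda; otherwise rejecting has lower risk.
    if len(winners) == 1 and top > 1 - reject_loss:
        return winners[0]
    return None


print(classify_with_reject([0.9, 0.1], reject_loss=0.2))  # confident case
print(classify_with_reject([0.6, 0.4], reject_loss=0.2))  # uncertain case
```

A smaller reject loss raises the threshold 1 - lambda, so the classifier rejects more often.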


Different Losses and Reject

[Figure with three panels: Equal losses; Unequal losses; With reject]


Discriminant Functions

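Discriminant functions $g_i(\mathbf{x})$, $i = 1, \ldots, K$:

$$\text{choose } C_i \text{ if } g_i(\mathbf{x}) = \max_k g_k(\mathbf{x})$$

$$g_i(\mathbf{x}) = \begin{cases} -R(\alpha_i \mid \mathbf{x}) \\ P(C_i \mid \mathbf{x}) \\ p(\mathbf{x} \mid C_i)\, P(C_i) \end{cases}$$

K decision regions $\mathcal{R}_1, \ldots, \mathcal{R}_K$:

$$\mathcal{R}_i = \{\mathbf{x} \mid g_i(\mathbf{x}) = \max_k g_k(\mathbf{x})\}$$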


K=2 Classes

Dichotomizer (K=2) vs Polychotomizer (K>2)

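$$g(\mathbf{x}) = g_1(\mathbf{x}) - g_2(\mathbf{x})$$

$$\text{choose } \begin{cases} C_1 & \text{if } g(\mathbf{x}) > 0 \\ C_2 & \text{otherwise} \end{cases}$$

Log odds:

$$\log \frac{P(C_1 \mid \mathbf{x})}{P(C_2 \mid \mathbf{x})}$$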
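For two classes, the log odds can serve as the single discriminant g(x): it is positive exactly when the posterior of C1 exceeds that of C2. The sketch below assumes the posterior P(C1|x) is given and uses hypothetical function names.

```python
import math


def log_odds(p_c1):
    """Return log(P(C1|x) / P(C2|x)), using P(C2|x) = 1 - P(C1|x)."""
    if not 0 < p_c1 < 1:
        raise ValueError("posterior must lie strictly between 0 and 1")
    return math.log(p_c1 / (1 - p_c1))


def choose(p_c1):
    """Choose C1 if the log odds are positive, C2 otherwise."""
    return "C1" if log_odds(p_c1) > 0 else "C2"


for p in (0.8, 0.5, 0.3):
    print(p, log_odds(p), choose(p))
```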


Utility Theory

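Probability of state k given evidence $\mathbf{x}$: $P(S_k \mid \mathbf{x})$

Utility of $\alpha_i$ when state is k: $U_{ik}$

Expected utility:

$$EU(\alpha_i \mid \mathbf{x}) = \sum_k U_{ik}\, P(S_k \mid \mathbf{x})$$

$$\text{choose } \alpha_i \text{ if } EU(\alpha_i \mid \mathbf{x}) = \max_j EU(\alpha_j \mid \mathbf{x})$$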


Association Rules

Association rule: X → Y

People who buy/click/visit/enjoy X are also likely to buy/click/visit/enjoy Y.

A rule implies association, not necessarily causation.


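Association measures

Support (X → Y):

$$P(X, Y) = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers}\}}$$

Confidence (X → Y):

$$P(Y \mid X) = \frac{P(X, Y)}{P(X)} = \frac{\#\{\text{customers who bought } X \text{ and } Y\}}{\#\{\text{customers who bought } X\}}$$

Lift (X → Y):

$$\frac{P(X, Y)}{P(X)\, P(Y)} = \frac{P(Y \mid X)}{P(Y)}$$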
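The three measures can be computed directly from counts over a list of baskets. The sketch below assumes each basket is a Python set of item names; the function name and the five baskets are invented for illustration.

```python
def association_measures(baskets, x, y):
    """Return (support, confidence, lift) of the rule x -> y."""
    n = len(baskets)
    n_x = sum(1 for b in baskets if x in b)
    n_y = sum(1 for b in baskets if y in b)
    n_xy = sum(1 for b in baskets if x in b and y in b)
    if n == 0 or n_x == 0 or n_y == 0:
        raise ValueError("both items must occur in at least one basket")
    support = n_xy / n             # P(X, Y)
    confidence = n_xy / n_x        # P(Y | X)
    lift = confidence / (n_y / n)  # P(Y | X) / P(Y)
    return support, confidence, lift


# Hypothetical purchase data.
baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"},
           {"milk"}, {"bread"}, {"eggs"}]
print(association_measures(baskets, "milk", "bread"))
```

A lift above 1 suggests X makes Y more likely than its base rate, a lift below 1 suggests the opposite, and a lift near 1 suggests the two are close to independent.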


Example


Apriori algorithm (Agrawal et al., 1996)

For (X,Y,Z), a 3-item set, to be frequent (have enough support), (X,Y), (X,Z), and (Y,Z) should be frequent.

If (X,Y) is not frequent, none of its supersets can be frequent.

Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, ... and X → Y, Z, ...
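The level-wise search for frequent item sets can be sketched as follows. This is a simplified illustration of the pruning idea, not the published algorithm in full: the function name, the `min_support` parameter, and the sample baskets are assumptions, and the final step of turning item sets into rules is left out.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for every item set with support >= min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join: unions of two frequent (k-1)-item sets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop a candidate if any of its (k-1)-item subsets is not frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical baskets over items W, X, Y, Z.
baskets = [{"X", "Y", "Z"}, {"X", "Y", "Z"}, {"X", "Y"},
           {"X", "Z"}, {"Y", "Z"}, {"W"}]
for itemset, s in sorted(frequent_itemsets(baskets, 0.3).items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 3))
```

The prune step is where the slide's observation is used: a candidate is only counted against the data if all of its smaller subsets were already found frequent.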