Pattern Recognition : Feature Representation x Decision Function d(x)


Page 1: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

Ch1 – Concept Map

Pattern Recognition : Feature Representation x, Decision Function d(x)
- Geometric Interpretation : Decision Boundary (Surface)
- Conventional :
  - Statistical Formulation – Bayes : Optimal
  - Syntactic Approach
- Neural Network :
  - McCulloch–Pitts Neuronal Model : Threshold Logic → Digital Logic
  - Sigmoid Neuronal Model : Chap 3
  - Single Layer Network
  - Multilayer Network → Geometric interpretation (Surface) → Can represent any region in space


Chapter 1. Pattern Recognition and Neural Networks

1. Pattern Recognition

Two Objectives
- Class Recognition : Image → Apple
- Attribute Recognition : Image → Color, Shape, Taste, etc.
  Ex. The color (attribute) of an apple is red.

(1) Approaches
a. Template Matching
b. Statistical
c. Syntactic
d. Neural Network

Statistical

[Figure: posteriors P(ω1|x) and P(ω2|x) plotted over the feature x; Class 1 is decided where P(ω1|x) is larger, Class 2 where P(ω2|x) is larger, and the crossing point gives the Bayes optimal decision boundary.]

Ex. x = temperature; ω1 : healthy, ω2 : sick.
Ex. x = (height, weight); ω1 : female, ω2 : male.
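The Bayes rule for the temperature example above can be sketched numerically. This is a minimal illustration, not part of the original slides: the Gaussian class-conditional densities, their means/spreads, and the equal priors are all invented.

```python
import math

# Hypothetical class-conditional densities p(x|w1), p(x|w2) for the
# temperature example: w1 = healthy, w2 = sick (all numbers invented).
def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_decide(x, priors=(0.5, 0.5)):
    # Decide the class with the larger posterior: P(wi|x) is proportional
    # to p(x|wi) * P(wi), so comparing these products is enough.
    p1 = gaussian(x, mu=36.8, sigma=0.4) * priors[0]  # healthy
    p2 = gaussian(x, mu=38.5, sigma=0.8) * priors[1]  # sick
    return 1 if p1 >= p2 else 2

print(bayes_decide(36.9))  # near the healthy mean -> 1
print(bayes_decide(39.0))  # near the sick mean -> 2
```

The decision boundary is the temperature where the two products are equal; changing the priors shifts that crossing point.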

Page 9: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

(2) Procedure - Train and Generalize

Raw Data → Preprocessing → Feature Extraction → Discriminant d(x) → Decision Making → Class
- Preprocessing : eliminate bad data (outliers), filter out noise
- Feature Extraction : for data reduction, better separation

Training Data = Labelled Input/Output Data = { x | d(x) is known }
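The train-and-generalize procedure above can be sketched end to end. Every concrete choice here (the valid-range outlier test, the two features, the discriminant weights) is invented for illustration:

```python
# Minimal sketch of the raw-data -> preprocessing -> feature extraction
# -> discriminant -> decision pipeline. All concrete choices are invented.

def preprocess(raw):
    # Eliminate bad data (outliers) and filter noise: here we simply drop
    # readings outside a plausible range.
    return [r for r in raw if 0.0 < r < 100.0]

def extract_features(clean):
    # Reduce the data to a small feature vector x for better separation:
    # here, the mean and the spread of the cleaned readings.
    mean = sum(clean) / len(clean)
    spread = max(clean) - min(clean)
    return (mean, spread)

def discriminant(x, w=(1.0, -0.5, -20.0)):
    # Linear discriminant d(x) = w1*x1 + w2*x2 + w3 (weights invented).
    return w[0] * x[0] + w[1] * x[1] + w[2]

def classify(raw):
    x = extract_features(preprocess(raw))
    return 1 if discriminant(x) > 0 else 2

print(classify([25.0, 26.0, 24.0, 999.0]))  # the outlier 999.0 is removed first
```

In a trained system the weights of `discriminant` would be learned from the labelled training data; here they are fixed by hand.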


(3) Decision ( Discriminant ) Function

a. 2-Class Weather Forecast Problem (n = 2, M = 2); x1 = temperature, x2 = pressure

d(x) = w1 x1 + w2 x2 + w3  (unnormalized)

Normalized: d(x) = (w1 x1 + w2 x2 + w3) / sqrt(w1^2 + w2^2)

Decision rule: d(x) > 0 → x ∈ ω1; d(x) < 0 → x ∈ ω2

Augmented (vector) form: d(x) = w~T x~, where w~ = (w1, w2, w3)T and x~ = (x1, x2, 1)T.

The decision boundary d(x) = 0 is (n−1)-dimensional: a line, plane, or hyperplane.
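The augmented-vector form of the 2-class discriminant can be checked numerically. The weight values below are invented for illustration:

```python
# Augmented-vector form of the 2-class linear discriminant:
# d(x) = w~^T x~ with x~ = (x1, x2, 1). Weights are invented.
w_tilde = [2.0, -1.0, 3.0]          # (w1, w2, w3)

def d(x1, x2):
    x_tilde = [x1, x2, 1.0]          # append the constant 1 for the bias term
    return sum(wi * xi for wi, xi in zip(w_tilde, x_tilde))

def decide(x1, x2):
    # d(x) > 0 -> class w1, d(x) < 0 -> class w2
    return 1 if d(x1, x2) > 0 else 2

print(decide(1.0, 2.0))   # d = 2 - 2 + 3 = 3 > 0  -> class 1
print(decide(-3.0, 1.0))  # d = -6 - 1 + 3 = -4 < 0 -> class 2
```

Folding the bias w3 into the weight vector this way lets a single dot product implement the whole discriminant.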

Page 12: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

In general, for n features, w = (w1, w2, ..., wn)T, x = (x1, x2, ..., xn)T, and

d(x) = woT x − D,

where wo = w / ||w|| is the unit normal to the hyperplane woT x = D, and D is the distance of the hyperplane from the origin:
- woT x − D > 0 : x on the positive side of the hyperplane
- woT x − D = 0 : x on the hyperplane
- woT x − D < 0 : x on the negative side

wo is a unit normal to the Hyperplane.
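Because wo is a unit normal, d(x) = woT x − D is the signed distance of x from the hyperplane. A small check, with an invented hyperplane 3 x1 + 4 x2 = 5:

```python
import math

# Signed distance from a point to the hyperplane w^T x = b, written with
# the unit normal w_o = w/||w|| and origin distance D = b/||w||.
w, b = [3.0, 4.0], 5.0                       # illustrative: 3*x1 + 4*x2 = 5

norm = math.sqrt(sum(wi * wi for wi in w))   # ||w|| = 5
w_o = [wi / norm for wi in w]                # unit normal (0.6, 0.8)
D = b / norm                                 # distance from origin = 1

def d(x):
    return sum(woi * xi for woi, xi in zip(w_o, x)) - D

print(d([3.0, 4.0]))   # point 5 units out along w_o: 5 - 1 = 4.0
print(d([0.6, 0.8]))   # point on the plane: 0 up to rounding error
```

The sign of d(x) tells which side of the plane x lies on, and |d(x)| how far away it is.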

Page 13: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

b. Case of M = 3, n = 2 – requires 3 discriminants

Pairwise Separable – C(M, 2) = M(M−1)/2 discriminants dij(x) are required, with dij(x) > 0 on the class-i side of the boundary between classes i and j (and dji = −dij):

Class 1 : d12 > 0, d31 < 0
Class 2 : d12 < 0, d23 > 0
Class 3 : d23 < 0, d31 > 0

Points whose sign pattern matches no class form an indeterminate region (IR).

Page 14: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

Alternatively, use M discriminants di(x), one per class, and define the pairwise discriminants as differences:

d12 = d1 − d2, d23 = d2 − d3, d31 = d3 − d1

Decide class i when di(x) is the maximum, i.e., di(x) > dj(x) for all j ≠ i; with the maximum rule the indeterminate region disappears.

Linear classifier machine: gi(x) = p(x|ωi), realized linearly as gi(x) = wiT x; assign x to class c when gc(x) = max i gi(x).
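The maximum-selector linear machine can be sketched for M = 3. The three augmented weight vectors below are invented for illustration:

```python
# Maximum-selector linear machine for M = 3 classes: one linear
# discriminant g_i(x) = w_i^T x~ per class (x~ augmented with 1),
# decide the class whose g_i(x) is largest. Weights are invented.
W = [
    [1.0, 0.0, 0.0],    # w_1
    [0.0, 1.0, 0.0],    # w_2
    [-1.0, -1.0, 0.5],  # w_3
]

def classify(x1, x2):
    x_aug = [x1, x2, 1.0]
    scores = [sum(wi * xi for wi, xi in zip(w, x_aug)) for w in W]
    return 1 + scores.index(max(scores))   # class of the maximum discriminant

print(classify(2.0, 0.5))    # scores (2.0, 0.5, -2.0) -> class 1
print(classify(-1.0, -1.0))  # scores (-1.0, -1.0, 2.5) -> class 3
```

Every input falls in exactly one region (ties aside), which is why the maximum rule has no indeterminate region.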

Page 15: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

[Figure: decision regions for the three-class example, bounded by curves such as d1 = 0 and d2 − d3 = 0, with the discriminants d1, d2, d3 labelled; IR marks an indeterminate region.]

Page 16: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

2. PR – Neural Network Representation

(1) Models of a Neuron

A. McCulloch-Pitts Neuron – threshold logic with fixed weights

[Figure: inputs x1, x2, ..., xp with weights w1, w2, ..., wp, plus a constant input 1 for the bias, feed an adaptive linear combiner (Adaline); a nonlinear activation function then produces the output y.]

y = 1 if Σi wi xi ≥ θ, 0 otherwise

Equivalently, y = f(d(x)) with d(x) = Σi wi xi (the threshold θ can be absorbed as a bias weight).
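The McCulloch–Pitts unit above is a few lines of code. As a check, the weights w1 = w2 = 1 with threshold θ = 1.5 realize AND, matching the Boolean examples later in the chapter:

```python
# McCulloch-Pitts neuron: fixed weights, hard threshold.
# y = 1 if sum_i w_i * x_i >= theta, else 0.
def mp_neuron(x, w, theta):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

# Realizing AND with w1 = w2 = 1 and theta = 1.5:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mp_neuron(x, w=(1, 1), theta=1.5))
```

Only the (1, 1) input reaches the threshold, so the unit fires exactly on x1 AND x2.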

Page 17: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

B. Generalized Model

The activation function may be:
- One-sided (binary output) : hard limiter (threshold logic), ramp, logistic
- Two-sided (bipolar output) : signum (threshold logic), piecewise-linear ramp, tanh

Half-Plane Detector : a single unit with inputs x1, x2, weights w1, w2, and bias θ fires when w1 x1 + w2 x2 − θ ≥ 0, i.e., on one side of the line w1 x1 + w2 x2 = θ.

Page 18: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

(2) Boolean Function Representation Examples

x1, x2 = binary (0, 1). Each gate fires when wT x ≥ θ:

- AND : w1 = w2 = 1, θ = 1.5
- OR : w1 = w2 = 1, θ = 0.5
- NAND : w1 = w2 = −1, θ = −1.5
- NOR : w1 = w2 = −1, θ = −0.5
- INVERTER : w1 = −1, θ = −0.5
- MEMORY : self-excitatory weight 1 with θ = 0.5, plus an inhibitory input (weight −1)

Cf. x1, x2 may be bipolar (−1, 1) → different biases will be needed above.
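The binary gate realizations above can be verified by enumerating the truth tables (listed over inputs 00, 01, 10, 11):

```python
# Check the threshold-gate realizations with binary inputs (0, 1):
# a unit fires (outputs 1) when w^T x >= theta.
def fires(x, w, theta):
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

gates = {
    "AND":  ((1, 1), 1.5),
    "OR":   ((1, 1), 0.5),
    "NAND": ((-1, -1), -1.5),
    "NOR":  ((-1, -1), -0.5),
}
for name, (w, theta) in gates.items():
    table = [fires((a, b), w, theta) for a in (0, 1) for b in (0, 1)]
    print(name, table)   # truth table over inputs 00, 01, 10, 11
```

Running this prints AND [0, 0, 0, 1], OR [0, 1, 1, 1], NAND [1, 1, 1, 0], NOR [1, 0, 0, 0].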

Page 19: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

(3) Geometrical Interpretation

A. Single Layer with Single Output – fires within a half plane, for a 2-class case.

B. Single Layer with Multiple Outputs – for a multi-class case: one output per class, firing as (1, 0, 0), (0, 1, 0), (0, 0, 1) for classes 1, 2, 3.

Page 20: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

C. Multilayer with Single Output – XOR

x1 XOR x2 is linearly nonseparable but nonlinearly separable: a hidden layer remaps the inputs so that the ON and OFF corners of the input square become separable.

→ Bipolar inputs assumed; other weights are needed for a binary representation.

Page 21: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

a. Successive Transforms

XOR = (x1 + x2)(x1 x2)′, i.e., (x1 OR x2) AND (x1 NAND x2).

Hidden layer: an OR unit (weights 1, 1; θ = 0.5) and a NAND unit (weights −1, −1; θ = −1.5). Output layer: an AND unit (weights 1, 1; θ = 1.5) combining the two hidden outputs.
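The OR/NAND/AND construction above can be checked directly on binary inputs:

```python
# XOR as a two-layer net: hidden OR and NAND units feeding an output AND,
# matching XOR = (x1 OR x2) AND (x1 NAND x2). Binary inputs (0, 1).
def fires(x, w, theta):
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

def xor(x1, x2):
    h_or   = fires((x1, x2), (1, 1), 0.5)      # OR unit
    h_nand = fires((x1, x2), (-1, -1), -1.5)   # NAND unit
    return fires((h_or, h_nand), (1, 1), 1.5)  # AND of the two hidden units

print([xor(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```

The hidden layer maps the four corners of the input square to points that a single threshold unit can separate, which is exactly why one layer alone cannot do XOR.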

Page 22: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

b. XOR = OR − (1,1) AND

Compute OR(x1, x2) (θ = 0.5) and AND(x1, x2) (θ = 1.5), and feed both to an output unit with weights 1 and −2 and threshold 0.5. The output fires exactly when OR is on and AND is off, i.e., XOR.

c. Parity

An n-bit parity network uses n hidden threshold units with thresholds 0.5, 1.5, ..., n − 0.5 (each connected to all n inputs with weight 1) and output weights alternating 1, −1, 1, ..., (−1)^(n+1), with output threshold 0.5. XOR is the n = 2 case.
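The parity construction in c. can be checked for small n. The only implementation assumption here is indexing the hidden units so that unit k has threshold k + 0.5:

```python
# n-bit parity with n hidden threshold units (thresholds 0.5, 1.5, ...,
# n - 0.5, all input weights 1) and alternating output weights
# 1, -1, 1, ..., (-1)^(n+1); output threshold 0.5.
def fires(s, theta):
    return int(s >= theta)

def parity(bits):
    n = len(bits)
    s = sum(bits)
    # Hidden unit k fires iff at least k+1 inputs are on.
    hidden = [fires(s, k + 0.5) for k in range(n)]
    # Alternating-sign sum: 1 if an odd number of inputs are on, else 0.
    out = sum(((-1) ** k) * h for k, h in enumerate(hidden))
    return fires(out, 0.5)

print([parity([a, b, c]) for a in (0, 1) for b in (0, 1) for c in (0, 1)])
```

If m inputs are on, the first m hidden units fire and the alternating sum telescopes to 1 when m is odd and 0 when m is even, so the output is the odd-parity bit.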

Page 23: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

D. Multilayer with Single Output – Analog Inputs (1/2)

[Figure: three half-plane detectors (1, 2, 3) combined by AND and OR units; an AND of the detectors yields a convex region, and an OR of such regions yields their union.]

Page 24: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

E. Multilayer with Single Output – Analog Inputs (2/2)

[Figure: six half-plane detectors (1–6); two AND units each intersect a subset of the half planes into a convex region, and an OR unit forms the union of the two regions.]


MLP Decision Boundaries

[Figure: three test problems – XOR, intertwined classes A/B, and a general region – with the boundaries each network depth can form.]

- 1-layer: half planes
- 2-layer: convex regions
- 3-layer: arbitrary regions
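The "2-layer: convex" row can be illustrated concretely: a hidden layer of half-plane detectors followed by an output AND carves out a convex region. The region below (the unit square) and all weights are invented for illustration:

```python
# A 2-layer net detecting a convex region: four half-plane detectors
# in the hidden layer, AND-ed by the output unit. Weights are invented;
# the region is the unit square 0 <= x1 <= 1, 0 <= x2 <= 1.
def fires(x, w, theta):
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

half_planes = [
    ((1, 0), 0.0),    # x1 >= 0
    ((-1, 0), -1.0),  # x1 <= 1
    ((0, 1), 0.0),    # x2 >= 0
    ((0, -1), -1.0),  # x2 <= 1
]

def in_region(x):
    hidden = [fires(x, w, t) for w, t in half_planes]
    return fires(hidden, (1, 1, 1, 1), 3.5)   # AND of the four detectors

print(in_region((0.5, 0.5)))  # inside the square -> 1
print(in_region((2.0, 0.5)))  # outside -> 0
```

Adding a third layer that ORs several such AND units together yields unions of convex regions, which is how arbitrary regions are approximated.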

Page 29: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

Exercise : Transform the NN from ① to ② : see how the weights are changed.

[Figure: the regions realized by networks ① and ② over the unit square in (x1, x2), and over the interval 0 ≤ x ≤ 3.]

Page 31: Pattern Recognition :   Feature Representation  x   Decision Function          d(x)

Questions from Students - 05

- How do we learn the weights? Is there an analytic, systematic methodology for finding them?
- Why do we use polygons to represent the active regions [1 output]?
- Why should di(x) be the maximum for class i?
