Page 1
Source: ml.informatik.uni-freiburg.de/former/_media/teaching/ss...

Machine Learning: Exercise Sheet 5

Manuel Blum, AG Maschinelles Lernen und Natürlichsprachliche Systeme

Albert-Ludwigs-Universität Freiburg

[email protected]

Manuel Blum, Machine Learning Lab, University of Freiburg

Page 2

Exercise 1: k-Nearest Neighbor Classifier

A medical expert is going to build a case-based reasoning system for diagnosis tasks. Cases correspond to individual persons: the problem part of a case comprises a number of features describing possible symptoms, and the solution part represents the diagnosis (classification of disease). The case base contains the seven cases provided in the table below.

Training | Fever   | Vomiting | Diarrhea | Shivering | Classification
c1       | no      | no       | no       | no        | healthy (H)
c2       | average | no       | no       | no        | influenza (I)
c3       | high    | no       | no       | yes       | influenza (I)
c4       | high    | yes      | yes      | no        | salmonella poisoning (S)
c5       | average | no       | yes      | no        | salmonella poisoning (S)
c6       | no      | yes      | yes      | no        | bowel inflammation (B)
c7       | average | yes      | yes      | no        | bowel inflammation (B)

Page 3

Exercise 1: k-Nearest Neighbor Classifier

Moreover, the expert has specified a similarity measure reflecting his expertise, using local similarity measures and feature weights as specified in the figure below.

(a) Calculate the similarity between all cases from the case base and the query q = (high, no, no, no).

Page 4

Exercise 1: k-Nearest Neighbor Classifier

(a) Calculate the similarity between all cases from the case base and the query q = (high, no, no, no).

Case Representation

- Here, each case ci = (p, s) consists of a problem part p and a solution part s.

- An attribute-value-based case representation is used, with p = (pF, pV, pD, pSh), where pF ∈ {no, average, high} and pV, pD, pSh ∈ {yes, no}. For the solution part it holds that s ∈ {H, I, S, B} (classification with 4 classes).

- Thus, q = (qF, qV, qD, qSh) = (high, no, no, no).

Page 5

Exercise 1: k-Nearest Neighbor Classifier

(a) Calculate the similarity between all cases from the case base and the queryq = (high, no, no, no).

Similarity Assessment

- The similarity between q and all c ∈ CB = {ci | i = 1, …, 7} must be determined.

- It holds for all c ∈ CB:

  Sim(q, c) = ( Σ_{a ∈ {F,V,D,Sh}} w_a · sim_a(q_a, c.p_a) ) / ( Σ_{a ∈ {F,V,D,Sh}} w_a )

Page 6

Exercise 1: k-Nearest Neighbor Classifier

(a) Calculate the similarity between all cases from the case base and the queryq = (high, no, no, no).

Similarity Assessment

- Note that the weights here have already been normalized (they sum to 1), which is why the expression for the global weighted similarity simplifies to

  Sim(q, c) = Σ_{a ∈ {F,V,D,Sh}} w_a · sim_a(q_a, c.p_a)

Page 7

Exercise 1: k-Nearest Neighbor Classifier

(a) Calculate the similarity between all cases from the case base and the query q = (high, no, no, no).

- For c1 = ((no, no, no, no), H): Sim(q, c1) = 0.3 · 0.0 + 0.2 · 1.0 + 0.2 · 1.0 + 0.3 · 1.0 = 0.70
- For c2 = ((average, no, no, no), I): Sim(q, c2) = 0.3 · 0.3 + 0.2 · 1.0 + 0.2 · 1.0 + 0.3 · 1.0 = 0.79
- For c3 = ((high, no, no, yes), I): Sim(q, c3) = 0.3 · 1.0 + 0.2 · 1.0 + 0.2 · 1.0 + 0.3 · 0.2 = 0.76
- For c4 = ((high, yes, yes, no), S): Sim(q, c4) = 0.3 · 1.0 + 0.2 · 0.2 + 0.2 · 0.2 + 0.3 · 1.0 = 0.68
- For c5 = ((average, no, yes, no), S): Sim(q, c5) = 0.3 · 0.3 + 0.2 · 1.0 + 0.2 · 0.2 + 0.3 · 1.0 = 0.63
- For c6 = ((no, yes, yes, no), B): Sim(q, c6) = 0.3 · 0.0 + 0.2 · 0.2 + 0.2 · 0.2 + 0.3 · 1.0 = 0.38
- For c7 = ((average, yes, yes, no), B): Sim(q, c7) = 0.3 · 0.3 + 0.2 · 0.2 + 0.2 · 0.2 + 0.3 · 1.0 = 0.47

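The computations above can be reproduced in a short script. The local similarity values below are reconstructed from the worked numbers on the slides (the figure that specifies them is not part of this transcript), so treat them as assumptions; a minimal sketch:

```python
# Case base from the table; solution parts: H, I, S, B.
CASES = {
    "c1": ({"F": "no",      "V": "no",  "D": "no",  "Sh": "no"},  "H"),
    "c2": ({"F": "average", "V": "no",  "D": "no",  "Sh": "no"},  "I"),
    "c3": ({"F": "high",    "V": "no",  "D": "no",  "Sh": "yes"}, "I"),
    "c4": ({"F": "high",    "V": "yes", "D": "yes", "Sh": "no"},  "S"),
    "c5": ({"F": "average", "V": "no",  "D": "yes", "Sh": "no"},  "S"),
    "c6": ({"F": "no",      "V": "yes", "D": "yes", "Sh": "no"},  "B"),
    "c7": ({"F": "average", "V": "yes", "D": "yes", "Sh": "no"},  "B"),
}
WEIGHTS = {"F": 0.3, "V": 0.2, "D": 0.2, "Sh": 0.3}

def sim_fever(q, c):
    # reconstructed: high vs. average -> 0.3, high vs. no -> 0.0
    return 1.0 if q == c else {("high", "average"): 0.3}.get((q, c), 0.0)

def sim_binary(q, c):
    # reconstructed and asymmetric: query "no" vs. case "yes" -> 0.2,
    # query "yes" vs. case "no" -> 0.0
    if q == c:
        return 1.0
    return 0.2 if q == "no" else 0.0

LOCAL = {"F": sim_fever, "V": sim_binary, "D": sim_binary, "Sh": sim_binary}

def similarity(query, problem):
    # weighted average over the attributes that are known in the query
    known = [a for a in WEIGHTS if query[a] is not None]
    num = sum(WEIGHTS[a] * LOCAL[a](query[a], problem[a]) for a in known)
    return num / sum(WEIGHTS[a] for a in known)

q = {"F": "high", "V": "no", "D": "no", "Sh": "no"}
sims = {name: round(similarity(q, p), 2) for name, (p, _) in CASES.items()}
```

Since all attributes are known here and the weights sum to 1, the denominator is 1; it becomes relevant in task (b), where an attribute is missing.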

Page 10

Exercise 1: k-Nearest Neighbor Classifier

(b) Calculate the similarity between all cases from the case base and the query q = (?, yes, no, yes), where the question mark indicates that the value of the symptom Fever has not been determined yet.

- The calculation of Sim(q, c) now disregards the unknown attribute Fever, giving rise to

  Sim(q, c) = ( Σ_{a ∈ {V,D,Sh}} w_a · sim_a(q_a, c.p_a) ) / ( Σ_{a ∈ {V,D,Sh}} w_a )

Page 11

Exercise 1: k-Nearest Neighbor Classifier

We obtain for q = (?, yes, no, yes) (with renormalized weights w_V = w_D = 0.2/0.7 = 2/7 and w_Sh = 0.3/0.7 = 3/7):

- For c1 = ((no, no, no, no), H): Sim(q, c1) = 2/7 · 0.0 + 2/7 · 1.0 + 3/7 · 0.0 = 2/7 ≈ 0.29
- For c2 = ((average, no, no, no), I): Sim(q, c2) = 2/7 · 0.0 + 2/7 · 1.0 + 3/7 · 0.0 = 2/7 ≈ 0.29
- For c3 = ((high, no, no, yes), I): Sim(q, c3) = 2/7 · 0.0 + 2/7 · 1.0 + 3/7 · 1.0 = 5/7 ≈ 0.71
- For c4 = ((high, yes, yes, no), S): Sim(q, c4) = 2/7 · 1.0 + 2/7 · 0.2 + 3/7 · 0.0 ≈ 0.34
- For c5 = ((average, no, yes, no), S): Sim(q, c5) = 2/7 · 0.0 + 2/7 · 0.2 + 3/7 · 0.0 ≈ 0.06
- For c6 = ((no, yes, yes, no), B): Sim(q, c6) = 2/7 · 1.0 + 2/7 · 0.2 + 3/7 · 0.0 ≈ 0.34
- For c7 = ((average, yes, yes, no), B): Sim(q, c7) = 2/7 · 1.0 + 2/7 · 0.2 + 3/7 · 0.0 ≈ 0.34

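The renormalized weights follow directly from dropping Fever and dividing by the remaining weight mass; a quick numerical check:

```python
weights = {"F": 0.3, "V": 0.2, "D": 0.2, "Sh": 0.3}
known = [a for a in weights if a != "F"]          # Fever is unknown
total = sum(weights[a] for a in known)            # 0.7
renorm = {a: weights[a] / total for a in known}   # V, D -> 2/7; Sh -> 3/7

# e.g. case c3 = (high, no, no, yes) against q = (?, yes, no, yes):
# V mismatches (0.0), D matches (1.0), Sh matches (1.0)
sim_c3 = renorm["V"] * 0.0 + renorm["D"] * 1.0 + renorm["Sh"] * 1.0
```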

Page 14

Exercise 1: k-Nearest Neighbor Classifier

(c) Determine the nearest neighbors in (a) and (b) as well as the diagnosis the system outputs when using the k-NN method with k = 1.

- In task (a), the nearest neighbor is case c2 with Sim(q, c2) = 0.79.
- The corresponding diagnosis of the system is that the query person suffers from influenza.
- In task (b), where the value of the Fever attribute was not known, case c3 achieves maximal similarity (Sim(q, c3) ≈ 0.71).
- Here, the system's diagnosis is influenza as well.
- Of course, the absence of one (or more) attribute values reduces the reliability of the system's predictions.

Page 15

Exercise 2: Gradient Descent

(a) Execute four iterations of gradient descent with momentum to find the minimum of the function f(u) = u³/3 + 50u² − 100u − 30. Start with u = 20, use a learning rate ε = 0.01, and set the momentum parameter µ to 0.1.

- We know that ∂f(u)/∂u = u² + 100u − 100.

1: choose an initial point u
2: set Δ ← 0 (step width)
3: while ||grad f(u)|| is not close to 0 do
4:   Δ ← −ε · grad f(u) + µ · Δ
5:   u ← u + Δ
6: end while
7: return u
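The update rule above translates directly into code. A minimal sketch that tracks both the momentum run and a plain (vanilla, µ = 0) run for comparison:

```python
def grad_f(u):
    # derivative of f(u) = u**3/3 + 50*u**2 - 100*u - 30
    return u**2 + 100*u - 100

eps, mu = 0.01, 0.1
u, delta = 20.0, 0.0           # gradient descent with momentum
u_van = 20.0                   # vanilla gradient descent (mu = 0)
trace, trace_van = [u], [u_van]
for _ in range(4):
    delta = -eps * grad_f(u) + mu * delta
    u += delta
    trace.append(u)
    u_van += -eps * grad_f(u_van)
    trace_van.append(u_van)
```

Both runs approach the local minimum near u ≈ 0.99; the momentum run overshoots slightly because the accumulated step width carries it past the minimum.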

Page 16

Exercise 2: Gradient Descent

The function f(u) = u³/3 + 50u² − 100u − 30 ...

[Plot: f(u) on u ∈ [−300, 300]; gnuplot expression (x**3)*(1.0/3.0)+50*(x**2)-100*x-30]

Page 17

Exercise 2: Gradient Descent

The function ... and on [−20, 30]

[Plot: f(u) on u ∈ [−20, 30]]

Page 18

Exercise 2: Gradient Descent

- Extrema at u1 ≈ 0.99 (local minimum) and u2 ≈ −100.99 (local maximum)
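The two extrema are the roots of f'(u) = u² + 100u − 100 = 0; a quick check via the quadratic formula:

```python
import math

# roots of u**2 + 100*u - 100 = 0
disc = math.sqrt(100**2 + 4 * 100)
u_min = (-100 + disc) / 2   # local minimum: f''(u) = 2u + 100 > 0 here
u_max = (-100 - disc) / 2   # local maximum: f''(u) = 2u + 100 < 0 here
```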

[Plot: f(u) on u ∈ [−300, 300]]

Page 19

Exercise 2: Gradient Descent

- Trace (u_van and Δ_van denote the corresponding run of vanilla gradient descent without momentum)

1. u = 20; u_van = 20
2. Δ = 0
3. ||grad f(u)|| = ||20² + 100 · 20 − 100|| = 2300
4. Δ ← −ε · grad f(u) + µ · Δ = −0.01 · 2300 + 0.1 · 0 = −23; Δ_van = −23
5. u ← u + Δ = 20 + (−23) = −3; u_van = −3
3. grad f(u) = (−3)² + 100 · (−3) − 100 = −391
4. Δ ← −ε · grad f(u) + µ · Δ = −0.01 · (−391) + 0.1 · (−23) = 1.61; Δ_van = 3.91
5. u ← u + Δ = −3 + 1.61 = −1.39; u_van = −3 + 3.91 = 0.91

Page 20

Exercise 2: Gradient Descent

- Trace continued

3. grad f(u) = (−1.39)² + 100 · (−1.39) − 100 = −237.1; grad_van f(0.91) ≈ −8.0
4. Δ ← −ε · grad f(u) + µ · Δ = −0.01 · (−237.1) + 0.1 · 1.61 = 2.53; Δ_van = 0.08
5. u ← u + Δ = −1.39 + 2.53 = 1.14; u_van = 0.91 + 0.08 = 0.99
3. grad f(u) = 1.14² + 100 · 1.14 − 100 = 15.3; grad_van f(0.99) ≈ 0 → the vanilla run leaves the while loop
4. Δ ← −ε · grad f(u) + µ · Δ = −0.01 · 15.3 + 0.1 · 2.53 = 0.1
5. u ← u + Δ = 1.14 + 0.1 = 1.24

Page 21

Exercise 3: Multi-Layer Perceptrons

Examine the multi-layer perceptron given in the figure below with the weights in the accompanying table.

(a) Both neurons use the logistic activation function (u ↦ 1/(1 + e^(−u))). The network has a single input variable x and one output variable y. Calculate the output of both neurons and the error made by the MLP when applying a pattern with x = 0 and target value 0.5.

Page 22

Exercise 3: Multi-Layer Perceptrons

- Input: x = 0, target: d = 0.5
- Net input of neuron 2: net2 = w2,1 · x + w2,0 = −1 · 0 + 1 = 1
- Activation of neuron 2: a2 = f_sig(net2) ≈ 0.73
- Net input of neuron 3: net3 = w3,0 + w3,1 · a2 + w3,2 · x ≈ −2 + 1 · 0.73 + 2 · 0 = −1.27
- Output, i.e. activation of neuron 3: y = a3 = f_sig(net3) ≈ 0.22
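The forward pass can be checked numerically. The weight values (w2,0 = 1, w2,1 = −1, w3,0 = −2, w3,1 = 1 on a2, w3,2 = 2 on x) are read off the calculation above, since the original weight table is not in this transcript; a minimal sketch:

```python
import math

def f_sig(u):
    # logistic activation function
    return 1.0 / (1.0 + math.exp(-u))

x, d = 0.0, 0.5                  # input pattern and target value
w20, w21 = 1.0, -1.0             # neuron 2 (hidden): bias and input weight
w30, w31, w32 = -2.0, 1.0, 2.0   # neuron 3 (output): bias, a2 weight, x weight

net2 = w21 * x + w20
a2 = f_sig(net2)
net3 = w30 + w31 * a2 + w32 * x
y = a3 = f_sig(net3)
e = 0.5 * (a3 - d)**2            # squared error of the pattern
```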

Page 23

Exercise 3: Multi-Layer Perceptrons

Examine the multi-layer perceptron given in the figure below with the weights in the accompanying table.

(b) Calculate the partial derivatives of the error with respect to the weights for the pattern used in task (a).

Page 24

Exercise 3: Multi-Layer Perceptrons

- ∂e/∂y = ∂e/∂a3 = ∂(½(a3 − d)²)/∂a3 = a3 − d ≈ 0.22 − 0.5 = −0.28
- ∂e/∂net3 = ∂e/∂a3 · ∂a3/∂net3 = ∂e/∂a3 · a3 · (1 − a3) ≈ −0.28 · 0.22 · (1 − 0.22) ≈ −0.048
- ∂e/∂a2 = ∂e/∂net3 · ∂net3/∂a2 = ∂e/∂net3 · ∂(a2 + 2x − 2)/∂a2 ≈ −0.048 · 1 = −0.048
- ∂e/∂net2 = ∂e/∂a2 · ∂a2/∂net2 = ∂e/∂a2 · a2 · (1 − a2) ≈ −0.048 · 0.73 · (1 − 0.73) ≈ −0.0094

Page 25

Exercise 3: Multi-Layer Perceptrons

- ∂e/∂w3,0 = ∂e/∂net3 · ∂net3/∂w3,0 = ∂e/∂net3 · ∂(w3,0 + w3,1·a2 + w3,2·x)/∂w3,0 = ∂e/∂net3 · 1 ≈ −0.048
- ∂e/∂w3,1 = ∂e/∂net3 · ∂net3/∂w3,1 = ∂e/∂net3 · ∂(w3,0 + w3,1·a2 + w3,2·x)/∂w3,1 = ∂e/∂net3 · a2 ≈ −0.048 · 0.73 ≈ −0.035
- ∂e/∂w3,2 = ∂e/∂net3 · ∂net3/∂w3,2 = ∂e/∂net3 · ∂(w3,0 + w3,1·a2 + w3,2·x)/∂w3,2 = ∂e/∂net3 · x ≈ −0.048 · 0 = 0
- ∂e/∂w2,0 = ∂e/∂net2 · ∂net2/∂w2,0 = ∂e/∂net2 · ∂(w2,0 + w2,1·x)/∂w2,0 = ∂e/∂net2 · 1 ≈ −0.0094
- ∂e/∂w2,1 = ∂e/∂net2 · ∂net2/∂w2,1 = ∂e/∂net2 · ∂(w2,0 + w2,1·x)/∂w2,1 = ∂e/∂net2 · x ≈ −0.0094 · 0 = 0

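The chain-rule steps can be verified in code, using the same weight values assumed for the forward pass (the original weight table is not in this transcript). Because the gradients are computed from exact activations rather than rounded intermediates, the last digits differ slightly from the slide values:

```python
import math

def f_sig(u):
    return 1.0 / (1.0 + math.exp(-u))

x, d = 0.0, 0.5
w20, w21 = 1.0, -1.0             # neuron 2: bias, input weight
w30, w31, w32 = -2.0, 1.0, 2.0   # neuron 3: bias, a2 weight, x weight

# forward pass
a2 = f_sig(w21 * x + w20)
a3 = f_sig(w30 + w31 * a2 + w32 * x)

# backward pass (chain rule, logistic derivative a*(1-a))
de_da3 = a3 - d                  # d(0.5*(a3-d)**2)/da3
de_dnet3 = de_da3 * a3 * (1 - a3)
de_da2 = de_dnet3 * w31
de_dnet2 = de_da2 * a2 * (1 - a2)

grads = {
    "w3,0": de_dnet3 * 1.0,
    "w3,1": de_dnet3 * a2,
    "w3,2": de_dnet3 * x,
    "w2,0": de_dnet2 * 1.0,
    "w2,1": de_dnet2 * x,
}
```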

Page 26

Exercise 3: Multi-Layer Perceptrons

Now, consider the network structure of a multi-layer perceptron with 10 neurons given in the figure below. Each circle denotes a neuron, the arrows denote connections between neurons.

(c) Which of the neurons are input neurons, which ones are output neurons?

(d) How many layers does this MLP have? Which neurons belong to which layer?

[Figure: MLP with 10 neurons, labeled A through J, connected by arrows]

Page 27

Exercise 3: Multi-Layer Perceptrons

Now, consider the network structure of a multi-layer perceptron with 10 neurons given in the figure below. Each circle denotes a neuron, the arrows denote connections between neurons.

(c) Which of the neurons are input neurons, which ones are output neurons?

(d) How many layers does this MLP have? Which neurons belong to which layer?

[Figure: the same MLP annotated by layers — input: I, F; first hidden: H, B, A; second hidden: D, C; third hidden: J, G; output: E]

Page 28

Exercise 3: Multi-Layer Perceptrons

(e) Assume we are applying a pattern to the MLP. Give an order in which the neuron activations ai can be calculated.

[Figure: the MLP annotated by layers — input: I, F; hidden: H, B, A; hidden: D, C; hidden: J, G; output: E]

Possible order: I, F, H, B, A, D, C, J, G, E

- Determine the input values (I, F)
- Calculate the activations in the first hidden layer (H, B, A)
- Calculate the activations in the second hidden layer (D, C)
- Calculate the activations in the third hidden layer (J, G)
- Calculate the activations in the output layer (E)
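This layer-by-layer order generalizes to any feed-forward graph: any topological order of the connection graph works. A sketch using Kahn's algorithm; the edge set below is an assumption (full connectivity between consecutive layers), since the exact arrows exist only in the figure:

```python
from collections import deque

# assumed edges: every neuron feeds all neurons of the next layer
layers = [["I", "F"], ["H", "B", "A"], ["D", "C"], ["J", "G"], ["E"]]
edges = [(u, v) for a, b in zip(layers, layers[1:]) for u in a for v in b]

# Kahn's algorithm: repeatedly emit a node with no unprocessed predecessors
indeg = {n: 0 for layer in layers for n in layer}
succ = {n: [] for n in indeg}
for u, v in edges:
    indeg[v] += 1
    succ[u].append(v)
ready = deque(n for n, d in indeg.items() if d == 0)
order = []
while ready:
    n = ready.popleft()
    order.append(n)
    for v in succ[n]:
        indeg[v] -= 1
        if indeg[v] == 0:
            ready.append(v)
```

Any order produced this way (e.g. I, F, H, B, A, D, C, J, G, E) evaluates each neuron only after all of its inputs are available.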

Page 29

Exercise 3: Multi-Layer Perceptrons

(f) Which of the functions given by the plots below can be implemented by multi-layer perceptrons? The MLP should only contain neurons with logistic activation functions. (Note: the weights of the networks must be finite numbers.)

Page 30

Exercise 3: Multi-Layer Perceptrons

- TL: yes (see next slide)
- TR: no (the function computed by such an MLP must be differentiable)
- BL: no (since the logistic function only yields values between 0 and 1)
- BR: yes (a net with three hidden neurons and one output neuron can implement it)

Page 31

Exercise 3: Multi-Layer Perceptrons

- TL: yes, the network below can

[Figure: network with input x, two hidden neurons, and output y; extracted edge weights: 2, 1, 1, 1, −1, 5, 2, 2, 0.5, −0.5]

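The reason two hidden logistic neurons suffice for a bump-shaped function like TL: the difference of two shifted sigmoids is a bump. The shifts below are illustrative only, not the weights from the slide's network (which did not survive extraction cleanly):

```python
import math

def f_sig(u):
    # logistic activation function
    return 1.0 / (1.0 + math.exp(-u))

def bump(x):
    # rising edge at x = -2 minus rising edge at x = +2 -> bump around 0
    return f_sig(x + 2) - f_sig(x - 2)
```

An output neuron applied to a scaled and shifted version of this difference reproduces a TL-style curve.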

Page 32

Exercise 3: Multi-Layer Perceptrons

[Figure: the network again, with plots of the activations of hidden neuron 1 and hidden neuron 2 over x ∈ [−10, 10]]

Page 33

Exercise 3: Multi-Layer Perceptrons

[Figure: plots of the activations of hidden neuron 1 and hidden neuron 2, and of the entire net's output, over x ∈ [−10, 10]]
