Computational Learning Theory: PAC, IID, VC Dimension, SVM
Computational Learning Theory
• PAC
• IID
• VC Dimension
• SVM
Kunstmatige Intelligentie (Artificial Intelligence) / RuG
Marius Bulacu
The Problem
• Why does learning work?
• How do we know that the learned hypothesis h is close to the target function f if we do not know what f is?
The answer is provided by computational learning theory.
The Answer
• Any hypothesis h that is consistent with a sufficiently large number of training examples is unlikely to be seriously wrong.
Therefore it must be: Probably Approximately Correct (PAC).
The Stationarity Assumption
• The training and test sets are drawn randomly from the same population of examples using the same probability distribution.
Therefore training and test data are Independently and Identically Distributed (IID).
“the future is like the past”
How many examples are needed?

$$m \ge \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)$$

• $m$: number of examples (the sample complexity)
• $\epsilon$: probability that $h$ and $f$ disagree on an example
• $\delta$: probability that a wrong hypothesis consistent with all examples exists
• $|H|$: size of the hypothesis space
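As a quick check of the bound, here is a minimal Python sketch (standard library only; the example numbers are hypothetical):

```python
import math

def pac_sample_complexity(epsilon, delta, ln_h):
    """Sample complexity m >= (1/epsilon) * (ln(1/delta) + ln|H|) for a finite H."""
    return math.ceil((math.log(1.0 / delta) + ln_h) / epsilon)

# Hypothetical example: H = all boolean functions of 10 attributes,
# so |H| = 2**(2**10) and ln|H| = 2**10 * ln(2).
m = pac_sample_complexity(epsilon=0.1, delta=0.05, ln_h=2**10 * math.log(2))
print(m)  # about 7128 examples
```

Note that the bound grows only logarithmically in $|H|$ and $1/\delta$, but linearly in $1/\epsilon$.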
Formal Derivation

Let $H$ be the set of all possible hypotheses, $f$ the target function, and $H_{bad} \subseteq H$ the set of “wrong” hypotheses whose error exceeds $\epsilon$.

For any $h_b \in H_{bad}$:

$$P(x : h_b(x) = f(x)) \le 1 - \epsilon$$

The probability that $h_b$ is consistent with $m$ independent examples:

$$P(h_b \text{ consistent with } m \text{ examples}) \le (1 - \epsilon)^m$$

The probability that $H_{bad}$ contains a hypothesis consistent with all examples:

$$P(H_{bad} \text{ contains a consistent } h) \le |H_{bad}|(1 - \epsilon)^m \le |H|(1 - \epsilon)^m$$

Bounding this by $\delta$ and using $1 - \epsilon \le e^{-\epsilon}$ gives

$$m \ge \frac{1}{\epsilon}\left(\ln\frac{1}{\delta} + \ln|H|\right)$$
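The middle step can be checked empirically. A small Monte Carlo sketch (hypothetical parameters), simulating a bad hypothesis whose per-example error is exactly $\epsilon$:

```python
import random

def survival_rate(eps, m, trials=100_000):
    """Fraction of trials in which a hypothesis with error eps
    stays consistent with m independent examples."""
    survived = 0
    for _ in range(trials):
        # The bad hypothesis agrees with f on each example with prob. 1 - eps.
        if all(random.random() > eps for _ in range(m)):
            survived += 1
    return survived / trials

eps, m = 0.1, 30
print(survival_rate(eps, m))   # empirical, about 0.042
print((1 - eps) ** m)          # analytic (1 - eps)^m = 0.0424...
```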
What if the hypothesis space is infinite?
• We can't use our result for finite $H$.
• We need some other measure of complexity for $H$:
– the Vapnik-Chervonenkis (VC) dimension
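The VC dimension measures the complexity of $H$ by the size of the largest point set that $H$ can “shatter”, i.e. realize every possible labeling of. A short sketch for 1-D threshold classifiers $h_t(x) = 1[x \ge t]$ (a toy class chosen for illustration; the function name is hypothetical):

```python
def shatters(points):
    """Check whether threshold classifiers h_t(x) = (x >= t) can
    realize all 2^n labelings of the given distinct points."""
    # Every distinct behavior of h_t occurs at t = some point, or above the max.
    candidates = sorted(points) + [max(points) + 1.0]
    realizable = {tuple(x >= t for x in points) for t in candidates}
    return len(realizable) == 2 ** len(points)

print(shatters([0.0]))       # True:  thresholds shatter 1 point
print(shatters([0.0, 1.0]))  # False: the labeling (True, False) is never realizable
```

Since one point can be shattered but no two points can, this class has VC dimension 1.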
(Slides 8-11: figures only, presumably illustrating the VC dimension; no transcribable text.)
SVM (1): Kernels
A complicated separation boundary in the original feature space ($f_1$, $f_2$) becomes a simple separation boundary, a hyperplane, in a higher-dimensional space ($f_1$, $f_2$, $f_3$).

• Kernels: polynomial, radial basis, sigmoid

Kernels perform an implicit mapping to a higher-dimensional space where linear separation is possible.
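The “implicit mapping” can be made concrete for a degree-2 polynomial kernel in 2-D. A minimal numpy sketch (toy vectors; $\phi$ is the standard explicit map for this kernel): the kernel evaluates the inner product in the higher-dimensional space without ever constructing it.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map: (x1^2, x2^2, sqrt(2)*x1*x2)."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x, y = np.array([1.0, 2.0]), np.array([3.0, 0.5])

print((x @ y) ** 2)     # polynomial kernel K(x, y) = (x . y)^2 -> 16.0
print(phi(x) @ phi(y))  # same value via the explicit 3-D map  -> 16.0
```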
SVM (2): Max Margin
From all the possible separating hyperplanes, select the one that gives the maximum margin. The support vectors are the examples that lie on the margin; they determine the “best” separating hyperplane in the ($f_1$, $f_2$) plane, and the maximum margin gives good generalization.

The solution is found by quadratic optimization: this is the “learning” step.
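A minimal sketch of max-margin learning, assuming scikit-learn is available (toy 2-D data; the large C value approximates a hard margin): fitting a linear SVM solves the quadratic optimization problem and exposes the support vectors.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable point clouds in the (f1, f2) plane.
X = np.array([[0, 0], [1, 1], [1, 0], [3, 3], [4, 3], [3, 4]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # large C: (almost) hard margin
clf.fit(X, y)

print(clf.support_vectors_)        # the points that pin down the hyperplane
w, b = clf.coef_[0], clf.intercept_[0]
print("margin width:", 2.0 / np.linalg.norm(w))
```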