PAC Learning, adapted from Tom M. Mitchell, Carnegie Mellon University.


PAC Learning

adapted from

Tom M. Mitchell

Carnegie Mellon University

Learning Issues

Under what conditions is successful learning

… possible?

… assured for a particular learning algorithm?

Sample Complexity

How many training examples are needed

… for a learner to converge (with high probability) to a successful hypothesis?

Computational Complexity

How much computational effort is needed

… for a learner to converge (with high probability) to a successful hypothesis?

The world

X is the sample space

Example: Two dice: {(1,1), (1,2), …, (6,5), (6,6)}

[figure: scatter of the 36 outcomes in the sample space X]
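The sample space for the two-dice example can be enumerated directly; a minimal sketch in Python (the variable names are my own):

```python
from itertools import product

# Sample space X for two six-sided dice: all ordered pairs (i, j).
X = list(product(range(1, 7), repeat=2))

print(len(X))        # 36 outcomes
print(X[0], X[-1])   # (1, 1) (6, 6)
```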

Weighted world

𝒟 is a probability distribution over X

Example: Biased dice: {((1,1); p11), ((1,2); p12), …, ((6,5); p65), ((6,6); p66)}

[figure: scatter of outcomes, weighted by the distribution]
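A weighted world can be sampled with Python's standard library; the face weights below are an illustrative assumption (a die biased toward 6), not values from the slides:

```python
import random

faces = [1, 2, 3, 4, 5, 6]
weights = [1, 1, 1, 1, 1, 3]  # hypothetical bias: face 6 is 3x as likely

def draw_pair(rng):
    """Draw one outcome (x, y) from the weighted world."""
    x = rng.choices(faces, weights=weights, k=1)[0]
    y = rng.choices(faces, weights=weights, k=1)[0]
    return (x, y)

rng = random.Random(0)
sample = [draw_pair(rng) for _ in range(1000)]
```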

An event

E is a subset of X

Example: Two dice: {(1,1), (1,2), …, (6,5), (6,6)}

[figure: scatter of sample points with an event E highlighted]

An event

E is a subset of X

Example: A pair in two dice: {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}

[figure: scatter of sample points with the "pair" outcomes highlighted]

A Concept

C is an indicator function of an event E

Example: A pair in two dice

c(x,y) := (x == y)

[figure: scatter with the concept's positive outcomes highlighted]
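The "pair" concept is an indicator function on the sample space; a minimal sketch showing how the concept picks out its event:

```python
from itertools import product

# The concept "a pair" as an indicator function c : X -> {0, 1}.
def c(x, y):
    return int(x == y)

# The event E picked out by the concept: all outcomes where c is 1.
X = list(product(range(1, 7), repeat=2))
event = [(x, y) for (x, y) in X if c(x, y)]
# event == [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
```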

A hypothesis

h is an approximation to a concept c

Example: A separating hyperplane

h(x,y) := (1/2) · [1 + sign(a·x + b·y + c)]

[figure: scatter of sample points with a separating hyperplane]
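The hyperplane hypothesis can be written down directly. The parameter values a, b, c below are illustrative assumptions (they label outcomes with x + y ≥ 7 as positive), not values from the slides:

```python
# A separating-hyperplane hypothesis: h(x, y) = (1/2) * [1 + sign(a*x + b*y + c)].
def sign(t):
    return 1 if t >= 0 else -1

def h(x, y, a=1.0, b=1.0, c=-7.0):
    # Returns 1.0 on one side of the hyperplane a*x + b*y + c = 0, else 0.0.
    return 0.5 * (1 + sign(a * x + b * y + c))

print(h(6, 6))  # 1.0: (6, 6) lies on the positive side
print(h(1, 1))  # 0.0: (1, 1) lies on the negative side
```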

The dataset

D is an i.i.d. sample from (X, 𝒟)

{⟨xᵢ, c(xᵢ)⟩}, i = 1, …, m

m examples

An Inductive learner

L is an algorithm that uses data D to produce a hypothesis h ∈ H

Example: The Perceptron Algorithm

h(x,y) := (1/2) · [1 + sign(a(D)·x + b(D)·y + c(D))]

[figure: scatter of sample points with the learned separating hyperplane]
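A minimal perceptron sketch, under stated assumptions: since the "pair" concept is not linearly separable, the target here is a hypothetical separable concept c(x, y) = (x + y ≥ 7); this is an illustrative sketch, not Mitchell's exact presentation:

```python
from itertools import product

def perceptron(data, max_epochs=20000):
    """Classic perceptron updates; returns hyperplane parameters (a, b, c).
    The cap is generous: on linearly separable data the perceptron
    convergence theorem guarantees it stops making mistakes."""
    a = b = c = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in data:
            pred = 1 if a * x + b * y + c >= 0 else 0
            if pred != label:
                delta = label - pred       # +1 or -1: move toward the point
                a += delta * x
                b += delta * y
                c += delta
                mistakes += 1
        if mistakes == 0:                  # a clean pass: consistent with D
            break
    return a, b, c

# Hypothetical linearly separable target concept: c(x, y) = (x + y >= 7).
X = list(product(range(1, 7), repeat=2))
data = [((x, y), int(x + y >= 7)) for x, y in X]
a, b, c = perceptron(data)
train_errors = sum(1 for (x, y), lbl in data
                   if (1 if a * x + b * y + c >= 0 else 0) != lbl)
```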

Error Measures

Training error of hypothesis h

The fraction of training instances x on which h(x) ≠ c(x)

True error of hypothesis h

The probability that h(x) ≠ c(x) on a future instance x drawn at random from 𝒟

True error: error𝒟(h) ≡ Pr x∼𝒟 [ c(x) ≠ h(x) ]
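The two error measures can be contrasted empirically. The hypothesis below is a hypothetical, deliberately crude one (it always predicts "not a pair"), so its true error against the "pair" concept on fair dice is exactly 6/36 = 1/6:

```python
import random

def c(x, y):
    # Target concept: "a pair".
    return int(x == y)

def h(x, y):
    # Hypothetical crude hypothesis: always predict "not a pair".
    return 0

def error_rate(points):
    # Fraction of points on which h disagrees with c.
    return sum(1 for (x, y) in points if h(x, y) != c(x, y)) / len(points)

rng = random.Random(0)
roll = lambda: (rng.randint(1, 6), rng.randint(1, 6))

train = [roll() for _ in range(20)]        # training error: measured on D
test = [roll() for _ in range(10000)]      # Monte Carlo estimate of true error

training_error = error_rate(train)
estimated_true_error = error_rate(test)    # should be close to 1/6
```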

Learnability

How to describe learnability?

A natural demand: the number of training examples needed to learn a hypothesis for which error𝒟(h) = 0.

Infeasible.

PAC Learnability

Weaken the demands on the learner:

true error ≤ ε (the accuracy parameter), with failure probability at most δ

ε and δ can be made arbitrarily small

Probably Approximately Correct (PAC) Learning

PAC Learnability

C is PAC-learnable by L

true error < ε, with probability ≥ (1 − δ), after a reasonable number of examples and reasonable time per example

"Reasonable" means polynomial in 1/ε, 1/δ, n (the size of examples), and the encoding length of the target concept

PAC Learnability

Pr[ error𝒟(h) ≤ ε ] ≥ 1 − δ

C is PAC-Learnable

each target concept in C can be learned from a polynomial number of training examples

the processing time per example is also polynomially bounded

"polynomial" meaning polynomial in 1/ε, 1/δ, n (the size of examples), and the encoding length of the target concept c
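The polynomial dependence on 1/ε and 1/δ can be made concrete with the standard PAC sample-complexity bound for a consistent learner over a finite hypothesis space H; this specific bound, m ≥ (1/ε)(ln|H| + ln(1/δ)), is drawn from standard PAC theory rather than stated on the slides:

```python
import math

def sample_complexity(H_size, epsilon, delta):
    """Number of examples sufficient (standard PAC bound, finite H) so that,
    with probability >= 1 - delta, every hypothesis consistent with the
    data has true error < epsilon."""
    return math.ceil((math.log(H_size) + math.log(1 / delta)) / epsilon)

m = sample_complexity(H_size=2**10, epsilon=0.1, delta=0.05)  # -> 100 examples
```

Note the bound is logarithmic in |H| and 1/δ but linear in 1/ε: halving ε roughly doubles the number of examples required.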