Machine Learning with Discriminative Methods
Lecture 02: PAC Learning and tail bounds intro
CS 790-134, Spring 2015
Alex Berg
Slide 2
Today's lecture: PAC Learning, tail bounds
Slide 3
Rectangle learning
[Figure: positive (+) and negative (-) points in the plane, with hypothesis H drawn as a rectangle]
A hypothesis is any axis-aligned rectangle. Inside the rectangle is positive.
Slide 4
Rectangle learning: the realizable case
[Figure: labeled points with hypothesis H and the actual decision boundary, both axis-aligned rectangles]
The actual boundary is also an axis-aligned rectangle: the realizable case (no approximation error).
Slide 5
Rectangle learning: the realizable case
[Figure: same setup, now with a point where hypothesis H disagrees with the actual boundary, a mistake for the hypothesis H!]
Measure ERROR by the probability of making a mistake.
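In symbols (a standard formalization; the notation, with data distribution D, target c, and hypothesis h, is my addition, not the slide's):
\[
\mathrm{err}(h) \;=\; \Pr_{x \sim D}\bigl[\, h(x) \ne c(x) \,\bigr].
\]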
Slide 6
Rectangle learning: a strategy for a learning algorithm
[Figure: hypothesis H, the output of the learning algorithm so far]
Make the smallest rectangle consistent with all the data so far (a Python sketch of this tightest-fit strategy follows below).
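A minimal sketch of the tightest-fit learner in Python (the function and variable names are illustrative assumptions, not from the lecture):

import numpy as np

def tightest_fit_rectangle(X, y):
    # Return the smallest axis-aligned rectangle containing all
    # positive examples, as (x_min, x_max, y_min, y_max).
    # X: (n, 2) array of 2-D points; y: (n,) array of +1/-1 labels.
    pos = X[y == 1]
    if len(pos) == 0:
        return None  # no positives seen yet: predict negative everywhere
    return (pos[:, 0].min(), pos[:, 0].max(),
            pos[:, 1].min(), pos[:, 1].max())

def predict(rect, x):
    # Label a point +1 iff it lies inside the current rectangle.
    if rect is None:
        return -1
    x_min, x_max, y_min, y_max = rect
    return 1 if (x_min <= x[0] <= x_max and y_min <= x[1] <= y_max) else -1

# Example usage on a few labeled points:
X = np.array([[1, 1], [2, 3], [5, 5], [0, 4]])
y = np.array([1, 1, -1, -1])
rect = tightest_fit_rectangle(X, y)        # (1, 2, 1, 3)
print(predict(rect, [1.5, 2]))             # 1 (inside the rectangle)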
Slide 7
Rectangle learning: making a mistake
[Figure: hypothesis H, the output of the learning algorithm so far; a new positive (+) data item falls outside the rectangle]
Make the smallest rectangle consistent with all the data so far. The current hypothesis makes a mistake on a new data item.
Slide 8
Rectangle learning: making a mistake
[Figure: same setup, the current hypothesis makes a mistake on a new data item]
Make the smallest rectangle consistent with all the data so far. Use the probability of such a mistake (this is our error measure) to bound how likely it was that we had not yet seen a training example in this region.
Slide 9
A very subtle formulation, from the Kearns and Vazirani reading:
R = actual decision boundary (the target rectangle)
R' = result of the algorithm so far (the tightest fit after m samples)
Slide 10
From the Kearns and Vazirani Reading
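Presumably this slide showed the sample-complexity analysis; the standard argument from the Kearns and Vazirani reading is, in sketch: split the error region R \ R' into four strips along the sides of R, each chosen to have probability mass \epsilon/4 under the data distribution. The tightest fit R' fails to reach a strip only if none of the m samples landed in it, which has probability at most (1 - \epsilon/4)^m. A union bound over the four strips gives
\[
\Pr\bigl[\mathrm{err}(R') > \epsilon\bigr] \;\le\; 4\,(1 - \epsilon/4)^m \;\le\; 4\,e^{-\epsilon m / 4},
\]
which is at most \delta once m \ge (4/\epsilon)\ln(4/\delta).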
Slide 11
PAC Learning
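The standard definition, stated here for reference (following the Kearns and Vazirani reading): a concept class C is PAC learnable if there is an algorithm A such that for every target c in C, every distribution D, and every \epsilon, \delta in (0, 1), given m(\epsilon, \delta) i.i.d. labeled samples, A outputs a hypothesis h with
\[
\Pr\bigl[\, \mathrm{err}_D(h) \le \epsilon \,\bigr] \;\ge\; 1 - \delta,
\]
where m(\epsilon, \delta) is polynomial in 1/\epsilon and 1/\delta. Hence the name: Probably (1 - \delta) Approximately Correct (error at most \epsilon).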
Slide 12
Flashback: learning/fitting is a process. From Raginsky's notes: estimating the probability p that a tossed coin comes up heads.
X_i = the i-th coin toss (1 if heads, 0 if tails); the estimator based on n tosses is
\[
\hat{p}_n = \frac{1}{n} \sum_{i=1}^{n} X_i .
\]
The estimate is within epsilon when |\hat{p}_n - p| < \epsilon, and not within epsilon when |\hat{p}_n - p| \ge \epsilon. The probability of being bad is inversely proportional to the number of samples:
\[
\Pr\bigl[\, |\hat{p}_n - p| \ge \epsilon \,\bigr] \;\le\; \frac{p(1-p)}{n\epsilon^2} \;\le\; \frac{1}{4 n \epsilon^2}
\]
(the underlying computation is an example of a tail bound).
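A small Python simulation of this setup (a sketch; the values of p, n, and epsilon are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
p, n, eps, trials = 0.3, 1000, 0.05, 10_000

# Each row is one experiment: n coin tosses, then the empirical mean.
tosses = rng.random((trials, n)) < p
p_hat = tosses.mean(axis=1)

# Empirical frequency of a "bad" estimate vs. the Chebyshev bound.
bad = np.mean(np.abs(p_hat - p) >= eps)
chebyshev = 1.0 / (4 * n * eps**2)
print(f"empirical P(bad) = {bad:.4f}, Chebyshev bound = {chebyshev:.4f}")

For these values the empirical frequency comes out far below the bound: the bound holds but is loose, which motivates the sharper tail bounds on the next slides.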
Slide 13
Markov's Inequality. From Raginsky's notes.
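The standard statement (presumably what the slide displayed from the notes): for a nonnegative random variable X and any t > 0,
\[
\Pr[X \ge t] \;\le\; \frac{\mathbb{E}[X]}{t}.
\]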
Slide 14
Chebyshev's Inequality. From Raginsky's notes.
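The standard statement: for a random variable X with finite variance and any t > 0,
\[
\Pr\bigl[\,|X - \mathbb{E}[X]| \ge t\,\bigr] \;\le\; \frac{\mathrm{Var}(X)}{t^2},
\]
which follows by applying Markov's inequality to the nonnegative variable (X - \mathbb{E}[X])^2.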
Slide 15
Not quite good enough. From Raginsky's notes.
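Presumably the point of this slide, given the reading assignment below: applied to the coin-toss estimator, Chebyshev's inequality gives a bound decaying only like 1/n, whereas Chernoff-type bounds decay exponentially in n; for example, Hoeffding's inequality gives
\[
\Pr\bigl[\,|\hat{p}_n - p| \ge \epsilon\,\bigr] \;\le\; 2\,e^{-2 n \epsilon^2}.
\]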
Slide 16
For next class:
Read the Wikipedia page on the Chernoff bound: http://en.wikipedia.org/wiki/Chernoff_bound
Read at least the first pages (1-5) of Raginsky's introductory notes on tail bounds: http://maxim.ece.illinois.edu/teaching/fall14/notes/concentration.pdf
Come to class with questions! It is fine to have questions, but first spend some time trying to work through the reading/problems. Feel free to post questions to the Sakai discussion board!