Machine Learning with Discriminative Methods
Lecture 02 – PAC Learning and tail bounds intro
CS 790-134 Spring 2015, Alex Berg

Transcript

  • Slide 1
  • Machine Learning with Discriminative Methods Lecture 02 PAC Learning and tail bounds intro CS 790-134 Spring 2015 Alex Berg
  • Slide 2
  • Today's lecture: PAC learning and tail bounds
  • Slide 3
  • Rectangle learning. [Figure: positively and negatively labeled points in the plane, with an axis-aligned rectangle drawn as hypothesis H.] A hypothesis is any axis-aligned rectangle; points inside the rectangle are classified positive.
  • Slide 4
  • Rectangle learning, the realizable case. [Figure: the actual decision boundary, also an axis-aligned rectangle, drawn alongside hypothesis H.] The actual boundary is also an axis-aligned rectangle, so there is no approximation error: the realizable case.
  • Slide 5
  • Rectangle learning, the realizable case. [Figure: a point on which hypothesis H disagrees with the actual rectangle, a mistake for H.] Measure ERROR by the probability of making a mistake.
  • Slide 6
  • Rectangle learning, a strategy for a learning algorithm. [Figure: hypothesis H, the output of the learning algorithm so far.] Make the smallest rectangle consistent with all the data so far.
  • Slide 7
  • Rectangle learning, making a mistake. [Figure: a new positive data item falling outside hypothesis H.] The current hypothesis, the smallest rectangle consistent with all the data so far, makes a mistake on a new data item.
  • Slide 8
  • Rectangle learning, making a mistake. [Figure: the mistake region between hypothesis H and the actual rectangle.] Use the probability of such a mistake (this is our error measure) to bound how likely it is that we had not yet seen a training example in this region. A sketch of the tightest-fit strategy in code follows below.
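A minimal sketch (mine, not from the slides) of the tightest-fit strategy in Python; the function names and the NumPy point representation are assumptions:

```python
import numpy as np

def tightest_fit(X, y):
    """Smallest axis-aligned rectangle containing every positive example.

    X: (n, 2) array of points; y: (n,) array of +1 / -1 labels.
    Returns (xmin, xmax, ymin, ymax), or None when no positives have been
    seen yet (the empty rectangle, which predicts -1 everywhere).
    """
    pos = X[y == 1]
    if pos.shape[0] == 0:
        return None
    return (pos[:, 0].min(), pos[:, 0].max(),
            pos[:, 1].min(), pos[:, 1].max())

def predict(rect, point):
    """Classify +1 iff the point lies inside the current rectangle hypothesis."""
    if rect is None:
        return -1
    xmin, xmax, ymin, ymax = rect
    x, y = point
    return 1 if (xmin <= x <= xmax and ymin <= y <= ymax) else -1
```

In the realizable case the tightest fit is always contained in the true rectangle, so the only mistakes are false negatives in the region between the two rectangles, which is exactly the region the next slides bound.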
  • Slide 9
  • Very subtle formulation from the Kearns and Vazirani reading: R = the actual decision boundary (the target rectangle); R' = the result of the algorithm so far (the tightest fit after m samples).
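For orientation, the bound the reading derives, paraphrased here from the standard strip argument rather than copied from the slide: split the region R \ R' into four strips, each with probability mass $\epsilon/4$ under the sampling distribution. The error of R' can exceed $\epsilon$ only if some strip contains none of the $m$ training samples, so by a union bound

$$\Pr[\mathrm{error}(R') > \epsilon] \;\le\; 4\,(1 - \epsilon/4)^m \;\le\; 4\,e^{-\epsilon m/4},$$

which is at most $\delta$ once $m \ge \frac{4}{\epsilon}\ln\frac{4}{\delta}$.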
  • Slide 10
  • From the Kearns and Vazirani Reading
  • Slide 11
  • PAC Learning
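The body of this slide did not survive the transcript; for reference, the standard definition from the Kearns and Vazirani reading is roughly: a concept class C is PAC-learnable if there is an algorithm that, for every target $c \in C$, every distribution $D$, and every $\epsilon, \delta \in (0,1)$, uses $m = \mathrm{poly}(1/\epsilon, 1/\delta)$ samples $S \sim D^m$ and outputs a hypothesis $h_S$ with

$$\Pr_{S \sim D^m}\big[\mathrm{error}_D(h_S) \le \epsilon\big] \;\ge\; 1 - \delta,$$

i.e. Probably (with probability at least $1-\delta$) Approximately (to within error $\epsilon$) Correct.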
  • Slide 12
  • Flashback: learning/fitting is a process (from Raginsky's notes). Estimating the probability $\theta$ that a tossed coin comes up heads: $X_i$ is the $i$th coin toss, $\hat{\theta}_n = \frac{1}{n}\sum_{i=1}^n X_i$ is the estimator based on $n$ tosses, and the estimate is good when $|\hat{\theta}_n - \theta| \le \epsilon$ and bad when $|\hat{\theta}_n - \theta| > \epsilon$. The probability of being bad is inversely proportional to the number of samples (the underlying computation is an example of a tail bound); a small simulation follows below.
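A quick simulation (mine, not from the notes) of the coin-toss estimator, comparing the empirical probability of a bad estimate with the Chebyshev bound $1/(4n\epsilon^2)$ derived on the next slides; the values of theta, eps, and the trial counts are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, eps, trials = 0.6, 0.05, 10_000

for n in [100, 400, 1600]:
    tosses = rng.random((trials, n)) < theta   # trials x n Bernoulli(theta) tosses
    theta_hat = tosses.mean(axis=1)            # estimator from n tosses, per trial
    bad = np.mean(np.abs(theta_hat - theta) > eps)
    bound = 1 / (4 * n * eps**2)               # Chebyshev with Var(X_i) <= 1/4
    print(f"n={n:5d}  empirical P(bad)={bad:.4f}  Chebyshev bound={bound:.4f}")
```

The empirical column shrinks much faster than the bound, a first hint that Chebyshev is loose here (see Slide 15).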
  • Slide 13
  • Markov's inequality (from Raginsky's notes); the statement is reproduced below.
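The formula itself was lost in the transcript; the standard statement is: for any nonnegative random variable $X$ and any $t > 0$,

$$\Pr[X \ge t] \;\le\; \frac{\mathbb{E}[X]}{t}.$$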
  • Slide 14
  • Chebyshev's inequality (from Raginsky's notes); the statement is reproduced below.
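Again reproducing the lost formula: applying Markov's inequality to the nonnegative variable $(X - \mathbb{E}[X])^2$ gives, for any $t > 0$,

$$\Pr\big[|X - \mathbb{E}[X]| \ge t\big] \;\le\; \frac{\mathrm{Var}(X)}{t^2}.$$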
  • Slide 15
  • Not quite good enough (from Raginsky's notes); see the comparison below.
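The slide's point, as the next-class reading suggests: applied to the coin-toss estimator, Chebyshev's inequality only gives a bound decaying polynomially in $n$, while Chernoff-style bounds decay exponentially:

$$\Pr\big[|\hat{\theta}_n - \theta| \ge \epsilon\big] \;\le\; \frac{1}{4n\epsilon^2} \;\;\text{(Chebyshev)} \qquad \text{vs.} \qquad 2\,e^{-2n\epsilon^2} \;\;\text{(Hoeffding/Chernoff)}.$$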
  • Slide 16
  • For next class: Read the Wikipedia page on the Chernoff bound: http://en.wikipedia.org/wiki/Chernoff_bound. Read at least the first part of Raginsky's introductory notes on tail bounds (pages 1-5): http://maxim.ece.illinois.edu/teaching/fall14/notes/concentration.pdf. Come to class with questions! It is fine to have questions, but first spend some time trying to work through the reading/problems. Feel free to post questions to the Sakai discussion board!