
Boosting and Additive Models

Chapter 10, Elements of Statistical Learning

Outline

• Model Averaging
  – Bagging
  – Boosting

• Boosting: AdaBoost
• Forward Stagewise Modeling
• Interpretation of Boosting
• Summary

Classification Problem

Classification Tree (CART)

Decision Boundary: CART

Comparison of Learning Methods

Is there a method that combines the advantages of SVM and CART?

Or one that keeps the advantages of CART while increasing its predictive power?

Model Averaging

Bagging (Bootstrap Aggregation)

• Bagging averages a given procedure over many bootstrap samples to reduce its variance.

Decision Boundary: Bagging

Bagging can dramatically reduce the variance of unstable procedures (like trees), leading to improved prediction.

Any simple structure in CART is lost: the bagged average of trees is no longer a tree.

Decision Boundary: Bagging
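
To make the procedure concrete, here is a minimal sketch of bagged classification trees. It is not from the slides; it assumes scikit-learn's DecisionTreeClassifier as the base procedure and labels coded as -1/+1.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees(X, y, n_trees=50, seed=0):
    # Fit one fully grown (high-variance) tree per bootstrap sample.
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)  # draw n points with replacement
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def bagged_predict(trees, X):
    # Majority vote over the ensemble; the averaging is what reduces variance.
    votes = np.stack([t.predict(X) for t in trees])
    return np.sign(votes.mean(axis=0))  # labels assumed in {-1, +1}

Each individual tree is unstable; averaging many of them smooths the decision boundary, which is the effect the slide's figure illustrates.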

Boosting

History of Boosting

Procedure of Boosting

Boosting vs. Bagging

AdaBoost (Freund & Schapire 1996)

AdaBoost

Boosting Stumps
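
A minimal sketch of AdaBoost.M1 with stumps (depth-1 trees) as the weak learner, following the weight-update recipe of Freund & Schapire (1996). Scikit-learn and labels in {-1, +1} are assumed; the weighted error is clipped to keep the logarithm finite.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_stumps(X, y, M=100):
    n = len(y)
    w = np.full(n, 1.0 / n)  # start with uniform observation weights
    stumps, alphas = [], []
    for _ in range(M):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        # weighted training error of this weak learner
        err = np.clip(w[pred != y].sum() / w.sum(), 1e-10, 1 - 1e-10)
        alpha = np.log((1.0 - err) / err)  # classifier weight (log-odds of accuracy)
        w *= np.exp(alpha * (pred != y))   # up-weight the misclassified points
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Weighted majority vote: G(x) = sign( sum_m alpha_m * G_m(x) )
    F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(F)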

Overfitting!

Forward Stagewise Modeling

Stagewise Least Squares

Stagewise Least Squares
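
With squared-error loss, the stagewise step reduces to fitting the weak learner to the current residuals. A minimal sketch, assuming scikit-learn regression stumps and an optional shrinkage factor nu (nu = 1 gives plain stagewise fitting):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def stagewise_least_squares(X, y, M=200, nu=0.1):
    r = np.asarray(y, dtype=float)  # residuals; the model starts at f_0(x) = 0
    model = []
    for _ in range(M):
        h = DecisionTreeRegressor(max_depth=1).fit(X, r)  # fit the residuals
        r = r - nu * h.predict(X)                         # subtract the new piece
        model.append(h)
    return model

def stagewise_predict(model, X, nu=0.1):
    return nu * sum(h.predict(X) for h in model)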

AdaBoost: Stagewise Modeling

Section 10.5 in the 2nd edition

Why Exponential Loss
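
The key fact behind this slide (ESL, Section 10.5): the population minimizer of the exponential loss is one-half the log-odds, so the additive expansion built by AdaBoost is estimating the log-odds of class membership:

f^*(x) \;=\; \arg\min_{f(x)} \; \mathbb{E}_{Y \mid x}\!\left[e^{-Y f(x)}\right] \;=\; \frac{1}{2}\,\log\frac{\Pr(Y = 1 \mid x)}{\Pr(Y = -1 \mid x)}

In particular, sign(f^*(x)) is the Bayes classifier, which is why the exponential loss is a sensible surrogate for 0-1 loss.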

General Stagewise Algorithm
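
For reference, the general forward stagewise additive modeling step (ESL, Algorithm 10.2): starting from f_0(x) = 0, for m = 1, ..., M solve

(\beta_m, \gamma_m) \;=\; \arg\min_{\beta,\,\gamma} \; \sum_{i=1}^{N} L\big(y_i,\; f_{m-1}(x_i) + \beta\, b(x_i; \gamma)\big), \qquad f_m(x) \;=\; f_{m-1}(x) + \beta_m\, b(x; \gamma_m).

AdaBoost is exactly this procedure with L(y, f) = exp(-y f) and classifiers G_m as the basis functions b.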

Boosting: avoid overfitting

Concluding Remarks

Recap: SVM

Recap: SVM

KKT conditions

SVM via Loss + Penalty

SVM = hinge loss + L2 regularization
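
Written out (ESL, Section 12.3.2), the SVM with f(x) = \beta_0 + x^T\beta solves

\min_{\beta_0,\,\beta} \; \sum_{i=1}^{N} \big[\,1 - y_i f(x_i)\,\big]_{+} \;+\; \frac{\lambda}{2}\,\lVert \beta \rVert^2,

where [z]_+ = \max(z, 0) is the hinge loss and \lambda is inversely related to the usual cost parameter C.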

Logistic Regression

SVM vs. Logistic Regression

Boosting via Loss + Penalty
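
The parallel view for boosting (cf. ESL, Section 16.2.1): over the dictionary of all weak learners h_k, stagewise boosting with small shrinkage steps approximately traces the solution path of an L1-penalized fit,

\min_{\{\beta_k\}} \; \sum_{i=1}^{N} L\Big(y_i,\; \sum_k \beta_k\, h_k(x_i)\Big) \;+\; \lambda \sum_k \lvert \beta_k \rvert,

so boosting is roughly loss + L1 penalty, while the SVM is loss + L2 penalty.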

Summary

Acknowledgement

• Dr. Trevor Hastie’s slides for Chapter 10 of “Elements of Statistical Learning”

http://www-stat.stanford.edu/~hastie/TALKS/boost.pdf

http://www-stat.stanford.edu/~hastie/Papers/svmtalk.pdf

• “SVM tutorial” by Dr. C. Burges