Steep learning curves Reading: DH&S, Ch 4.6, 4.5.


Steep learning curves

Reading: DH&S, Ch 4.6, 4.5

Administrivia

•HW1 due now

•Late days are ticking...

•No other news today...

Viewing and re-viewing

•Last time:

•HW1 FAQ

•5 minutes of math: function optimization

•Measuring performance

•Cross-validation

•Today:

•Learning curves

•Metrics

•The nearest-neighbor rule

Exercise

•Given the function:

•Find the extremum

•Show that the extremum is really a minimum
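For concreteness, here is a worked example with an assumed function (the exercise's actual function is not reproduced in this transcript), showing the two steps: set the first derivative to zero, then check the sign of the second derivative.

    % Illustrative function only; not the example from the slide.
    \[
    f(x) = x^2 - 4x + 7, \qquad f'(x) = 2x - 4 = 0 \;\Longrightarrow\; x^{*} = 2.
    \]
    \[
    f''(x) = 2 > 0 \text{ for all } x, \text{ so } x^{*} = 2 \text{ is a minimum, with } f(2) = 3.
    \]

In more than one dimension, the analogous check is that the Hessian at the critical point is positive definite; if its eigenvalues have mixed signs, the point is a saddle instead (which is where the next slide comes in).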

Mea culpa!

•I copied the wrong example out of the book.

•Oops. My bad.

•You guys did a great job figuring it out, though...

The saddle point

Cross-validation in words

•Shuffle data vectors

•Break into k chunks

•Train on first k-1 chunks

•Test on last 1

•Repeat, with a different chunk held-out

•Average all test accuracies together
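A minimal sketch of this recipe in Python with NumPy; train_fn and test_fn are hypothetical placeholders for whatever classifier training and accuracy-scoring routines you are using, and X, y are NumPy arrays.

    import numpy as np

    def k_fold_cv(X, y, train_fn, test_fn, k=10, seed=0):
        """Shuffle, split into k chunks, hold each chunk out once, average test accuracy."""
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(y))          # shuffle data vectors
        chunks = np.array_split(idx, k)        # break into k chunks
        accs = []
        for i in range(k):                     # each chunk takes a turn as the test set
            test_idx = chunks[i]
            train_idx = np.concatenate([chunks[j] for j in range(k) if j != i])
            model = train_fn(X[train_idx], y[train_idx])
            accs.append(test_fn(model, X[test_idx], y[test_idx]))
        return float(np.mean(accs))            # average all test accuracies together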

CV in pix

[Figure: the original data [X; y] is randomly shuffled into [X'; y'], then partitioned k ways into [X1'; y1'], [X2'; y2'], ..., [Xk'; yk'], giving k train/test sets and k test accuracies (e.g., 53.7%, 85.1%, 73.2%).]

But is it really learning?

•Now we know how well our models are performing

•But are they really learning?

•Maybe any classifier would do as well

•E.g., a default classifier (pick the most likely class) or a random classifier

•How can we tell if the model is learning anything?
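One concrete check is to score the same test data with the two baselines just mentioned and see whether your model beats both; a sketch, assuming class labels stored in NumPy arrays.

    import numpy as np

    def default_accuracy(y_train, y_test):
        """'Default' classifier: always predict the most common training class."""
        classes, counts = np.unique(y_train, return_counts=True)
        majority = classes[np.argmax(counts)]
        return float(np.mean(y_test == majority))

    def random_accuracy(y_train, y_test, seed=0):
        """Random classifier: guess uniformly among the classes seen in training."""
        rng = np.random.default_rng(seed)
        guesses = rng.choice(np.unique(y_train), size=len(y_test))
        return float(np.mean(guesses == y_test))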

The learning curve

•Train on successively larger fractions of data

•Watch how accuracy (performance) changes

[Figure: learning curves for three cases: learning (accuracy improves with more data), a static classifier (no learning), and anti-learning (forgetting).]
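A sketch of generating one learning curve, with the same hypothetical train_fn/test_fn placeholders: train on successively larger prefixes of an already-shuffled training set and record accuracy on a fixed test set.

    def learning_curve(X_train, y_train, X_test, y_test, train_fn, test_fn,
                       fractions=(0.1, 0.25, 0.5, 0.75, 1.0)):
        """Test accuracy as a function of training-set size (assumes X_train is pre-shuffled)."""
        points = []
        for frac in fractions:
            n = max(1, int(frac * len(y_train)))      # successively larger fractions
            model = train_fn(X_train[:n], y_train[:n])
            points.append((n, test_fn(model, X_test, y_test)))
        return points   # rising = learning; flat = static classifier; falling = anti-learning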

Measuring variance

•Cross-validation helps you get a better estimate of accuracy for small data sets

•Randomization (shuffling the data) helps guard against poor splits/ordering of the data

•Learning curves help assess learning rate/asymptotic accuracy

•Still one big missing component: variance

•Definition: Variance of a classifier is the fraction of error due to the specific data set it’s trained on

Measuring variance

•Variance tells you how much you expect your classifier/performance to change when you train it on a new (but similar) data set

•E.g., take 5 samplings of a data source; train/test 5 classifiers

•Accuracies: 74.2, 90.3, 58.1, 80.6, 90.3

•Mean accuracy: 78.7%

•Std dev of acc: 13.4%

•Variance is usually a function of both classifier and data source

•High variance classifiers are very susceptible to small changes in data
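The numbers on this slide can be reproduced directly; note that the 13.4% figure is the sample standard deviation (n − 1 in the denominator).

    import numpy as np

    accs = np.array([74.2, 90.3, 58.1, 80.6, 90.3])  # accuracies from 5 resampled runs
    print(accs.mean())                               # ≈ 78.7
    print(accs.std(ddof=1))                          # ≈ 13.4 (sample std dev)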

Putting it all together

•Suppose you want to measure the expected accuracy of your classifier, assess learning rate, and measure variance all at the same time?

for (i = 0; i < 10; ++i) {                        // variance reps
    shuffle data
    do 10-way CV partition of data
    for each train/test partition {               // xval
        for (pct = 0.1; pct <= 0.9; pct += 0.1) { // LC
            subsample pct fraction of training set
            train on subsample, test on test set
        }
    }
    avg across all folds of CV partition
    generate learning curve for this partition
}
get mean and std across all curves
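A hedged Python rendering of the same nested loops (again with hypothetical train_fn/test_fn placeholders): 10 variance repetitions, a 10-way CV partition inside each, and a learning curve over 10%–90% training fractions inside each fold.

    import numpy as np

    def full_evaluation(X, y, train_fn, test_fn, reps=10, k=10, seed=0):
        """Mean and std of the learning curve across repeated shuffles and CV folds."""
        fractions = np.arange(0.1, 1.0, 0.1)
        rng = np.random.default_rng(seed)
        curves = []
        for _ in range(reps):                              # variance reps
            idx = rng.permutation(len(y))                  # shuffle data
            chunks = np.array_split(idx, k)                # 10-way CV partition
            fold_curves = []
            for i in range(k):                             # xval
                test_idx = chunks[i]
                train_idx = np.concatenate([chunks[j] for j in range(k) if j != i])
                accs = []
                for pct in fractions:                      # learning curve
                    n = max(1, int(pct * len(train_idx)))  # subsample pct of training set
                    model = train_fn(X[train_idx[:n]], y[train_idx[:n]])
                    accs.append(test_fn(model, X[test_idx], y[test_idx]))
                fold_curves.append(accs)
            curves.append(np.mean(fold_curves, axis=0))    # avg across folds of this partition
        curves = np.array(curves)
        return curves.mean(axis=0), curves.std(axis=0, ddof=1)  # mean and std across curves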

Putting it all together

[Figure: results of the above procedure on the “hepatitis” data.]

5 minutes of math...

•Decision trees are non-metric

•Don’t know anything about relations between instances, except sets induced by feature splits

•Often, we have well-defined distances between points

•Idea of distance encapsulated by a metric

5 minutes of math...

•Definition: a metric function d(·, ·) is a function that obeys the following properties:

1. Non-negativity: d(xa, xb) ≥ 0

2. Reflexivity: d(xa, xb) = 0 if and only if xa = xb

3. Symmetry: d(xa, xb) = d(xb, xa)

4. Triangle inequality: d(xa, xc) ≤ d(xa, xb) + d(xb, xc)
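These four properties can be spot-checked numerically for any candidate distance function; a small sketch (check_metric is a hypothetical helper, and the tolerance absorbs floating-point round-off). Passing such a check does not prove d is a metric, but a failure pinpoints which axiom breaks.

    import itertools
    import math

    def check_metric(d, points, tol=1e-9):
        """Spot-check the four metric axioms for distance function d on sample points."""
        for x, y, z in itertools.product(points, repeat=3):
            assert d(x, y) >= -tol                          # 1. non-negativity
            assert (d(x, y) <= tol) == (x == y)             # 2. reflexivity: zero iff identical
            assert abs(d(x, y) - d(y, x)) <= tol            # 3. symmetry
            assert d(x, z) <= d(x, y) + d(y, z) + tol       # 4. triangle inequality
        return True

    # e.g., Euclidean distance on a few sample points passes:
    check_metric(math.dist, [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)])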

5 minutes of math...

•Euclidean distance: dE(xa, xb) = sqrt( Σi (xa,i − xb,i)² )

[Figure: two points xa and xb, with dE(xa, xb) the straight-line distance between them.]

5 minutes of math...

•Manhattan (taxicab) distance: dM(xa, xb) = Σi |xa,i − xb,i|

•Distance travelled along a grid between two points

•No diagonals allowed

•Good for integer features

[Figure: two points xa and xb, with dM(xa, xb) measured along grid (axis-parallel) segments.]

5 minutes of math...

•What if some attribute is categorical?

•Typical answer is Hamming (sometimes 0/1) distance:

•For each attribute, add 1 if the instances differ in that attribute, else 0
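Sketches of the three distances just described, for instances represented as equal-length sequences of feature values:

    import math

    def euclidean(xa, xb):
        """Straight-line distance: square root of summed squared coordinate differences."""
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(xa, xb)))

    def manhattan(xa, xb):
        """Taxicab distance: sum of absolute coordinate differences (no diagonals)."""
        return sum(abs(a - b) for a, b in zip(xa, xb))

    def hamming(xa, xb):
        """Categorical (0/1) distance: number of attributes on which the instances differ."""
        return sum(1 for a, b in zip(xa, xb) if a != b)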

Distances in classification

•Nearest-neighbor rule: find the nearest instance to the query point in feature space, return the class of that instance

•Simplest possible distance-based classifier

•With more notation: given training instances (x1, y1), ..., (xn, yn) and query point x, predict y(x) = yj, where j = argmin_i d(x, xi)

•Distance here is “whatever’s appropriate to your data”
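A minimal version of the rule in Python, parameterized by whichever distance function suits your data:

    def nearest_neighbor(x_query, X_train, y_train, dist):
        """Nearest-neighbor rule: return the label of the closest training instance."""
        best = min(range(len(X_train)), key=lambda i: dist(x_query, X_train[i]))
        return y_train[best]

As the code suggests, there is no training step beyond storing (X_train, y_train), and every query scans all training instances, which previews the questions on the next slide.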

Properties of NN

•Training time of NN?

•Classification time?

•Geometry of model?

[Figure: distance d(·, ·) from a query point to training instances, with regions of feature space marked by which instance they are closer to.]

NN miscellany

•Slight generalization: k-Nearest neighbors (k-NN)

•Find k training instances closest to query point

•Vote among them for label

•Q: How does this affect the system?
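A hedged sketch of the k-NN variant: rank training instances by distance to the query, keep the k closest, and take a majority vote over their labels.

    from collections import Counter

    def knn_classify(x_query, X_train, y_train, dist, k=3):
        """k-nearest-neighbor rule: majority vote among the k closest training instances."""
        order = sorted(range(len(X_train)), key=lambda i: dist(x_query, X_train[i]))
        votes = Counter(y_train[i] for i in order[:k])
        return votes.most_common(1)[0][0]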