Lecture 2: Image Classification pipeline - Artificial Intelligence. vision.stanford.edu/teaching/cs231n/slides/2015/lecture2.pdf · Fei-Fei Li & Andrej Karpathy, 7 Jan 2015

Transcript
Page 1

Lecture 2: Image Classification pipeline

Page 2

Image Classification: a core task in Computer Vision

cat

(assume a given set of discrete labels: {dog, cat, truck, plane, ...})

Page 3

The problem: semantic gap

Images are represented as 3D arrays of numbers, with integer values in [0, 255], where the third dimension of size 3 holds the color channels (RGB).

Page 11

An image classifier

Unlike, e.g., sorting a list of numbers, there is no obvious way to hard-code an algorithm for recognizing a cat or any other class.
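The slides make this point with a code stub; a minimal sketch of the same idea (the function name classify_image is illustrative, not taken from the slide text):

```python
def classify_image(image):
    """Take an image (an array of pixel values) and return a class label."""
    # Sketch only: unlike sorting a list of numbers, there is no obvious
    # sequence of hard-coded operations to put here that turns raw pixels
    # into a label like "cat".
    raise NotImplementedError("no obvious hard-coded algorithm exists")
```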

Page 12

Data-driven approach:
1. Collect a dataset of images and label them
2. Use Machine Learning to train an image classifier
3. Evaluate the classifier on a withheld set of test images

Example training set

Page 13

First classifier: Nearest Neighbor Classifier

Remember all training images and their labels

Predict the label of the most similar training image

Page 14

Example dataset: CIFAR-10
- 10 labels
- 50,000 training images
- 10,000 test images

Page 15

Example dataset: CIFAR-10
- 10 labels
- 50,000 training images
- 10,000 test images

For every test image (first column), examples of nearest neighbors in rows.

Page 16

How do we compare the images? What is the distance metric?

L1 distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$, where $I_1$ and $I_2$ denote the two images and $p$ ranges over the pixels.
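As a quick numerical check, the L1 distance is just the sum of absolute pixel-wise differences; a minimal numpy sketch with made-up 2x2 "images":

```python
import numpy as np

# Two tiny made-up "images" (real CIFAR-10 images would be 32x32x3 arrays).
I1 = np.array([[56, 32],
               [10, 18]], dtype=np.int32)
I2 = np.array([[10, 20],
               [24, 17]], dtype=np.int32)

# L1 distance: sum over pixels p of |I1^p - I2^p|.
d1 = np.sum(np.abs(I1 - I2))
print(d1)  # 46 + 12 + 14 + 1 = 73
```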

Page 17

Nearest Neighbor classifier

Page 18

Nearest Neighbor classifier

remember the training data

Page 19

Nearest Neighbor classifier

for every test image:
- find the nearest training image with L1 distance
- predict the label of that nearest training image
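The slides implement this with a short numpy class; the following is a sketch along the same lines (variable names are assumptions, and images are assumed to be flattened into row vectors):

```python
import numpy as np

class NearestNeighbor:
    """Sketch of a Nearest Neighbor classifier in the spirit of the slides."""

    def train(self, X, y):
        """X: N x D array, one flattened training image per row. y: N labels."""
        # The Nearest Neighbor classifier simply remembers all the training data.
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """X: M x D array of flattened test images. Returns M predicted labels."""
        num_test = X.shape[0]
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)
        for i in range(num_test):
            # L1 distance from the i-th test image to every training image.
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            nearest = np.argmin(distances)   # index of the closest training image
            Ypred[i] = self.ytr[nearest]     # predict its label
        return Ypred
```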

Page 20

Nearest Neighbor classifier

Q: what is the complexity of the NN classifier w.r.t. a training set of N images and a test set of M images?
1. at training time?
2. at test time?

Page 21

Nearest Neighbor classifier

Q: what is the complexity of the NN classifier w.r.t. a training set of N images and a test set of M images?
1. at training time? O(1)
2. at test time? O(NM)

Page 22

Nearest Neighbor classifier

1. at training time? O(1)
2. at test time? O(NM)

This is backwards:
- test-time performance is usually much more important.
- CNNs flip this: expensive training, cheap test-time evaluation.

Page 23

Aside: Approximate Nearest Neighbor methods can find approximate nearest neighbors quickly.

Page 24

The choice of distance is a hyperparameter.

L1 (Manhattan) distance: $d_1(I_1, I_2) = \sum_p |I_1^p - I_2^p|$
L2 (Euclidean) distance: $d_2(I_1, I_2) = \sqrt{\sum_p (I_1^p - I_2^p)^2}$

- These are the two most commonly used special cases of the p-norm.
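As a sketch of the difference (made-up values, not from the slide): both distances aggregate per-pixel differences, L1 via absolute values and L2 via squares followed by a square root.

```python
import numpy as np

I1 = np.array([56, 32, 10, 18], dtype=np.float64)  # flattened made-up image
I2 = np.array([10, 20, 24, 17], dtype=np.float64)

d1 = np.sum(np.abs(I1 - I2))          # L1 (Manhattan) distance: 73.0
d2 = np.sqrt(np.sum((I1 - I2) ** 2))  # L2 (Euclidean) distance: ~49.6
```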

Page 25

k-Nearest Neighbor: find the k nearest images and have them vote on the label.

(Figure: panels showing the data, the NN classifier decision regions, and the 5-NN classifier decision regions.)

http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
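A minimal sketch of the k-NN prediction step, extending the NearestNeighbor sketch above (the knn_predict name is illustrative; the majority vote via np.bincount assumes non-negative integer labels):

```python
import numpy as np

def knn_predict(Xtr, ytr, Xte, k=5):
    """Sketch: predict a label for each row of Xte by majority vote among the
    k nearest training rows under L1 distance (labels: non-negative ints)."""
    preds = np.zeros(Xte.shape[0], dtype=ytr.dtype)
    for i in range(Xte.shape[0]):
        distances = np.sum(np.abs(Xtr - Xte[i, :]), axis=1)
        nearest_k = np.argsort(distances)[:k]  # indices of the k closest images
        votes = np.bincount(ytr[nearest_k])    # count each label among the k
        preds[i] = np.argmax(votes)            # most common label wins
    return preds
```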

Page 26

What is the best distance to use? What is the best value of k to use? i.e. how do we set the hyperparameters?

Page 27

What is the best distance to use? What is the best value of k to use? i.e. how do we set the hyperparameters? Very problem-dependent. Must try them all out and see what works best.

Page 28

Trying out which hyperparameters work best on the test set: a very bad idea. The test set is a proxy for generalization performance.

Page 29

Validation data: used to tune the hyperparameters; evaluate on the test set ONCE, at the end.
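As a sketch of that workflow (toy random data standing in for CIFAR-10, reusing the illustrative knn_predict function from above): hold part of the training data out as a validation set, sweep k there, and touch the test set exactly once.

```python
import numpy as np

# Toy stand-in data; in practice these would be flattened CIFAR-10 images/labels.
rng = np.random.default_rng(0)
Xtr, ytr = rng.integers(0, 256, (500, 3072)), rng.integers(0, 10, 500)
Xte, yte = rng.integers(0, 256, (100, 3072)), rng.integers(0, 10, 100)

# Hold out the first 100 training examples as a validation set.
Xval, yval = Xtr[:100], ytr[:100]
Xsub, ysub = Xtr[100:], ytr[100:]

# Tune the hyperparameter k on the validation set only.
results = {k: np.mean(knn_predict(Xsub, ysub, Xval, k=k) == yval)
           for k in [1, 3, 5, 7, 10, 20]}
best_k = max(results, key=results.get)

# Evaluate on the test set ONCE, with the chosen hyperparameter.
test_acc = np.mean(knn_predict(Xtr, ytr, Xte, k=best_k) == yte)
```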

Page 30

Cross-validation: cycle through the choice of which fold is the validation fold, and average the results.

Page 31

Example of 5-fold cross-validation for the value of k. Each point is a single outcome; the line goes through the mean and the bars indicate the standard deviation. (It seems that k = 7 works best for this data.)
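A sketch of how such a 5-fold sweep over k could be run (again reusing the illustrative knn_predict; the np.array_split fold handling is an assumption, not code from the slide):

```python
import numpy as np

def cross_validate_k(Xtr, ytr, k_choices, num_folds=5):
    """Sketch: return {k: list of per-fold validation accuracies} from k-fold CV."""
    X_folds = np.array_split(Xtr, num_folds)
    y_folds = np.array_split(ytr, num_folds)
    accuracies = {k: [] for k in k_choices}
    for fold in range(num_folds):
        # The current fold is the validation set; the remaining folds are training data.
        Xval, yval = X_folds[fold], y_folds[fold]
        Xtrain = np.concatenate(X_folds[:fold] + X_folds[fold + 1:])
        ytrain = np.concatenate(y_folds[:fold] + y_folds[fold + 1:])
        for k in k_choices:
            acc = np.mean(knn_predict(Xtrain, ytrain, Xval, k=k) == yval)
            accuracies[k].append(acc)
    return accuracies  # plot mean and standard deviation per k to pick the best value
```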

Page 32

Summary

-  Image Classification: we are given a Training Set of labeled images and asked to predict labels on a Test Set. It is common to report the Accuracy of predictions (the fraction of correctly predicted images).

-  We introduced the k-Nearest Neighbor Classifier, which predicts labels based on the nearest images in the training set.

-  We saw that the choice of distance and the value of k are hyperparameters that are tuned using a validation set, or through cross-validation if the size of the data is small.

-  Once the best set of hyperparameters is chosen, the classifier is evaluated once on the test set.