
Fei-Fei Li & Andrej Karpathy, Lecture 2, 7 Jan 2015

Lecture 2: Image Classification pipeline


Image Classification: a core task in Computer Vision

(Example: an input image of a cat should be assigned the label "cat".)

(assume a given set of discrete labels, e.g. {dog, cat, truck, plane, ...})


The problem: semantic gap

Images are represented as arrays of numbers: each pixel is a vector in R^d with integer values in [0, 255], e.g. d = 3, where the three dimensions are the color channels (RGB).
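As a concrete sketch of this representation, the snippet below builds a hypothetical 32x32 RGB image as a numpy array (the size and random contents are illustrative, not from the lecture):

```python
import numpy as np

# A hypothetical 32x32 RGB image: a 3-D array of integers in [0, 255].
img = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

print(img.shape)   # (32, 32, 3): height, width, color channels
print(img.dtype)   # uint8
print(img[0, 0])   # one pixel: three values, one per RGB channel
```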


An image classifier

Unlike, e.g., sorting a list of numbers, there is no obvious way to hard-code an algorithm for recognizing a cat or any other class.


Data-driven approach:
1. Collect a dataset of images and label them
2. Use machine learning to train an image classifier
3. Evaluate the classifier on a withheld set of test images

Example training set


First classifier: Nearest Neighbor Classifier

Remember all training images and their labels

Predict the label of the most similar training image


Example dataset: CIFAR-10. 10 labels, 50,000 training images, 10,000 test images.


For every test image (first column), examples of its nearest neighbors are shown in the corresponding row.


How do we compare the images? What is the distance metric?

L1 distance: d_1(I_1, I_2) = sum_p |I_1^p - I_2^p|, where I_1 denotes image 1 and p indexes each pixel.


Nearest Neighbor classifier

Training: remember all of the training data.

Prediction, for every test image:
- find the nearest training image under L1 distance
- predict the label of that nearest training image


Nearest Neighbor classifier

Q: what is the complexity of the NN classifier w.r.t. a training set of N images and a test set of M images?
1. at training time? O(1)
2. at test time? O(NM)

This is backwards:
- test-time performance is usually much more important in practice
- CNNs flip this: expensive training, cheap test-time evaluation


Aside: Approximate Nearest Neighbor methods find approximate nearest neighbors quickly, trading a small amount of accuracy for large speedups.


The choice of distance is a hyperparameter

L1 (Manhattan) distance: d_1(I_1, I_2) = sum_p |I_1^p - I_2^p|
L2 (Euclidean) distance: d_2(I_1, I_2) = sqrt( sum_p (I_1^p - I_2^p)^2 )

These are the two most commonly used special cases of the p-norm.
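The two distances can be computed in a couple of lines of numpy; the vectors below stand in for two flattened images and are purely illustrative:

```python
import numpy as np

a = np.array([10.0, 20.0, 30.0])
b = np.array([12.0, 24.0, 33.0])

# L1 (Manhattan): sum of absolute per-pixel differences.
d1 = np.sum(np.abs(a - b))            # |-2| + |-4| + |-3| = 9.0

# L2 (Euclidean): square root of the sum of squared differences.
d2 = np.sqrt(np.sum((a - b) ** 2))    # sqrt(4 + 16 + 9) = sqrt(29)
```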


k-Nearest Neighbor: find the k nearest images, and have them vote on the label.

(Figure panels: the data; NN classifier; 5-NN classifier)

http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
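The voting step can be sketched as follows; the function name `knn_predict` and the use of L1 distance are illustrative choices, not fixed by the lecture:

```python
import numpy as np
from collections import Counter

def knn_predict(Xtr, ytr, x, k=5):
    """Label x by majority vote among its k nearest training points (L1)."""
    distances = np.sum(np.abs(Xtr - x), axis=1)
    nearest = np.argsort(distances)[:k]      # indices of the k closest points
    votes = Counter(ytr[nearest].tolist())   # count label occurrences
    return votes.most_common(1)[0][0]        # most frequent label wins
```

With k = 1 this reduces to the plain Nearest Neighbor classifier; larger k smooths the decision boundary, as in the 5-NN panel above.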


What is the best distance to use? What is the best value of k? In other words, how do we set the hyperparameters? This is very problem-dependent: you must try them out and see what works best.


Trying out which hyperparameters work best on the test set is a very bad idea: the test set is a proxy for generalization performance, and tuning on it contaminates that estimate.


Validation data: used to tune hyperparameters; evaluate on the test set ONCE, at the end.


Cross-validation: cycle through the choice of which fold is the validation fold, and average the results.


Example of 5-fold cross-validation for the value of k. Each point is a single outcome; the line goes through the mean, and the bars indicate the standard deviation. (It seems that k = 7 works best for this data.)
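The procedure behind such a plot can be sketched as below; `cross_validate_k` is an illustrative helper (not from the lecture) that scores a k-NN classifier with L1 distance for each candidate k:

```python
import numpy as np

def cross_validate_k(X, y, k_choices, num_folds=5):
    """Estimate k-NN accuracy for each k via num_folds-fold cross-validation."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    accuracies = {}
    for k in k_choices:
        fold_accs = []
        for fold in range(num_folds):
            # Hold out one fold for validation, train on the rest.
            X_val, y_val = X_folds[fold], y_folds[fold]
            X_tr = np.concatenate(X_folds[:fold] + X_folds[fold + 1:])
            y_tr = np.concatenate(y_folds[:fold] + y_folds[fold + 1:])
            correct = 0
            for xv, yv in zip(X_val, y_val):
                dists = np.sum(np.abs(X_tr - xv), axis=1)
                nearest = np.argsort(dists)[:k]
                # Majority vote among the k nearest labels.
                labels, counts = np.unique(y_tr[nearest], return_counts=True)
                if labels[np.argmax(counts)] == yv:
                    correct += 1
            fold_accs.append(correct / len(y_val))
        accuracies[k] = float(np.mean(fold_accs))  # average over folds
    return accuracies
```

The returned dictionary maps each candidate k to its mean validation accuracy across folds; the best k is then used for a single final evaluation on the test set.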


Summary

-  Image Classification: we are given a training set of labeled images and asked to predict labels on a test set. It is common to report the accuracy of the predictions (the fraction of correctly predicted images).

-  We introduced the k-Nearest Neighbor classifier, which predicts labels based on the nearest images in the training set.

-  We saw that the choice of distance and the value of k are hyperparameters that are tuned using a validation set, or through cross-validation if the dataset is small.

-  Once the best set of hyperparameters is chosen, the classifier is evaluated once on the test set.