
Lecture 29: Optimization and Neural Nets

CS4670/5670: Computer Vision
Kavita Bala

Slides from Andrej Karpathy and Fei-Fei Li: http://vision.stanford.edu/teaching/cs231n/

Summary

Other loss functions

• Raw class scores are not very intuitive to interpret

• Softmax classifier
  – Score function is the same
  – Intuitive output: normalized class probabilities
  – Extension of logistic regression to multiple classes

Softmax classifier

Interpretation: the softmax function squashes the raw scores into values between 0 and 1 that sum to 1 (normalized class probabilities)
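A minimal numpy sketch of that squashing (the function, scores, and values below are illustrative, not taken from the slides):

```python
import numpy as np

def softmax(scores):
    """Map raw class scores to probabilities in (0, 1) that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max score for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

scores = np.array([3.2, 5.1, -1.7])     # example raw class scores
probs = softmax(scores)
print(probs)        # roughly [0.13, 0.87, 0.00]
print(probs.sum())  # 1.0
```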

Cross-entropy loss
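For a single training example i, this loss is usually written as follows (notation reconstructed in the CS231n convention, with f_j the score of class j and y_i the correct label):

```latex
L_i = -\log\!\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)
    = -f_{y_i} + \log\sum_j e^{f_j}
```

i.e. the negative log of the normalized probability assigned to the correct class.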

Aside: Loss function interpretation

• Probability
  – Maximum Likelihood Estimation (MLE)
  – Regularization is Maximum a Posteriori (MAP) estimation
• Cross-entropy H
  – p is the true distribution (1 for the correct class), q is the estimated distribution
  – The softmax classifier minimizes the cross-entropy
  – Equivalently, it minimizes the KL (Kullback-Leibler) divergence between the distributions, a measure of the distance between p and q (see the decomposition below)
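The equivalence claimed above follows from the standard decomposition of cross-entropy (reconstructed here, not copied from the slides):

```latex
H(p, q) = -\sum_x p(x)\,\log q(x) = H(p) + D_{\mathrm{KL}}(p \,\|\, q)
```

When p is the one-hot true distribution, H(p) = 0, so minimizing the cross-entropy H(p, q) is exactly minimizing the KL divergence between p and q.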

SVM vs. Softmax

Summary

• Have a score function and a loss function
  – Will generalize the score function
• Find W and b to minimize the loss
  – SVM vs. Softmax (see the loss sketch below)
    • Comparable in performance
    • SVM satisfies margins, softmax optimizes probabilities
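A rough side-by-side sketch of the two losses on one example (the scores, label, and margin delta = 1 are assumptions for illustration):

```python
import numpy as np

def svm_loss_one(scores, y, delta=1.0):
    """Multiclass SVM (hinge) loss for one example with margin delta."""
    margins = np.maximum(0, scores - scores[y] + delta)
    margins[y] = 0                       # the correct class contributes no margin term
    return np.sum(margins)

def softmax_loss_one(scores, y):
    """Cross-entropy loss for one example under a softmax classifier."""
    shifted = scores - np.max(scores)    # numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[y]

scores = np.array([3.2, 5.1, -1.7])      # class scores for one example
y = 0                                    # index of the correct class
print(svm_loss_one(scores, y))           # 2.9: only margin violations matter
print(softmax_loss_one(scores, y))       # ~2.04: negative log-probability of class 0
```

The SVM loss is zero once the margins are satisfied, while the softmax loss keeps pushing the probability of the correct class toward 1.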

Gradient Descent

Step size (learning rate):
– Too big: will miss the minimum
– Too small: slow convergence
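A minimal sketch of the vanilla update loop on a toy quadratic loss (the loss, names, and hyperparameters are assumptions for illustration, not from the slides):

```python
import numpy as np

def loss_fun(W):
    return 0.5 * np.sum(W ** 2)   # toy quadratic loss with its minimum at W = 0

def grad_fun(W):
    return W                      # analytic gradient of the toy loss

W = 10 * np.random.randn(10)      # initial parameters
step_size = 0.1                   # learning rate: too big overshoots, too small crawls

for it in range(100):
    W -= step_size * grad_fun(W)  # step opposite the gradient

print(loss_fun(W))                # should be close to 0 after enough steps
```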

Analytic Gradient

Gradient Descent

Mini-batch Gradient Descent

Stochastic Gradient Descent (SGD)
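A minimal sketch of mini-batch SGD for a softmax classifier on synthetic data (the data, batch size, and learning rate are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic 3-class problem standing in for a real training set.
n, dim, num_classes = 600, 4, 3
true_W = rng.normal(size=(dim, num_classes))
X = rng.normal(size=(n, dim))
y = np.argmax(X @ true_W, axis=1)                  # labels from a hidden linear scorer

W = 0.001 * rng.normal(size=(dim, num_classes))
step_size, batch_size = 0.5, 32

for it in range(1000):
    idx = rng.integers(0, n, size=batch_size)      # sample a random mini-batch
    Xb, yb = X[idx], y[idx]
    scores = Xb @ W
    scores -= scores.max(axis=1, keepdims=True)    # shift for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)      # softmax probabilities
    dscores = probs
    dscores[np.arange(batch_size), yb] -= 1        # gradient of cross-entropy w.r.t. scores
    W -= step_size * (Xb.T @ dscores / batch_size) # update from this mini-batch only

train_acc = np.mean(np.argmax(X @ W, axis=1) == y)
print(train_acc)   # should be high: the toy labels are linearly separable by construction
```

Using batch_size = n gives batch gradient descent; batch_size = 1 is the classic SGD extreme; mini-batches trade gradient noise for speed.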

Summary

Where are we?

• Classifiers: SVM vs. Softmax
• Gradient descent to optimize loss functions
  – Batch gradient descent, stochastic gradient descent
  – Momentum (see the update sketch below)
  – Numerical gradients (slow, approximate) vs. analytic gradients (fast, error-prone)
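Since momentum only appears in this recap, here is a minimal sketch of the classic momentum update (variable names and hyperparameters are assumptions, not from the slides):

```python
import numpy as np

def grad_fun(W):
    return W                     # gradient of the toy quadratic loss 0.5 * sum(W**2)

W = np.random.randn(10)
v = np.zeros_like(W)             # velocity, initialized to zero
mu, step_size = 0.9, 0.1         # momentum coefficient and learning rate

for it in range(200):
    v = mu * v - step_size * grad_fun(W)   # fold the gradient into a decaying velocity
    W += v                                 # move along the velocity, not the raw gradient

print(np.abs(W).max())           # should be close to 0 after convergence
```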

Derivatives

• Given f(x), where x is a vector of inputs
  – Compute the gradient of f at x: the vector of partial derivatives ∂f/∂x_i (see the sketch below)
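A sketch of a centered-difference numerical gradient checked against a hand-derived analytic gradient for a toy f (everything below is illustrative):

```python
import numpy as np

def f(x):
    return np.sum(x ** 2) + np.prod(x)      # toy scalar function of a vector input

def numerical_gradient(f, x, h=1e-5):
    """Centered difference (f(x + h e_i) - f(x - h e_i)) / (2h) for each dimension i."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        old = x[i]
        x[i] = old + h
        fxph = f(x)
        x[i] = old - h
        fxmh = f(x)
        x[i] = old                           # restore the original value
        grad[i] = (fxph - fxmh) / (2 * h)
    return grad

x = np.array([1.0, -2.0, 0.5])
analytic = 2 * x + np.prod(x) / x            # hand-derived gradient of f
print(np.allclose(numerical_gradient(f, x), analytic))   # True: slow, but a useful check
```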

Examples

Backprop gradients

• Can compose complex stages, yet still compute the gradient easily by backpropagating through each stage (see the sketch below)
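A minimal sketch of that idea on the tiny expression f(x, y, z) = (x + y) * z, propagating local gradients backward with the chain rule (the example values are assumed):

```python
# Forward pass: build the output stage by stage.
x, y, z = -2.0, 5.0, -4.0
q = x + y                    # intermediate stage
f = q * z                    # output: f(x, y, z) = (x + y) * z

# Backward pass: multiply each local gradient by the upstream gradient (chain rule).
df_df = 1.0                  # gradient of f with respect to itself
df_dq = z * df_df            # d(q * z) / dq = z
df_dz = q * df_df            # d(q * z) / dz = q
df_dx = 1.0 * df_dq          # d(x + y) / dx = 1
df_dy = 1.0 * df_dq          # d(x + y) / dy = 1

print(df_dx, df_dy, df_dz)   # -4.0 -4.0 3.0
```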