Neural Networks and Lecture 4: B Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 -...

download Neural Networks and Lecture 4: B Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019April

of 142

  • date post

    14-Sep-2019
  • Category

    Documents

  • view

    5
  • download

    0

Embed Size (px)

Transcript of Neural Networks and Lecture 4: B Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 -...

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 20191

    Lecture 4: Neural Networks and

    Backpropagation

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    Administrative: Assignment 1

    Assignment 1 due Wednesday April 17, 11:59pm

    If using Google Cloud, you don’t need GPUs for this assignment!

    We will distribute Google Cloud coupons by this weekend

    2

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    Administrative: Alternate Midterm Time

    If you need to request an alternate midterm time:

    See Piazza for form, fill it out by 4/25 (two weeks from today)

    3

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    Administrative: Project Proposal

    Project proposal due 4/24

    4

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    Administrative: Discussion Section

    Discussion section tomorrow (1:20pm in Gates B03):

    How to pick a project / How to read a paper

    5

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 20196

    How to find the best W?

    Linear score function

    SVM loss (or softmax)

    data loss + regularization

    Where we are...

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 20197

    Finding the best W: Optimize with Gradient Descent

    Landscape image is CC0 1.0 public domain Walking man image is CC0 1.0 public domain

    http://maxpixel.freegreatpicture.com/Mountains-Valleys-Landscape-Hills-Grass-Green-699369 https://creativecommons.org/publicdomain/zero/1.0/ http://www.publicdomainpictures.net/view-image.php?image=139314&picture=walking-man https://creativecommons.org/publicdomain/zero/1.0/

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 20198

    Numerical gradient: slow :(, approximate :(, easy to write :) Analytic gradient: fast :), exact :), error-prone :(

    In practice: Derive analytic gradient, check your implementation with numerical gradient

    Gradient descent

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    Problem: Linear Classifiers are not very powerful

    9

    Visual Viewpoint

    Linear classifiers learn one template per class

    Geometric Viewpoint

    Linear classifiers can only draw linear decision boundaries

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    One Solution: Feature Transformation

    10

    f(x, y) = (r(x, y), θ(x, y))

    Transform data with a cleverly chosen feature transform f, then apply linear classifier

    Color Histogram Histogram of Oriented Gradients (HoG)

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 3 - April 9, 2019

    Feature Extraction

    Image features vs ConvNets

    11

    f 10 numbers giving scores for classes

    training

    training

    10 numbers giving scores for classes

    Krizhevsky, Sutskever, and Hinton, “Imagenet classification with deep convolutional neural networks”, NIPS 2012. Figure copyright Krizhevsky, Sutskever, and Hinton, 2012. Reproduced with permission.

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201912

    Today: Neural Networks

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201913

    Neural networks: without the brain stuff

    (Before) Linear score function:

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201914

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

    (In practice we will usually add a learnable bias at each layer as well)

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201915

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

    (In practice we will usually add a learnable bias at each layer as well)

    “Neural Network” is a very broad term; these are more accurately called “fully-connected networks” or sometimes “multi-layer perceptrons” (MLP)

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201916

    Neural networks: without the brain stuff

    (Before) Linear score function:

    (Now) 2-layer Neural Network or 3-layer Neural Network

    (In practice we will usually add a learnable bias at each layer as well)

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201917

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

    x hW1 sW2

    3072 100 10

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201918

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

    x hW1 sW2

    3072 100 10

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    The function is called the activation function. Q: What if we try to build a neural network without one?

    19

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 2019

    The function is called the activation function. Q: What if we try to build a neural network without one?

    20

    (Before) Linear score function:

    (Now) 2-layer Neural Network

    Neural networks: without the brain stuff

    A: We end up with a linear classifier again!

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201921

    Sigmoid

    tanh

    ReLU

    Leaky ReLU

    Maxout

    ELU

    Activation functions

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201922

    Sigmoid

    tanh

    ReLU

    Leaky ReLU

    Maxout

    ELU

    Activation functions ReLU is a good default choice for most problems

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201923

    “Fully-connected” layers “2-layer Neural Net”, or “1-hidden-layer Neural Net”

    “3-layer Neural Net”, or “2-hidden-layer Neural Net”

    Neural networks: Architectures

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201924

    Example feed-forward computation of a neural network

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201925

    Full implementation of training a 2-layer Neural Network needs ~20 lines:

  • Lecture 4 - 13 Jan 2016Fei-Fei Li & Andrej Karpathy & Justin JohnsonFei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 4 - 13 Jan 201626

    Setting the number of layers and their sizes

    more neurons = more capacity

  • Lecture 4 - 13 Jan 2016Fei-Fei Li & Andrej Karpathy & Justin JohnsonFei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 4 - 13 Jan 201627

    (Web demo with ConvNetJS: http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html)

    Do not use size of neural network as a regularizer. Use stronger regularization instead:

    http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201928

    This image by Fotis Bobolas is licensed under CC-BY 2.0

    https://www.flickr.com/photos/fbobolas/3822222947 https://www.flickr.com/photos/fbobolas https://creativecommons.org/licenses/by/2.0/

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201929

    Impulses carried toward cell body

    Impulses carried away from cell body

    This image by Felipe Perucho is licensed under CC-BY 3.0

    dendrite

    cell body

    axon

    presynaptic terminal

    https://thenounproject.com/term/neuron/214105/ https://creativecommons.org/licenses/by/3.0/us/

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201930

    Impulses carried toward cell body

    Impulses carried away from cell body

    This image by Felipe Perucho is licensed under CC-BY 3.0

    dendrite

    cell body

    axon

    presynaptic terminal

    https://thenounproject.com/term/neuron/214105/ https://creativecommons.org/licenses/by/3.0/us/

  • Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 4 - April 11, 201931

    sigmoid activation function

    Impulses carried toward cell body

    Impulses carried away from cell body

    This image by Felipe Perucho is licensed under CC-BY 3.0

    dendrite

    cell body

    axon

    presynaptic