Transcript of Lecture 3: Linear Classification
Fei-Fei Li & Andrej Karpathy, 7 Jan 2015

  • Slide 1

    Lecture 3: Linear Classification

  • Slide 2

    Last time: Image Classification

    cat

    assume given set of discrete labels {dog, cat, truck, plane, ...}

  • Slide 3

    k-Nearest Neighbor (figure: training set images matched against test images)

  • Slide 4

    Linear Classification 1. define a score function

    class scores

  • Slide 5

    Linear Classification 1. define a score function

    f(x_i, W, b) = W x_i + b

    W: the “weights”; b: the “bias vector” (together, the “parameters”); x_i: the data (image); f: the class scores

  • Slide 6

    Linear Classification 1. define a score function (assume the CIFAR-10 example, so 32 x 32 x 3 images and 10 classes)

    f(x_i, W, b) = W x_i + b

    weights W; bias vector b; data (image) x_i: [3072 x 1]; class scores

  • Slide 7

    Linear Classification 1. define a score function (assume the CIFAR-10 example, so 32 x 32 x 3 images and 10 classes)

    f(x_i, W, b) = W x_i + b

    weights W: [10 x 3072]; bias vector b: [10 x 1]; data (image) x_i: [3072 x 1]; class scores: [10 x 1]
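    The shapes above can be checked with a short NumPy sketch; W and b here are random placeholders, not trained parameters:

    ```python
    import numpy as np

    # Sketch of the score function f(x_i, W, b) = W x_i + b with CIFAR-10 shapes.
    # W and b are random placeholders, not trained parameters.
    np.random.seed(0)
    x = np.random.rand(3072, 1)    # one flattened 32 x 32 x 3 image, [3072 x 1]
    W = np.random.randn(10, 3072)  # weights, [10 x 3072]
    b = np.random.randn(10, 1)     # bias vector, [10 x 1]

    scores = W.dot(x) + b          # class scores, [10 x 1]
    print(scores.shape)            # (10, 1)
    ```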

  • Slide 8

    Linear Classification

  • Slide 9

    Interpreting a Linear Classifier Question: what can a linear classifier do?

  • Slide 10

    Interpreting a Linear Classifier

    Example: training classifiers on CIFAR-10:

  • Slide 11

    Interpreting a Linear Classifier

  • Slide 12

    Bias trick
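    The bias trick can be sketched in a few lines: append a constant 1 to the image vector and fold b into W as an extra column, so W x + b collapses into a single matrix multiply:

    ```python
    import numpy as np

    # Sketch of the bias trick with placeholder (random) parameters.
    np.random.seed(0)
    x = np.random.rand(3072, 1)
    W = np.random.randn(10, 3072)
    b = np.random.randn(10, 1)

    x_ext = np.vstack([x, [[1.0]]])  # [3073 x 1], last entry is the constant 1
    W_ext = np.hstack([W, b])        # [10 x 3073], bias folded in as last column

    print(np.allclose(W.dot(x) + b, W_ext.dot(x_ext)))  # True: same class scores
    ```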

  • Slide 13

    So far: We defined a (linear) score function:

  • Slide 14

    2. Define a loss function (or cost function, or objective)

  • Slide 15

    2. Define a loss function (or cost function, or objective)

    - maps the scores and the label to a loss.

  • Slide 16

    2. Define a loss function (or cost function, or objective)

    - maps the scores and the label to a loss.

    Example:

  • Slide 17

    2. Define a loss function (or cost function, or objective)

    - maps the scores and the label to a loss.

    Example: Question: if you were to assign a single number to how “unhappy” you are with these scores, what would you do?

  • Slide 18

    2. Define a loss function (or cost function, or objective) One (of many ways) to do it: Multiclass SVM Loss

  • Slide 19

    2. Define a loss function (or cost function, or objective) One (of many ways) to do it: Multiclass SVM Loss

    (One possible generalization of Binary Support Vector Machine to multiple classes)

  • Slide 20

    2. Define a loss function (or cost function, or objective) One (of many ways) to do it: Multiclass SVM Loss

    L_i = sum over all incorrect labels j of max(0, s_j - s_{y_i} + 1)

    - L_i: the loss due to example i
    - the sum runs over all incorrect labels
    - s_{y_i} - s_j: the difference between the correct class score and an incorrect class score

  • Slide 21
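    The Multiclass SVM loss can be sketched in NumPy; the margin of 1 follows the slides, and the example scores [10, -2, 3] with correct class 0 are taken from the scores shown later in the lecture:

    ```python
    import numpy as np

    # Sketch of the Multiclass SVM loss for a single example, margin 1.
    def svm_loss_i(scores, y):
        # max(0, s_j - s_y + 1) for every class, then drop the correct class
        margins = np.maximum(0.0, scores - scores[y] + 1.0)
        margins[y] = 0.0               # the sum runs over incorrect labels only
        return margins.sum()

    # The correct class score (10) beats the others by more than the margin,
    # so every incorrect-class term is clamped to 0 and the loss is 0.
    print(svm_loss_i(np.array([10.0, -2.0, 3.0]), 0))  # 0.0
    ```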

  • Slide 22

    Example:

    e.g. 10

    loss = ?

  • Slide 23

    Example:

    e.g. 10

  • Slide 24

  • Slide 25

    There is a bug with the objective…

  • Slide 26

    L2 Regularization

    full loss: L = (1/N) sum_i L_i + lambda * sum over all entries of W of W^2

    lambda: the regularization strength
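    A minimal sketch of how the L2 penalty enters the objective; the data-loss value and the regularization strength below are made-up placeholders:

    ```python
    import numpy as np

    # Sketch: full objective = average data loss + lambda * sum of squared weights.
    # data_loss and reg_strength are illustrative placeholders.
    np.random.seed(0)
    W = 0.01 * np.random.randn(10, 3072)
    data_loss = 5.0          # pretend: mean of the per-example losses
    reg_strength = 1e-3      # lambda, typically set by cross-validation

    reg_loss = reg_strength * np.sum(W * W)  # L2 penalty over all entries of W
    total_loss = data_loss + reg_loss        # the penalty prefers small, diffuse W
    ```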

  • Slide 27

    L2 regularization: motivation

  • Slide 28

  • Slide 29

    Do we have to cross-validate both the margin Δ and the regularization strength λ?

  • Slide 30

  • Slide 31

    So far… 1. Score function

    2. Loss function

  • Slide 32

  • Slide 33

    Softmax Classifier: the score function is the same

    (an extension of Logistic Regression to multiple classes)

  • Slide 34

    Softmax Classifier: the score function is the same

  • Slide 35

    Softmax Classifier: the score function is the same

    L_i = -log( e^{s_{y_i}} / sum_j e^{s_j} )

    i.e. we’re minimizing the negative log likelihood; e^{s_{y_i}} / sum_j e^{s_j} is the softmax function.
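    A sketch of this loss in NumPy; subtracting the max score before exponentiating is a standard numerical-stability step, not something stated on the slide:

    ```python
    import numpy as np

    # Sketch of the softmax (cross-entropy) loss for one example.
    def softmax_loss_i(scores, y):
        shifted = scores - np.max(scores)  # stability: exp of large scores overflows
        probs = np.exp(shifted) / np.sum(np.exp(shifted))  # the softmax function
        return -np.log(probs[y])           # negative log likelihood of true class

    # Uniform scores give probability 1/3 per class, so the loss is log(3).
    print(round(softmax_loss_i(np.array([0.0, 0.0, 0.0]), 0), 4))  # 1.0986
    ```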

  • Slide 36

    Softmax Classifier: the score function is the same

    softmax function

  • Slide 37

    Softmax Classifier: the score function is the same

    i.e. we’re minimizing the negative log likelihood.

  • Slide 38

  • Slide 39

    Softmax vs. SVM

    - Interpreting the probabilities from the Softmax

  • Slide 40

    Softmax vs. SVM

    - Interpreting the probabilities from the Softmax

    suppose the weights W were only half as large

    (we use a higher regularization strength)

  • Slide 41

    Softmax vs. SVM

    - Interpreting the probabilities from the Softmax

    suppose the weights W were only half as large:

  • Slide 42

    Softmax vs. SVM

    - Interpreting the probabilities from the Softmax

    suppose the weights W were only half as large:

    What happens in the limit, as the regularization strength goes to infinity?
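    The effect of shrinking W can be sketched directly; the scores [10, 9, 9] are from the slide that follows, and halving the scores stands in for W being half as large:

    ```python
    import numpy as np

    # Sketch: smaller weights give more diffuse softmax probabilities.
    def softmax(scores):
        e = np.exp(scores - np.max(scores))
        return e / e.sum()

    scores = np.array([10.0, 9.0, 9.0])
    p_full = softmax(scores)       # peaked on the first class
    p_half = softmax(scores / 2)   # W half as large: probabilities more diffuse
    p_zero = softmax(scores * 0)   # infinite-regularization limit: uniform, 1/3 each

    print(p_full[0] > p_half[0] > p_zero[0])  # True
    ```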

  • Slide 43

    Softmax vs. SVM

    scores: [10, -2, 3] [10, 9, 9] [10,