Machine Learning Basics

I2DL: Prof. Niessner, Prof. Leal-Taixé

Machine Learning Task

Image Classification

• Challenges: pose, appearance, illumination, occlusions, background clutter, and the choice of representation.

Machine Learning

• How can we learn to perform image classification?
• Task: image classification
• Experience: data

Machine Learning

Unsupervised learning:
• No labels or target classes
• Find out properties of the structure of the data
• Clustering and dimensionality reduction (k-means, PCA, etc.)

Supervised learning:
• Labels or target classes

• Example of supervised learning: images labeled DOG or CAT.

Experience: Data

• The data is split into training data and test data.
• Underlying assumption: training and test data come from the same distribution.

Machine Learning

• How can we learn to perform image classification?
• Task: image classification
• Experience: data
• Performance measure: accuracy
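Accuracy, the performance measure above, is simply the fraction of predictions that match the labels. A minimal sketch (the arrays are made-up toy values, not from the lecture):

```python
import numpy as np

def accuracy(y_pred, y_true):
    """Fraction of predictions that match the ground-truth labels."""
    return np.mean(y_pred == y_true)

# 3 of 4 predictions are correct
acc = accuracy(np.array([1, 0, 1, 1]), np.array([1, 0, 0, 1]))
```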

Reinforcement learning:
• Agents interact with an environment and receive rewards.

A Simple Classifier


Nearest Neighbor

• Assign a query sample the label of its closest training sample under a chosen distance; in the example, the NN classifier predicts "dog".
• k-NN: take the majority label among the k nearest neighbors; in the example, the k-NN classifier predicts "cat".

[Figure: the data and the decision regions of the NN classifier vs. the 5-NN classifier. Source: https://commons.wikimedia.org/wiki/File:Data3classes.png]

• How does the NN classifier perform on training data?
• Which classifier is more likely to perform best on test data?
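The k-NN classifier described above can be sketched in a few lines of NumPy (a minimal illustration, not the lecture's code; the toy points and labels are made up):

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training points (L2 distance)."""
    dists = np.linalg.norm(X_train - x_query, axis=1)   # distance to every training sample
    nearest = np.argsort(dists)[:k]                     # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                    # majority label

# Toy 2D data: class 0 clustered near the origin, class 1 near (5, 5)
X_train = np.array([[0.0, 0.0], [0.5, 0.2], [0.1, 0.7],
                    [5.0, 5.0], [4.8, 5.1], [5.2, 4.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred = knn_predict(X_train, y_train, np.array([4.9, 5.0]), k=3)
```

With k=1 this reduces to the plain NN classifier.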

Nearest Neighbor

• Hyperparameters:
  – Number of neighbors: $k$
  – Distance metric: L1 distance $\|\mathbf{x} - \mathbf{c}\|_1$ or L2 distance $\|\mathbf{x} - \mathbf{c}\|_2$
• These parameters are problem dependent.
• How do we choose these hyperparameters?

Basic Recipe for Machine Learning

• Split your data into train (60%), validation (20%), and test (20%) sets.
• Use the validation set to find your hyperparameters.
• Other splits are also possible (e.g., 80%/10%/10%).

• The test set is only used once!
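The 60%/20%/20% split can be sketched as follows (a minimal illustration; shuffling before splitting is assumed so all three sets follow the same distribution):

```python
import numpy as np

def split_data(X, y, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle, then split into train/validation/test (the remainder goes to test)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))              # random order: splits share one distribution
    n_train = int(train_frac * len(X))
    n_val = int(val_frac * len(X))
    tr, va, te = np.split(idx, [n_train, n_train + n_val])
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

X = np.arange(20).reshape(10, 2)               # 10 toy samples with 2 features each
y = np.arange(10)
train, val, test = split_data(X, y)
```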

Cross Validation

• Split the training data into N folds; in each run, one fold serves as validation and the remaining folds as training (e.g., Run 1 through Run 5 for N = 5).
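Generating the fold rotation described above can be sketched like this (illustrative only; each fold is used as the validation set exactly once):

```python
import numpy as np

def kfold_indices(n_samples, n_folds=5):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set exactly once."""
    folds = np.array_split(np.arange(n_samples), n_folds)
    for i in range(n_folds):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx

splits = list(kfold_indices(10, n_folds=5))   # 5 runs over 10 samples
```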

Cross Validation

• Why do cross validation? Why not just train and test?
• Remember: the test set is only used once, so hyperparameters cannot be tuned on it.

Linear Decision Boundaries

• This lecture: linear decision boundaries.
• What are the pros and cons of using linear decision boundaries?

Linear Regression

Linear Regression

• Supervised learning
• Find a linear model that explains a target $\mathbf{y}$ given inputs $\mathbf{x}$.

Linear Regression

• Training: a learner takes data points $\{\mathbf{x}_{1:n}, \mathbf{y}_{1:n}\}$, i.e., inputs (e.g., images, measurements) and labels (e.g., cat/dog), and produces model parameters $\boldsymbol{\theta}$ (these can also be the parameters of a neural network).
• Testing: a predictor takes a new input $\mathbf{x}_{n+1}$ and the parameters $\boldsymbol{\theta}$ and produces the estimate $\hat{y}_{n+1}$.

Linear Prediction

• A linear model is expressed in the form
$$\hat{y}_i = \sum_{j=1}^{d} x_{ij}\,\theta_j$$
where $x_{ij}$ are the input data (features), $\theta_j$ the weights (i.e., model parameters), and $d$ the input dimension.

• A linear model with a bias $\theta_0$:
$$\hat{y}_i = \theta_0 + \sum_{j=1}^{d} x_{ij}\,\theta_j = \theta_0 + x_{i1}\theta_1 + x_{i2}\theta_2 + \dots + x_{id}\theta_d$$

Linear Prediction

• Example: predict the temperature of a building from the inputs $x_1$ = outside temperature, $x_2$ = number of people, $x_3$ = sun exposure, and $x_4$ = level of humidity, with weights $\theta_1, \dots, \theta_4$.

Linear Prediction

• Stacking the $n$ samples (one sample has $d$ features) and absorbing the bias via a leading column of ones:
$$\begin{pmatrix}\hat{y}_1\\ \hat{y}_2\\ \vdots\\ \hat{y}_n\end{pmatrix} = \begin{pmatrix}1 & x_{11} & \cdots & x_{1d}\\ 1 & x_{21} & \cdots & x_{2d}\\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_{n1} & \cdots & x_{nd}\end{pmatrix}\begin{pmatrix}\theta_0\\ \theta_1\\ \vdots\\ \theta_d\end{pmatrix}$$
• Compactly: $\hat{\mathbf{y}} = \mathbf{X}\boldsymbol{\theta}$, with input features $\mathbf{X}$, model parameters $\boldsymbol{\theta}$ ($d$ weights and 1 bias), and prediction $\hat{\mathbf{y}}$.
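The compact prediction $\hat{\mathbf{y}} = \mathbf{X}\boldsymbol{\theta}$ is a single matrix product once a column of ones is prepended for the bias. A minimal NumPy sketch with made-up numbers:

```python
import numpy as np

X_raw = np.array([[2.0, 3.0],
                  [1.0, 5.0]])                  # n = 2 samples, d = 2 features
theta = np.array([1.0, 0.5, 2.0])               # [bias theta_0, theta_1, theta_2]

# Prepend a column of ones so the bias is absorbed into the matrix product
X = np.hstack([np.ones((X_raw.shape[0], 1)), X_raw])
y_hat = X @ theta                               # predictions for all samples at once
```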

Linear Prediction

• Worked example (temperature of the building): plugging two rows of concrete feature values and a weight vector into $\hat{\mathbf{y}} = \mathbf{X}\boldsymbol{\theta}$ yields the two predicted temperatures.
• How do we obtain the model?

How to Obtain the Model?

• Data points $\mathbf{X}$ and model parameters $\boldsymbol{\theta}$ produce the estimation $\hat{y}$; a loss function compares it against the labels (ground truth) $y$, and optimization updates $\boldsymbol{\theta}$.


• Loss function: measures how good my estimation is (how good my model is) and tells the optimization method how to make it better.

• Optimization: changes the model in order to improve the loss function (i.e., to improve my estimation).


Linear Regression: Loss Function

• Prediction: temperature of the building.
• We minimize the loss, also called the objective function, cost function, or energy:
$$J(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$
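The loss $J(\boldsymbol{\theta})$ above is just the mean of the squared residuals; a minimal sketch with made-up numbers:

```python
import numpy as np

def mse_loss(y_hat, y):
    """J(theta) = (1/n) * sum_i (y_hat_i - y_i)^2"""
    return np.mean((y_hat - y) ** 2)

# residuals are 1 and 2, so the loss is (1^2 + 2^2) / 2 = 2.5
loss = mse_loss(np.array([2.0, 4.0]), np.array([1.0, 2.0]))
```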

Optimization: Linear Least Squares

• Linear least squares: an approach to fit a linear model to the data.
• It is a convex problem; there exists a closed-form solution that is unique.
$$\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

Optimization: Linear Least Squares

• The estimation comes from the linear model ($n$ training samples, each input vector of size $d$):
$$\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2 = \frac{1}{n}\sum_{i=1}^{n} (\mathbf{x}_i\boldsymbol{\theta} - y_i)^2$$
• In matrix notation, with the $n$ labels stacked into $\mathbf{y}$:
$$\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = (\mathbf{X}\boldsymbol{\theta} - \mathbf{y})^T(\mathbf{X}\boldsymbol{\theta} - \mathbf{y})$$

• More on matrix notation in the next exercise session.

• $J(\boldsymbol{\theta})$ is convex, so the optimum is where
$$\frac{\partial J(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = 0$$

Optimization

• True output: temperature of the building. Inputs: outside temperature, number of people, …
• Setting the derivative to zero:
$$\frac{\partial J(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = 2\mathbf{X}^T\mathbf{X}\boldsymbol{\theta} - 2\mathbf{X}^T\mathbf{y} = 0 \quad\Rightarrow\quad \boldsymbol{\theta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
• We have found an analytical solution to a convex problem. Details in the exercise session!
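The normal-equation solution $\boldsymbol{\theta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ can be sketched directly. A toy example with noise-free data, so the recovered parameters are exact; `np.linalg.solve` is used instead of an explicit inverse for numerical stability:

```python
import numpy as np

# Noise-free data generated by y = 1.0 + 2.0 * x, so least squares should recover [1, 2]
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

X = np.column_stack([np.ones_like(x), x])   # design matrix with a bias column
theta = np.linalg.solve(X.T @ X, X.T @ y)   # solve (X^T X) theta = X^T y
```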

Is This the Best Estimate?

• Least squares estimate:
$$J(\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

Maximum Likelihood

Maximum Likelihood Estimate

• True underlying distribution: $p_{\text{data}}(\mathbf{y}|\mathbf{X})$. Parametric family of distributions, controlled by parameter(s) $\boldsymbol{\theta}$: $p_{\text{model}}(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta})$.
• A method of estimating the parameters of a statistical model given observations from $p_{\text{data}}(\mathbf{y}|\mathbf{X})$, by finding the parameter values that maximize the likelihood of making the observations given the parameters:
$$\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}}\; p_{\text{model}}(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta})$$

• MLE assumes that the training samples are independent and generated by the same probability distribution (the "i.i.d." assumption):
$$p_{\text{model}}(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta}) = \prod_{i=1}^{n} p_{\text{model}}(y_i|\mathbf{x}_i, \boldsymbol{\theta})$$

Maximum Likelihood Estimate

$$\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}} \prod_{i=1}^{n} p_{\text{model}}(y_i|\mathbf{x}_i, \boldsymbol{\theta}) = \arg\max_{\boldsymbol{\theta}} \sum_{i=1}^{n} \log p_{\text{model}}(y_i|\mathbf{x}_i, \boldsymbol{\theta})$$

• This uses the logarithmic property $\log ab = \log a + \log b$; since the logarithm is monotonic, the maximizer is unchanged.

Back to Linear Regression

$$\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}} \sum_{i=1}^{n} \log p_{\text{model}}(y_i|\mathbf{x}_i, \boldsymbol{\theta})$$

• What shape does our probability distribution $p(y_i|\mathbf{x}_i, \boldsymbol{\theta})$ have?

Back to Linear Regression

• Assuming a Gaussian (normal) distribution whose mean is the linear model:
$$y_i \sim \mathcal{N}(\mathbf{x}_i\boldsymbol{\theta}, \sigma^2), \quad \text{i.e.,} \quad y_i = \mathbf{x}_i\boldsymbol{\theta} + \mathcal{N}(0, \sigma^2)$$
• Gaussian: if $y_i \sim \mathcal{N}(\mu, \sigma^2)$, then
$$p(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2}$$
• Substituting the mean $\mu = \mathbf{x}_i\boldsymbol{\theta}$:
$$p(y_i|\mathbf{x}_i, \boldsymbol{\theta}) = (2\pi\sigma^2)^{-1/2}\, e^{-\frac{1}{2\sigma^2}(y_i - \mathbf{x}_i\boldsymbol{\theta})^2}$$

Back to Linear Regression

• Plugging $p(y_i|\mathbf{x}_i, \boldsymbol{\theta}) = (2\pi\sigma^2)^{-1/2}\, e^{-\frac{1}{2\sigma^2}(y_i - \mathbf{x}_i\boldsymbol{\theta})^2}$ into the original optimization problem:
$$\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}} \sum_{i=1}^{n} \log p_{\text{model}}(y_i|\mathbf{x}_i, \boldsymbol{\theta})$$

Back to Linear Regression

• Canceling $\log$ and $e$:
$$\sum_{i=1}^{n} \log\left[(2\pi\sigma^2)^{-\frac{1}{2}}\, e^{-\frac{1}{2\sigma^2}(y_i - \mathbf{x}_i\boldsymbol{\theta})^2}\right] = \sum_{i=1}^{n} -\frac{1}{2}\log(2\pi\sigma^2) + \sum_{i=1}^{n} -\frac{1}{2\sigma^2}(y_i - \mathbf{x}_i\boldsymbol{\theta})^2$$
• In matrix notation:
$$-\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\theta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\theta})$$

Back to Linear Regression

• How can we find the estimate of theta? We maximize
$$-\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\theta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\theta})$$
The first term does not depend on $\boldsymbol{\theta}$, so this is equivalent to minimizing the squared term; setting $\frac{\partial J(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = 0$ gives
$$\boldsymbol{\theta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
• Details in the exercise session!

Linear Regression

• The Maximum Likelihood Estimate (MLE) corresponds to the Least Squares Estimate (given the assumptions).
• We introduced the concepts of loss function and optimization to obtain the best model for regression.

Image Classification

Regression vs. Classification

• Regression: predict a continuous output value (e.g., temperature of a room).
• Classification: predict a discrete value.
  – Binary classification: output is either 0 or 1.
  – Multi-class classification: output is one of N classes.

Logistic Regression

• Example task: a CAT classifier (binary classification).

Sigmoid for Binary Predictions

• Inputs $x_0, x_1, x_2$ are weighted by $\theta_0, \theta_1, \theta_2$, summed ($\Sigma$), and passed through the sigmoid
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
• The output lies between 0 and 1 and can be interpreted as a probability $p(y_i = 1|\mathbf{x}_i, \boldsymbol{\theta})$.

Spoiler Alert: 1-Layer Neural Network

• This weighted sum followed by a sigmoid is exactly a 1-layer neural network.
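The unit above (weighted sum plus sigmoid) can be sketched as follows; the weights are made-up values for illustration:

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def predict_proba(x, theta):
    """Weighted sum of the inputs (bias in theta[0]) followed by a sigmoid."""
    return sigmoid(theta[0] + x @ theta[1:])

theta = np.array([0.0, 1.0, -1.0])                 # [theta_0, theta_1, theta_2]
p = predict_proba(np.array([2.0, 2.0]), theta)     # weighted sum is 0 -> sigmoid(0) = 0.5
```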

Logistic Regression

• Probability of a binary output, where the prediction of our sigmoid is $\hat{y}_i = \sigma(\mathbf{x}_i\boldsymbol{\theta})$:
$$\hat{\mathbf{y}} = p(\mathbf{y} = 1|\mathbf{X}, \boldsymbol{\theta}) = \prod_{i=1}^{n} p(y_i = 1|\mathbf{x}_i, \boldsymbol{\theta})$$
• Model for coins, the Bernoulli trial:
$$p(z|\phi) = \phi^z (1 - \phi)^{1-z} = \begin{cases}\phi, & \text{if } z = 1\\ 1 - \phi, & \text{if } z = 0\end{cases}$$

Logistic Regression

• Applying the Bernoulli model with the sigmoid output in place of $\phi$:
$$\hat{\mathbf{y}} = \prod_{i=1}^{n} \hat{y}_i^{\,y_i} (1 - \hat{y}_i)^{(1-y_i)}$$
• True labels $y_i$: 0 or 1. Prediction of the sigmoid $\hat{y}_i$: continuous.

Logistic Regression: Loss Function

• Probability of a binary output:
$$p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta}) = \hat{\mathbf{y}} = \prod_{i=1}^{n} \hat{y}_i^{\,y_i} (1 - \hat{y}_i)^{(1-y_i)}$$
• Maximum Likelihood Estimate:
$$\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}} \log p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta})$$

Logistic Regression: Loss Function

• Taking the logarithm:
$$\log p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta}) = \sum_{i=1}^{n} \log\left[\hat{y}_i^{\,y_i} (1 - \hat{y}_i)^{(1-y_i)}\right] = \sum_{i=1}^{n} y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)$$

Logistic Regression: Loss Function

• Per-sample term, to be maximized (since $\boldsymbol{\theta}_{ML} = \arg\max_{\boldsymbol{\theta}} \log p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta})$):
$$\mathcal{L}(\hat{y}_i, y_i) = y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)$$
• If $y_i = 1$: $\mathcal{L}(\hat{y}_i, 1) = \log \hat{y}_i$. We want $\log \hat{y}_i$ large; since the logarithm is a monotonically increasing function, we also want $\hat{y}_i$ large (1 is the largest value our model's estimate can take!).
• If $y_i = 0$: $\mathcal{L}(\hat{y}_i, 0) = \log(1 - \hat{y}_i)$. We want $\log(1 - \hat{y}_i)$ large, so we want $\hat{y}_i$ to be small (0 is the smallest value our model's estimate can take!).

Logistic Regression: Loss Function

$$\mathcal{L}(\hat{y}_i, y_i) = y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)$$

• Negated and averaged, this is referred to as the binary cross-entropy (BCE) loss.
• Related to the multi-class loss you will see in this course (also called softmax loss).

Logistic Regression: Optimization

• Loss function, with $\hat{y}_i = \sigma(\mathbf{x}_i\boldsymbol{\theta})$:
$$\mathcal{L}(\hat{y}_i, y_i) = y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)$$
• Cost function, to be minimized:
$$C(\boldsymbol{\theta}) = -\frac{1}{n}\sum_{i=1}^{n} \mathcal{L}(\hat{y}_i, y_i) = -\frac{1}{n}\sum_{i=1}^{n} \left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$$
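The cost $C(\boldsymbol{\theta})$ can be computed directly from the predictions; a minimal sketch (the clipping epsilon is an implementation detail I add to avoid $\log(0)$, not part of the lecture's formula):

```python
import numpy as np

def bce_cost(y_hat, y, eps=1e-12):
    """C = -(1/n) * sum_i [ y_i log y_hat_i + (1 - y_i) log(1 - y_hat_i) ]"""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)   # guard against log(0)
    return -np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# Both samples are predicted with probability 0.9 for their true class
cost = bce_cost(np.array([0.9, 0.1]), np.array([1.0, 0.0]))
```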

Logistic Regression: Optimization

• No closed-form solution.
• Make use of an iterative method: gradient descent (more on this later!).
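Since there is no closed form, gradient descent iterates $\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha\,\nabla C(\boldsymbol{\theta})$; for the BCE cost with a sigmoid, the gradient works out to $\frac{1}{n}\mathbf{X}^T(\sigma(\mathbf{X}\boldsymbol{\theta}) - \mathbf{y})$. A minimal sketch on made-up, linearly separable toy data (learning rate and step count are arbitrary choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Minimize the BCE cost with plain gradient descent."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)   # gradient of the BCE cost
        theta -= lr * grad                               # descend along the gradient
    return theta

# Toy 1D data with a bias column: class 0 for x < 2, class 1 for x > 2
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta = fit_logistic(X, y)
preds = (sigmoid(X @ theta) > 0.5).astype(float)
```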

Why Is Machine Learning So Cool?

• We can learn from experience → intelligence, a certain ability to infer the future!
• Even linear models are often pretty good for complex phenomena, e.g., weather: a linear combination of time of day, day of year, etc. is often pretty good.

Many Examples of Logistic Regression

• Coronavirus models behave like logistic regressions:
  – Exponential spread at the beginning.
  – Plateaus when a certain portion of the population is infected/immune.
• Think about good features: total population, population density, implementation of measures, reasonable government(?), etc. (many more, of course).
• Source: https://www.worldometers.info/coronavirus

The Model Matters

• Each case requires a different model: linear vs. logistic.
• Many models:
  – The number of coronavirus infections cannot exceed the total population.
  – Munich housing prices seem exponential, though: there is no hard upper bound, so prices can always grow!

Next Lectures

• Next exercise session: Math Recap II.
• Next lecture (Lecture 3): jumping towards our first neural networks and computational graphs.

References for Further Reading

• Cross validation:
  – https://medium.com/@zstern/k-fold-cross-validation-explained-5aeba90ebb3
  – https://towardsdatascience.com/train-test-split-and-cross-validation-in-python-80b61beca4b6
• General machine learning book: Pattern Recognition and Machine Learning. C. Bishop.

See you next week!