An (very brief) introduction - ESO€¦ · Prof. Dr. Laura Leal-Taixé An (very brief)...

Prof. Dr. Laura Leal-Taixé

www.dvl.in.tum.de

An (very brief) introduction to Deep Learning

for Computer

Vision

Computer Vision

Give eyes to a computer

Computer Vision

Understand every pixel of an image

Computer Vision

Semantic segmentation

personcar

Computer Vision

person 1

person 2

person 3

Instance-based

segmentation

Computer Vision

Understand every pixel of a video

Instance-based

segmentation

Multiple object

tracking

Dynamic Scene Understanding

Understand every pixel of a video

Instance-based

segmentation

Multiple object

tracking

Artificial Intelligence

Data, examples

Computational models

Perform a given task on new unseen data

LEARNING, TRAINING

Artificial Intelligence

Data, examples

Computational models

Perform a given task on new unseen data

Expert eye

Self-driving cars

AI nowadays

What is Deep Learning, really?

Each node is a small classifier

What is Deep Learning, really?

Each node is a small classifier

Each classifier makes tiny decision

Has two eyes

Has a tail

Has paws

Has two ears These are learned

from data!

Convolutions on Images

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

5 ⋅ 3 + −1 ⋅ 3 + −1 ⋅ 2 + −1 ⋅ 0 + −1 ⋅ 4 =15 − 9 = 6

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

5 ⋅ 2 + −1 ⋅ 2 + −1 ⋅ 1 + −1 ⋅ 3 + −1 ⋅ 3 =10 − 9 = 1

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

5 ⋅ 1 + −1 ⋅ (−5) + −1 ⋅ (−3) + −1 ⋅ 3 + −1 ⋅ 2 =5 + 3 = 1

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7

5 ⋅ 0 + −1 ⋅ 3 + −1 ⋅ 0 + −1 ⋅ 1 + −1 ⋅ 3 =0 − 7 = −7

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7 9

5 ⋅ 3 + −1 ⋅ 2 + −1 ⋅ 3 + −1 ⋅ 1 + −1 ⋅ 0 =15 − 6 = 9

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7 9 2

5 ⋅ 3 + −1 ⋅ 1 + −1 ⋅ 5 + −1 ⋅ 4 + −1 ⋅ 3 =15 − 13 = 2

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7 9 2-5

5 ⋅ 0 + −1 ⋅ 0 + −1 ⋅ 1 + −1 ⋅ 6+ −1 ⋅ (−2) = −5

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7 9 2-5 -9

5 ⋅ 1 + −1 ⋅ 3 + −1 ⋅ 4 + −1 ⋅ 7 + −1 ⋅ 0 =5 − 14 = −9

-5 3 2 -5 34 3 2 1 -31 0 3 3 5-2 0 1 4 45 6 7 9 -1

0 -1 0-1 5 -10 -1 0

6 1 8-7 9 2-5 -9 3

5 ⋅ 4 + −1 ⋅ 3 + −1 ⋅ 4 + −1 ⋅ 9 + −1 ⋅ 1 =20 − 17 = 3

Image filters

• Each kernel gives us a different image filter

Prof. Leal-Taixé and Prof. Niessner24

Edge detection

−1 −1 −1−1 8 −1−1 −1 −1

Sharpen

0 −1 0−1 5 −10 −1 0

Box mean

191 1 11 1 11 1 1

Gaussian blur

1 2 12 4 21 2 1

LET’S L

EARN T

HESE FIL

Convolutions on RGB Images

32×32×3 image (pixels %)

5×5×3 filter (weights &)

activation map(also feature map)

Convolve

Convolution Layer

32×32×3 image

5×5×3 filter

activation maps

Convolve

Let’s apply a different filterwith different weights!

Convolution Layer

32×32×3 image

activation maps

Convolve

Let’s apply **five** filters,each with different weights!

Convolution “Layer”

Visualizing a CNN

Low-level

features

Mid-level

features

Object

Big Data

Models know where

to learn from

Hardware

Models are

trainable

Models are

complex

What made Deep Learning possible?

Big data

ImageNet: Goal 10.000 images per 100.000 words

Deep Learning: what is it good at?

Input A Response B

English sentence

Picture

Audio clip

French sentence

Photo tagging

Transcript

Machine translation

Face recognition

Speech recognition

Supervised learning

How to obtain the model?

Data points

Model parameters

Labels (ground truth)

Estimation

Loss function

Optimization

Deep Learning for Image Classification

Classification

Localization

Object

detection

Instance

segmentation

Instance segmentation with Mask-RCNN

K. He, G. Gkioxari, P. Dollar, R. Girshick. Mask R-CNN. ICCV 2017.

Instance segmentation with Mask-RCNN

K. He, G. Gkioxari, P. Dollar, R. Girshick. Mask R-CNN. ICCV 2017.

Manipulating images

Jun-Yan Zhu*, Taesung Park*, Phillip Isola, and Alexei A. Efros. "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", ICCV, 2017.

Manipulating images

Jun-Yan Zhu*, Taesung Park*, Phillip Isola, and Alexei A. Efros. "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", ICCV, 2017.

The world is dynamic!

• Deep Learning pushed single-image analysis to a point where results are usable in real-world scenarios

• The world is not static

What we still need in ML: good memory models

Multiple object tracking

Goal: detect and track all objects in a

Video object segmentation

TecoGANLowRes

M. Chu, Y. Xie, L. Leal-Taixé and N. Thuerey. „Temporally Coherent GANs for Video Super-Resolution (TecoGAN)“

Video super resolution

TecoGANLowRes

Video super resolution

TecoGAN results

MapObtain the camera

pose with respect to

the map

Visual localization

Visual Localization: applications

• Robot navigation

• Augmented reality

[Middelberg, Sattler, Untzelmann, Kobbelt, Scalable 6-DOF Localization on Mobile Devices. ECCV 2014]

The challenge: Big data

• We need data, lots of data!

• We might not have the luxury of data for some applications such as medical diagnosis

• Data is biased

Big data

Data bias• Increase diversity in

the data

• Increase diversity in the AI community which is building the algorithms

Data bias

The generalization problem

• Neural networks are GREAT at finding patterns in data they have seen, but not so great at generalizing to new scenarios

True intelligence is still far away

Prof. Dr. Laura Leal-Taixé

www.dvl.in.tum.de

Thank you

https://dvl.in.tum.deIf you have images, contact us!

An (very brief) introduction - ESO€¦ · Prof. Dr. Laura Leal-Taixé An (very brief)...

Documents

Transcript of An (very brief) introduction - ESO€¦ · Prof. Dr. Laura Leal-Taixé An (very brief)...

PORTABLE XRF: A VERY BRIEF INTRODUCTION

A (very) brief history of Internet safety

A Very Brief Visual History of Cinema

Kotlin compiler construction (very brief)

A Very Brief History of (Neuro) Marketing

Adolescent Development A (very) Brief Overview

A Very Brief Introduction to CDISC- Globalhealthtrials

A VERY BRIEF HISTORY OF CAT SHOWS - TICA U · A VERY BRIEF HISTORY OF CAT SHOWS 2003, Sarah Hartwell This is "very brief history" provides a background to some of the "retrospective"

Integral Education and the Brain; A very very brief introduction

Irish Leaders: A Very Brief Look

Very brief history of Time

A Very Brief History of Psychology

Geologic History Very Brief Version

A Brief, Very Very Brief Intro to Systems Thinking

INFORMAL FALLACIES A very brief introduction

A VERY BRIEF INTRODUCTION

Very Brief Ntro to Particle Physics _Pearce

A very brief history of GITJ

Follow-up, compliance and analysis Analysis (very brief):Analysis (very brief): –Standard analysis –More exotic stuff Compliance/adherenceCompliance/adherence.

Tomcat Configuration A Very, Very, Very Brief Overview.