www.imgtec.com

Bryce Johnstone & Paul Brasnett

9 Nov 2016

Leveraging PowerVR GPU Compute for Automotive Convolutional Neural Networks

© Imagination Technologies, CNNs in Automotive Webinar, Nov 2016

Agenda

About Imagination Technologies

ADAS/Autonomous Driving: why?

Why PowerVR GPUs for Vision Processing?

Implementing Convolutional Neural Networks (CNNs)

Performance Analysis

Conclusions

Leveraging GPU Compute for Automotive CNNs

Core IP for low power, high performance SoCs

Ultra-low power; class-leading efficiency; designed for IP-based SoCs

Our technologies address what really matters to help our customers create innovations for success

PowerVR Graphics & GPU Compute Processors

Ensigma Communications Processors

PowerVR Vision Processors

MIPS Processors

Fabric

PowerVR Video Processors

Enabling customers to fully leverage their own IP

Domain Solutions

Customer technologies & know-how

Customizable IP platforms

Scalable IP

AR/VR, Networking, IoT, Consumer, Automotive, Mobile

Ecosystems: software, tools, apps, middleware, hardware

Autonomous Driving and ADAS

Reduce road deaths/GDP costs worldwide (1.2m in 2015)

Increase road utilisation: 2x with 80% autonomous cars

Reduce congestion & parking time

US/EU already driving legislation change to support the nascent market

Issues of liability, safety and security will have to be resolved before wide adoption

Complex vision processing (deep learning/AI) needs are increasing rapidly

Platooning -> autotaxi/lift -> semi-autonomous -> fully autonomous

ADAS is the backbone for Autonomous Driving

Today: ASSIST • Driver active • Fail Safe

2020: AUTOMATE • Sensor Fusion • Co-pilot

2030: AUTONOMOUS • 3D Maps • Driverless • Hands off / mind off

ADAS: Levels of Processing, From Sensor to Actuator

Sensor Data (Visual) -> Low Level Processing -> Intermediate Level Processing -> High Level Processing -> Control Logic -> Action (Actuator)

Low Level Processing: Pixel Processing • Hundreds of millions of pixels per second • Similar processing per pixel

Intermediate Level Processing: Object Processing • Thousands of objects per second • Similar processing per object

High Level Processing: Object Recognition • Dozens of objects per second

Control Logic: Sensor Fusion, Decision Making, Application Control

Processing elements: MIPS, Prop HWA, GPU Compute

As complexity increases, specifically designed hardware acceleration allows for the best performance and power efficiency

ADAS -> Fully autonomous: orders of magnitude increase in processing

Automotive GPU requirements: Wide Range of Possibilities

Entry Level: Single Screen • Low Resolution • Single Task • Digital/Mechanical

Mid Range: Single/Multi-Screen • High Resolution • Multi-Task (HMI/Entertainment) • Full Digital • Basic ADAS Functions

High End: Multi-Screen/HUD • High Retina Resolution • Virtualised Multi-Tasking • Full Digital • Full ADAS Functionality

PowerVR GPU Series 6XE/7XE • Low res 2D/3D UIs • Small silicon area • Low power & memory • High end functionality

PowerVR GPU Series 6/7XT • 4-16 Cluster • Virtualization • Ray Tracing-photorealism • TFLOPs GPU Compute • FP16 & FP32

PowerVR GPU Series 7XE/8XE • Entry Level 3D UI • Advanced 3D graphics • Simple GPU Compute • Basic ADAS • Secure multi-tasking

Evolution of Compute GPU APIs

OpenCL 1.2

OpenCV

OpenVX

Vulkan

OpenCL 2.0

Full Profile

New APIs:

OpenGL ES SC

Why PowerVR GPUs for Vision Processing?

CPUs can generate large amounts of heat

• CPUs can deliver high peak/burst performance

• But generate large amounts of heat

• PowerVR GPUs provide

• Lowest power FP16/FP32 & int pipelines

• Local memory for highly efficient data access for compute operations

• Power-saving features such as gating of non-compute parts of GPU for efficient compute operation

[Figure: CPU and GPU comparison images]

Why GPUs for Vision Processing?

Performance relative to CPU (CPU = 100%)

Benchmark                    CPU     PowerVR Series6
Provence (raytracing)        100%    265%
Particle Simulation – 32k    100%    407%
Particle Simulation – 4k     100%    517%
Julia Set                    100%    963%
Ambient Occlusion            100%    1126%
Denoise                      100%    482%
Gaussian Blur                100%    383%

Why CNNs?

State-of-the-art performance

Rapid development cycles

Range of vision tasks

Classification

Detection

Segmentation

Recognition

Tracking

Feature detection

Feature description

Other tasks…

Camera Localisation

PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization, Kendall, A., Grimes, M., Cipolla, R., ICCV 2015

CNN uses in Autonomous Driving

Pedestrian/cyclist/motorcyclist detection

Sign detection & classification

Road user detection

Driver monitoring

Vehicle occupancy classification

Drivable path analysis

Road scene understanding

What is a CNN?

CNN Architecture: Basic Building Blocks

Convolution • Activation • Normalization • Pooling • Fully Connected • Soft Max

What is a CNN? Convolution layer

Input Image • Convolution coefficients • Output
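A convolution layer slides a small grid of coefficients over the input image and writes one weighted sum per position to the output. The sketch below is a minimal single-channel illustration of that idea, not the webinar's implementation. The function name `conv2d`, the "valid" border handling and the sample values are all assumptions chosen for the example. Like most CNN libraries, it applies the coefficients without flipping them.

```python
def conv2d(image, kernel):
    """Single-channel 'valid' convolution as used in CNN layers (no kernel flip)."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            # Weighted sum of the window whose top-left corner is (y, x).
            for j in range(k_h):
                for i in range(k_w):
                    acc += image[y + j][x + i] * kernel[j][i]
            output[y][x] = acc
    return output


# Hypothetical 4x4 input and 2x2 coefficient grid.
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [1, 0, 1, 3]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d(image, kernel)
```

A real layer repeats this for every input channel and every output filter, which is why convolutions dominate the operation count.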

What is a CNN?

CNN Example Network

[Figure: example network built from the basic building blocks. Image input, repeated Convolution / Activation / Pooling stages with Normalization, then Fully Connected and Soft Max]
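The remaining building blocks are simple to express on their own. The following sketch chains an activation (ReLU is assumed here; the slides do not name a specific function), 2x2 max pooling, a fully connected layer and soft max. All function names, weights and input values are hypothetical and serve only to show how the blocks fit together.

```python
import math


def relu(feature_map):
    """Activation: clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Pooling: keep the largest value in each non-overlapping 2x2 window."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]


def fully_connected(inputs, weights, biases):
    """Fully connected: one dot product per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Soft max: turn scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the peak for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4x4 convolution output fed through the remaining blocks.
conv_output = [[-1, 2, 0, 3],
               [4, -5, 1, 1],
               [0, 0, -2, -3],
               [1, 2, -1, 0]]
pooled = max_pool_2x2(relu(conv_output))
flat = [v for row in pooled for v in row]
scores = fully_connected(flat, [[1, 0, 0, 0], [0, 1, 0, 0]], [0.0, 0.0])
probabilities = softmax(scores)
```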

CNN Object Classification

Training (offline): Architecture + Data -> CNN Library (Compute + Time) -> Model Coefficients

Inference (online): Architecture + Model Coefficients + Image -> CNN Library (Compute on PowerVR GPU) -> Classification

Where is the Cost in CNN Inference?

Number of operations and coefficients required by layer type for AlexNet

[Figure: two charts, "Operations by layer type" and "Coefficients by layer type". Layer types: Convolutions, Pooling, Normalisation, Fully Connected]
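The split between operations and coefficients can be estimated from layer shapes alone. The sketch below counts multiply-accumulates (MACs) and coefficients for one convolution layer and one fully connected layer. The shapes are the commonly published AlexNet conv1 and fc6 dimensions, which are an assumption here because the slides do not list them. Biases are ignored and the helper names are hypothetical.

```python
def conv_layer_cost(out_h, out_w, out_channels, k_h, k_w, in_channels):
    """Return (MACs, coefficients) for a convolution layer, ignoring biases."""
    coefficients = k_h * k_w * in_channels * out_channels
    macs = out_h * out_w * coefficients  # every coefficient is reused per output pixel
    return macs, coefficients


def fc_layer_cost(n_inputs, n_outputs):
    """Return (MACs, coefficients) for a fully connected layer, ignoring biases."""
    coefficients = n_inputs * n_outputs
    return coefficients, coefficients  # each coefficient is used exactly once


# Assumed shapes: AlexNet conv1 (96 filters of 11x11x3, 55x55 output)
# and fc6 (9216 inputs, 4096 outputs).
conv1_macs, conv1_coeffs = conv_layer_cost(55, 55, 96, 11, 11, 3)
fc6_macs, fc6_coeffs = fc_layer_cost(9216, 4096)

print("conv1:", conv1_macs, "MACs,", conv1_coeffs, "coefficients")
print("fc6:  ", fc6_macs, "MACs,", fc6_coeffs, "coefficients")
```

Under these assumptions the convolution layer reuses each coefficient thousands of times, while the fully connected layer touches each coefficient once. That suggests convolutions are bound by arithmetic and fully connected layers by coefficient traffic.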

Convolutions - Matrix Multiply

Naïve Implementation

Create as many work-items as there are elements in the output matrix

Each work-item reads its row of A and its column of B and produces their dot product

Requires a large number of memory accesses

A x B = C
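The naïve scheme can be modelled on the host by treating each (row, column) pair as one work-item. This is a plain Python sketch of the access pattern, not OpenCL kernel code. The function names are hypothetical, and the read count assumes no caching between work-items.

```python
def naive_work_item(a, b, row, col):
    """One work-item: dot product of row `row` of A with column `col` of B."""
    return sum(a[row][k] * b[k][col] for k in range(len(b)))


def naive_matmul(a, b):
    """Launch one work-item per element of the output matrix C = A x B."""
    rows, cols = len(a), len(b[0])
    return [[naive_work_item(a, b, r, c) for c in range(cols)]
            for r in range(rows)]


def naive_global_reads(m, n, k):
    """Reads issued for an (m x k) by (k x n) product: 2*k values per work-item."""
    return 2 * m * n * k


a = [[1, 2],
     [3, 4]]
b = [[5, 6],
     [7, 8]]
c = naive_matmul(a, b)
```

For 1024 x 1024 matrices this model issues 2 * 1024^3 reads, roughly two billion, which is the memory traffic the tiling approach aims to reduce.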

Convolutions - Matrix Multiply

Tiling Approach

[Figure: Time (s), log scale from 0.1 to 1000, against Matrix Size for the Naïve and Tiled matrix multiply implementations]
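Tiling splits the matrices into small blocks so that each block is loaded once and reused for many multiply-accumulates. The sketch below shows the loop structure in plain Python, with the copied sub-blocks standing in for the GPU's local memory. The tile size of 2, the function name and the test matrices are assumptions for illustration only.

```python
def tiled_matmul(a, b, tile=2):
    """C = A x B computed block by block; tiles stand in for GPU local memory."""
    m, k_dim, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for r0 in range(0, m, tile):            # one work-group per output tile
        for c0 in range(0, n, tile):
            for k0 in range(0, k_dim, tile):
                # "Load" one tile of A and one tile of B into local storage.
                a_tile = [row[k0:k0 + tile] for row in a[r0:r0 + tile]]
                b_tile = [row[c0:c0 + tile] for row in b[k0:k0 + tile]]
                # Every loaded value is reused across the whole output tile.
                for i, a_row in enumerate(a_tile):
                    for j in range(len(b_tile[0])):
                        c[r0 + i][c0 + j] += sum(
                            a_row[kk] * b_tile[kk][j]
                            for kk in range(len(b_tile)))
    return c


a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
product = tiled_matmul(a, a, tile=2)
```

With a tile of width T, each value fetched from main memory is used about T times from local storage, so global reads drop by roughly that factor compared with the naïve version.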

Convolutions: Frequency Domain

Example number of operations (Mflops) required to implement convolutions

Implementation      AlexNet/conv3 (3x3)    AlexNet/conv2 (5x5)    GoogLeNet/conv1 (7x7)
Matrix Multiply     299                    448                    236
Frequency Domain    90                     55                     79

Convolution in the time domain corresponds to multiplication in the frequency domain
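The equivalence can be checked numerically in one dimension. The sketch below compares a direct circular convolution with the frequency-domain route of transform, pointwise multiply and inverse transform. It uses a naïve O(n^2) DFT for clarity, whereas a practical implementation would use an FFT and 2D transforms with padding. The function names and sample signals are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        result.append(total / n if inverse else total)
    return result


def circular_convolve_direct(x, h):
    """Circular convolution computed directly from the definition."""
    n = len(x)
    return [sum(x[m] * h[(i - m) % n] for m in range(n)) for i in range(n)]


def circular_convolve_frequency(x, h):
    """Same result via transform, pointwise multiply, inverse transform."""
    spectrum = [xf * hf for xf, hf in zip(dft(x), dft(h))]
    return [value.real for value in dft(spectrum, inverse=True)]


signal = [1, 2, 3, 4]
coefficients = [1, 0, -1, 0]
direct = circular_convolve_direct(signal, coefficients)
via_frequency = circular_convolve_frequency(signal, coefficients)
```

The pointwise multiply costs one operation per frequency bin regardless of filter size, which is consistent with the table showing larger savings for the 5x5 and 7x7 filters.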

Performance Analysis: GPU v CPU*

[Figure: AlexNet relative FPS performance (higher is better), log scale from 0.1 to 100, by layer type: Convolutions, Pooling, Normalisation, Fully Connected. Series: CPU (1.6GHz); PowerVR 2 Cluster GPU (384MHz), MatMul; PowerVR 2 Cluster GPU (384MHz), FFT]

* CPU results based on Caffe (with ATLAS)

Fully Connected Layers: Low precision data types

[Figure: relative FPS performance (higher is better), axis from 0 to 1.8, against fully connected weights data-type: float, ushort, uchar]
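Storing fully connected weights as 8-bit values cuts the bytes read per coefficient from four to one. The sketch below shows one possible scheme, a linear mapping of each weight onto the range 0 to 255 with a shared scale and offset. The slides do not say which quantisation scheme was used, so the scheme, function names and sample values are all assumptions.

```python
def quantize_uchar(weights):
    """Map float weights linearly onto integers in 0..255 (uchar range)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale when all weights match
    quantized = [round((w - lo) / scale) for w in weights]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Recover approximate float weights from their 8-bit codes."""
    return [q * scale + lo for q in quantized]


def fc_neuron_quantized(inputs, quantized, scale, lo):
    """Dot product on 8-bit codes, with scale and offset applied once at the end.

    Uses sum((q*scale + lo) * x) == scale * sum(q*x) + lo * sum(x).
    """
    return (scale * sum(q * x for q, x in zip(quantized, inputs))
            + lo * sum(inputs))


weights = [-0.5, 0.0, 0.25, 0.5]   # hypothetical weights of one neuron
inputs = [1.0, 2.0, 3.0, 4.0]
codes, scale, lo = quantize_uchar(weights)
exact = sum(w * x for w, x in zip(weights, inputs))
approx = fc_neuron_quantized(inputs, codes, scale, lo)
```

Each weight is off by at most half a quantisation step, so the output error is bounded by `scale / 2` times the sum of the absolute input values.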

Conclusions

CNNs are an integral part of Computer Vision applications for semi-autonomous and autonomous cars

Numerous applications can be addressed with CNNs

PowerVR GPUs offer up to 12x higher performance deployment for CNNs (GPU Compute)

Convolution performance can be improved using the frequency domain

Fully connected layer performance can be improved by using low precision data types

PowerVR GPUs scale to allow for higher levels of performance & lower power for current and future generations of vision enabled products

www.imgtec.com

Thank you

Confidential

Resources

PowerVR GPU Compute

https://imgtec.com/tools/powervr-gpu-compute/

Guide to writing OpenCL

http://blog.imgtec.com/powervr/a-quick-guide-to-writing-opencl-kernels-for-rogue

PowerVR Imaging Framework

http://blog.imgtec.com/powervr/powervr-imaging-framework-sdk

PowerVR CNN Demo

OpenCL Tutorial

https://handsonopencl.github.io/