GPU accelerated HEVC decoder on Mali™ T600

Ittiam Systems Introduction

World’s most preferred DSP IP supplier in the DSP Professionals Survey by Forward Concepts (2004, 2005, 2006)

DSP Systems IP Company
• Multimedia + Communication Systems
• Multimedia Components, Systems, Hardware
• Focus on Broadcast, Video Communication, Video Security, Mobile

IP Licensing Business Model
• Founded in 2001
• Venture funded
• Flexible mix of one time fees and royalties for licensing

300+ licensees
• Worldwide
• Fortune 100 companies, Tier 1 OEMs
• Consistently rated as Most Preferred DSP IP Supplier

250 strong Engineering Team
• World Class Talent
• Deep Multimedia and end application Expertise
• 29 patents issued, 30+ patents filed

Ittiam Multimedia Overview

Multimedia Components
• Audio Codecs
• Video Codecs / Image Codecs
• Algorithms for Audio Effects, Acoustics, Imaging
• ARM® CPU, NEON™ Optimized
• DSP + HW Accelerators + GPU expertise and capabilities

Middleware + SDKs
• System components: Parsers, Creators, Stacks, Subtitles
• Multimedia Integration: Android, Other Frameworks
• Use Case validation
• Enhancements to existing Middleware
• Application Specific SDKs

OEM Applications
• Complete Multimedia Applications
• Covers major Multimedia Use Cases
• Camera, Gallery, Editor, Players, Video Editor
• Production tested
• Customizable to requirements

4x

Ittiam Multimedia Solutions and ARM

Strategic Platform
• Focus on Mobile, Home, Portable segments
• ARM® Connected Community Member
• Strong Portfolio of IP
• Expertise in ARM architecture and optimizations for ARM

Long Investment
• Many years of development on ARM® Platforms
• Covering ARM9E, ARM11, Cortex™ A8, A9, A15, A5, A7, A12 and NEON™
• In house developed reference C models for all IP
• Efficient, targeted for ARM, validated across multiple generations

Partnership
• Joint Benchmarking of implementations
• Early Access to Mali™/OpenCL™ information
• Early involvement on new platforms

Ittiam Media Processing Elements

Audio Codecs
• Stereo and Multichannel
• MP12, AAC-LC/HE v1&v2, AC3, DD+
• High Quality Resampler
• Post Processing and Audio Effects
• Field Proven

Video Codecs
• MPEG2, MPEG-4, H.264, HEVC / H.265
• Scalable across Multiple ARM Cores
• Optimized for bandwidth and CPU + NEON
• Error Resilience for Streaming Use cases
• In Production

Acoustics
• Voice Quality Enhancements with Echo Cancellation (AEC), Noise Reduction (ANR)
• Equalizer for Microphone & Speaker
• AGC, AVC, Audio De-Reverb
• Mic Beam Forming

Image Processing
• De-noise, Face detection, Red-eye correction
• Panorama, HDR, Low Light, 3D
• B&W, Sepia, Cross Process
• Exposure, Colours, Geometric, Filters

HEVC Overview

HEVC / H.265 Standard

HEVC aka H.265 is a video compression standard, jointly developed by ISO/IEC MPEG and ITU-T VCEG

MPEG and VCEG have established a Joint Collaborative Team on Video Coding (JCT-VC) to develop the HEVC standard

HEVC is the successor to the H.264 standard

HEVC can support ultra high resolutions up to 8192 x 4320 pixels

HEVC offers substantially higher video compression ratio compared to existing standards

H.265 vs H.264

Tool | H.264 | H.265
Coding unit | 16x16 macroblocks; block coding structure | Coding tree blocks (64x64); quadtree coding structure
Transforms | 4x4 and 8x8 | 4x4, 8x8, 16x16 and 32x32
Inter prediction | 4x4 to 16x16; symmetric partitions | 4x4 to 64x64; asymmetric partitions
Intra prediction | 9 modes | 35 modes
Motion prediction | Spatial median | Advanced Motion Vector Prediction (spatial + temporal)
Luma motion compensation | 6 taps for half-pel positions + bilinear filter for quarter-pel positions | 8 taps for half-pel positions + 7 tap filter for quarter-pel positions
Chroma motion compensation | 2 taps | 4 taps
Slices | Slices for parallel parsing | Wavefront parallel processing; tiles and slices for parallel parsing
In-loop filters | Deblocking | Deblocking and SAO
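The quadtree coding structure in the table can be illustrated with a minimal Python sketch. It recursively splits a 64x64 coding tree block into square coding units. The function name `split_ctb`, the `should_split` callback and the 8x8 minimum size are assumptions made for illustration; in a real decoder the split decisions are parsed from the bitstream.

```python
def split_ctb(x, y, size, should_split, min_size=8):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at (x, y)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants in z-scan order.
        for dy in (0, half):
            for dx in (0, half):
                leaves += split_ctb(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]


# Hypothetical split rule: keep splitting only the block at the top-left corner.
leaves = split_ctb(0, 0, 64, lambda x, y, size: x == 0 and y == 0)
print(leaves)
```

Each leaf is an independent block, which is what lets later stages process many blocks of different sizes in parallel.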

HEVC compression

[Chart: bit rate versus year (1990, 2000, 2010) for MPEG-2, H264/AVC and H265/HEVC]

35% reduction in bitrate for same PSNR output when compared to H.264

Perceptual video quality is subjective and cannot be measured with PSNR values

Subjective tests have shown around 50% reduction in bitrate for similar perceptual video quality when compared to H.264

About 50% bitrate saving over H.264 for video resolutions of 1080p and above, and 30-40% for lower resolutions
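As a rough worked example of these figures, the sketch below estimates the HEVC bitrate for a stream given its H.264 bitrate and a saving percentage. The function name and the 8000 kbit/s starting bitrate are hypothetical; only the percentages come from the slide.

```python
def hevc_bitrate_kbps(h264_kbps, saving_pct):
    """Estimated HEVC bitrate for a given H.264 bitrate and saving in percent."""
    return h264_kbps * (100 - saving_pct) // 100


# Hypothetical 1080p stream encoded at 8000 kbit/s with H.264.
print(hevc_bitrate_kbps(8000, 35))  # same PSNR
print(hevc_bitrate_kbps(8000, 50))  # similar perceptual quality
```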

HEVC Applications – Near Term

The over-the-top (OTT) video services market is growing at a rapid pace, thanks to Netflix, Hulu, YouTube and others

Smartphones and tablets contribute significantly to OTT growth, with consumers opting to view videos on the go

OTT video services are also widely used on TVs and set-top boxes

Rapid growth in the OTT market chokes the network bandwidth

One in five consumers abandons viewing due to slow feeds or a poor quality viewing experience

HEVC will enable superior viewing experience with OTT video service

HEVC Applications – Long Term

Higher quality video in the traditional terrestrial and satellite broadcasts

Video recording in cameras and mobile phones, for saving storage space or higher quality

Broadcasting 1080p video at 50 or 60 frames per second for the same bandwidth as 1080i (25 or 30 fps)

4K and 8K Ultra-HD broadcasts for theatre-like quality

Need for Software HEVC Decoder

HEVC is a newly ratified standard and there is no hardware support in the current generation of Processors (Embedded / Mobile / Applications SoCs)

Dedicated HW accelerators for HEVC increase the silicon area and hence the cost significantly

Lack of HEVC content makes early HW implementation risky

Software decoding is a simpler and economically viable option for HEVC deployment now

Handling the HEVC decoder complexity on a wide range of processors, with constraints on power consumption, is the key challenge for a software decoder

Why use GPUs for Video Processing?

Decoding of high resolution videos in software involves high computational complexity and will load the CPU enormously

GPUs are highly compute capable and power efficient devices

GPUs are generally idle during video playout

GPU acceleration will free up the CPU to perform other (system) tasks

[Diagram: CPU core(s) (ARM Cortex with NEON) alongside a Mali T600 / OpenCL compliant GPU]

HEVC Decoding on Capable GPUs

GPUs are massively multithreaded devices capable of handling hundreds or thousands of threads in parallel at any given time

Only the highly data parallel algorithms of a video codec can be efficiently offloaded to the GPU for processing

[Diagram: decoder pipeline with blocks Parsing & Entropy Decode, Motion Compensation, Intra Prediction, Inverse Quant, Inverse Transform, Recon, Deblocking & SAO. Legend: not suitable for GPU execution / data parallel execution, suitable for GPU execution]

Motion Compensation

The pixels of the current picture/frame are predicted from the pixels of a reference frame

The reference picture can be from past or future

The prediction happens on a block-by-block basis

And there can be multiple reference frames for each block
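A minimal Python sketch of block-based prediction follows, assuming integer-pel motion vectors and frames stored as lists of rows. The function names are hypothetical, and the plain rounded average of two references is a simplification of how a real HEVC decoder combines predictions.

```python
def predict_block(ref, x0, y0, w, h, mv):
    """Fetch a w x h block from reference frame `ref`, displaced by mv = (dx, dy)."""
    height, width = len(ref), len(ref[0])
    dx, dy = mv
    # Clamp coordinates so that vectors pointing outside the frame reuse edge pixels.
    return [[ref[min(max(y0 + dy + j, 0), height - 1)]
                [min(max(x0 + dx + i, 0), width - 1)]
             for i in range(w)]
            for j in range(h)]


def bi_predict(block_a, block_b):
    """Average two predictions (for example, one past and one future reference)."""
    return [[(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(block_a, block_b)]


# Hypothetical 8x8 reference frame whose pixel value encodes its position.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
block = predict_block(ref, 2, 2, 2, 2, (1, 0))
print(block)
```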

Motion Compensation

The most compute intensive part of Motion compensation is sub-pixel interpolation

• Luma – 8 or 7 tap filter

• Chroma – 4 tap filter

Sub-pixel interpolation is data parallel, i.e., each block within a frame can be interpolated independently of the others, which makes it well suited to GPU computing
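The luma filters can be sketched in Python as below. The tap values are the HEVC luma interpolation coefficients (each set sums to 64). The function name, the single horizontal pass and the edge clamping are simplifying assumptions; a real decoder also filters vertically and keeps higher intermediate precision.

```python
# HEVC luma interpolation taps, indexed by quarter-pel offset.
LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),    # quarter-pel: 7 non-zero taps
    2: (-1, 4, -11, 40, 40, -11, 4, -1),  # half-pel: 8 taps
    3: (0, 1, -5, 17, 58, -10, 4, -1),    # three-quarter-pel: mirrored quarter-pel
}


def interp_row(row, frac, bit_depth=8):
    """Interpolate one row of luma samples at horizontal offset frac/4."""
    if frac == 0:
        return list(row)
    taps = LUMA_TAPS[frac]
    max_val = (1 << bit_depth) - 1
    n = len(row)
    out = []
    for x in range(n):
        acc = 0
        for k, c in enumerate(taps):
            # Taps cover samples x-3 .. x+4; clamp indices at the row edges.
            idx = min(max(x - 3 + k, 0), n - 1)
            acc += c * row[idx]
        # Round, normalise by 64 and clip to the sample range.
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out


ramp = list(range(0, 160, 10))
print(interp_row(ramp, 2))
```

Every output sample depends only on a small window of input samples, so rows and blocks can be processed by separate GPU work-items without synchronisation.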

Inverse Quantization and Transform

Inverse Quantization & Transform

• The residue values need to be inverse quantized

• A 2-D inverse DCT should be performed on the inverse quantized data
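A minimal Python sketch of these two steps for a 4x4 block is shown below. The matrix `M4` is the HEVC 4x4 core transform and the shifts (7, then 20 minus bit depth) follow the two-stage inverse transform. The uniform `dequantize` with a single `step` is a deliberate simplification and not the HEVC scaling formula; intermediate clipping is also omitted.

```python
M4 = ((64, 64, 64, 64),
      (83, 36, -36, -83),
      (64, -64, -64, 64),
      (36, -83, 83, -36))


def dequantize(levels, step):
    """Simplified inverse quantization: scale every level by one step size."""
    return [[v * step for v in row] for row in levels]


def inverse_transform_4x4(coeff, bit_depth=8):
    """Two-stage separable inverse transform of a 4x4 coefficient block."""
    shift1, shift2 = 7, 20 - bit_depth
    # Stage 1: 1-D inverse transform down each column.
    tmp = [[(sum(M4[k][y] * coeff[k][x] for k in range(4)) + (1 << (shift1 - 1))) >> shift1
            for x in range(4)] for y in range(4)]
    # Stage 2: 1-D inverse transform along each row.
    return [[(sum(M4[k][x] * tmp[y][k] for k in range(4)) + (1 << (shift2 - 1))) >> shift2
             for x in range(4)] for y in range(4)]


# Hypothetical block with only a DC level of 4 and a step size of 16.
levels = [[4, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
residual = inverse_transform_4x4(dequantize(levels, 16))
print(residual)
```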

Recon & In-Loop Filters

• Reconstruction: the output of motion compensation or intra prediction is added to the output of the inverse transform

• In-loop filters such as deblocking and SAO are applied to the reconstructed samples

[Diagram: decoder pipeline with blocks Parsing & Entropy Decode, Motion Compensation, Intra Prediction, Inverse Quant, Inverse Transform, Recon, Deblocking & SAO]
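The reconstruction step on this slide amounts to a per-sample add and clip, sketched here in Python. The function name and the sample values are hypothetical.

```python
def reconstruct(pred, resid, bit_depth=8):
    """Add the residual to the prediction and clip to the valid sample range."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
            for pred_row, resid_row in zip(pred, resid)]


recon = reconstruct([[250, 5], [128, 0]], [[10, -10], [3, 0]])
print(recon)
```

Each output sample depends on exactly one prediction sample and one residual sample, so the step is fully data parallel.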

Challenges in CPU+GPU Implementation

Efficient partitioning of work between CPU and GPU

• The effective FPS of the decoder is the minimum of the FPS achieved by the CPU and the GPU on their respective work

• The partitioning therefore needs to be efficient, so that both perform their respective work at almost the same speed (FPS)
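The partitioning problem can be sketched in Python as a small search over stage assignments. All stage names and per-frame timings below are hypothetical, and the sketch assumes the CPU and GPU run fully in parallel. Minimising the slower side's frame time is the same as maximising the minimum FPS.

```python
from itertools import product


def best_partition(stages):
    """stages: list of (name, cpu_ms, gpu_ms); gpu_ms is None for CPU-only stages.

    Returns (frame_ms, set of stage names placed on the GPU).
    """
    best = None
    for choice in product(("cpu", "gpu"), repeat=len(stages)):
        if any(dev == "gpu" and s[2] is None for dev, s in zip(choice, stages)):
            continue  # this stage cannot run on the GPU
        cpu_ms = sum(s[1] for dev, s in zip(choice, stages) if dev == "cpu")
        gpu_ms = sum(s[2] for dev, s in zip(choice, stages) if dev == "gpu")
        frame_ms = max(cpu_ms, gpu_ms)  # the slower device sets the frame rate
        if best is None or frame_ms < best[0]:
            gpu_stages = {s[0] for dev, s in zip(choice, stages) if dev == "gpu"}
            best = (frame_ms, gpu_stages)
    return best


# Hypothetical per-frame costs in milliseconds.
STAGES = [("entropy", 10.0, None), ("mc", 12.0, 6.0),
          ("iq_it", 2.0, 3.0), ("deblock_sao", 6.0, 4.0)]
frame_ms, on_gpu = best_partition(STAGES)
print(on_gpu, 1000.0 / frame_ms)
```

With these made-up numbers, offloading everything possible is not the best choice: the GPU would become the bottleneck, so one cheap stage stays on the CPU.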

Efficient pipelining of data between CPU and GPU

• The algorithms running on the CPU depend on the output of algorithms running on the GPU and/or vice versa

• A good design should make sure that neither the CPU nor the GPU spends any time waiting for the output of the other
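One common way to keep both sides busy is a bounded queue between the stages, sketched below in Python with two threads standing in for the CPU and GPU. The function name, the queue depth and the tuple payloads are assumptions for illustration; a real decoder would hand over OpenCL buffers and events instead.

```python
import queue
import threading


def run_pipeline(num_frames, depth=2):
    """Run a two-stage pipeline and return the frames in completion order."""
    handoff = queue.Queue(maxsize=depth)  # bounded buffer between the stages
    results = []

    def cpu_stage():
        for n in range(num_frames):
            # Stand-in for parsing and entropy decoding frame n.
            handoff.put(("coeffs", n))  # blocks only when the consumer falls behind
        handoff.put(None)  # end-of-stream marker

    def gpu_stage():
        while True:
            item = handoff.get()
            if item is None:
                break
            # Stand-in for the data-parallel stages of the same frame.
            results.append(("decoded", item[1]))

    threads = [threading.Thread(target=cpu_stage), threading.Thread(target=gpu_stage)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(5))
```

While the consumer works on frame n, the producer can already prepare frame n+1, up to the queue depth.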

Cache coherency

• Cache coherency between CPU and GPU data needs to be ensured

Benefits of Mali T600 GPU

128-bit vector processing
• Suits DSP algorithms such as video processing

GPU cache instead of local memory
• No requirement for data transfers from/to global memory. Memory access can be understood just as on a CPU.

Flexible OpenCL workgroup size
• Works optimally for a large range of OpenCL workgroup sizes. Multiple block sizes in a video frame can be handled efficiently.

No divergent threads
• As in CPU code, conditional code can be used in OpenCL kernels. Different filter types, filter lengths etc. in video decode can be handled efficiently.

Unified memory
• CPU and GPU share the same memory. Video YUV buffers are large, and there is no need for costly memory transfers of those buffers.

MALI GPUs are well suited for Video Acceleration with significant power/performance benefits

Thank You

For more information visit www.ittiam.com or contact us at [email protected]