Webinar Oct. 27, 2016 - Intel · - DirectX 3D 2015 version, OGL 4.4, OpenGL ES 3.0, OpenCL 2.1 -...

Jeff McAllister Intel Senior Technical Consulting Engineer Webinar – Oct. 27, 2016


Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.


Welcome! - About our Speaker

Jeff McAllister, Media & OpenCL Senior Software Technical Consulting Engineer

Developer Products Division

Intel Software & Services Group


Welcome! What We’ll Cover Today

Intel Hardware

Intel® Software Tools & SDKs

Awesome Video Processing Solutions


The Other Side of the Chip

See Technical Specifications for System Requirements - Select SKUs of Intel® Xeon® & Core™ processor-based platforms apply.


Why Develop Now for 6th Generation Intel® Xeon® & Intel® Core™ Processors?

Take advantage of Intel CPUs, integrated graphics (GPUs) & hardware-accelerated MPEG-2, AVC, & now NEW HEVC codecs to deliver fast, high-density, real-time video transcoding.

Stay competitive – Transition to 4K - Innovate cloud video, OTT streaming, immersive experiences & more.


Intel Hardware is Heterogeneous

CPUs
• Awesome general purpose performance
• Large software ecosystem

Other Programmable Intel Hardware
• GPU (shown here)
• IPU
• FPGA


Media Capabilities

Gen9 Processor Graphics GPU
• 14nm process technology
• Integrated with processor

Higher Performance
• GT2 with 24 execution units
• GT4e* with 72 EUs & 128MB eDRAM
• CPU+GPU provide over 1 TFLOPS processing power

Latest API feature support
• DirectX 3D 2015 version, OGL 4.4, OpenGL ES 3.0, OpenCL 2.1
• Tightly coupled CPU/GPU programming using Shared Virtual Memory + OpenCL

Expanded hardware acceleration for media features
• Low power/full fixed function AVC encode
• HEVC Encode/Decode
• MJPEG Encode


Video Transcoding Performance: HEVC

Intel and the Intel logo are trademarks of Intel Corporation.

NEW! Up to 2 Real-time HEVC streams per Intel® Xeon® processor¹

¹ 15 real-time HD AVC-HEVC or 4 real-time UHD AVC-HEVC transcodes, 8 real-time HD HEVC-HEVC or 2 real-time UHD HEVC-HEVC transcodes using Intel Media SDK (Target usage 7), all content 8-bit 4:2:0. Benchmark platform configuration: Processor: Intel® Xeon® processor E3-1585Lv5 @ 3.0GHz, Ring @ 3.0GHz and GT @ 1.15GHz; primary BIOS Version: SKLSE2R1.R00.B104.B01.1511110114; driver: 20.19.15.4444; platform: RVP11 halo fab 2; OS: Windows* 8.1 x64 Enterprise, 16 GB memory, 2 DIMMs 2133 MHz, one socket, four cores, Intel® Iris™ Pro Graphics P580, Intel® Hyper-threading Technology enabled, Intel® Virtualization technology enabled.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/performance.

Multistream Performance (1xRT=30fps)

Transcode                      Real-time (30fps) streams   Real-time (60fps) streams
1080p-to-1080p AVC-to-HEVC     15                          7
1080p-to-1080p HEVC-to-HEVC    8                           4
4K-to-4K AVC-to-HEVC           4                           2
4K-to-4K HEVC-to-HEVC          2                           1
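The 60fps figures in the multistream numbers are consistent with halving the 30fps stream count and rounding down (15 to 7, 8 to 4, 4 to 2, 2 to 1). A minimal sketch of that arithmetic follows. The function name is hypothetical, and the linear scaling of throughput with frame rate is an inference from the published numbers, not something the slide states.

```cpp
#include <cassert>

// Hypothetical helper: given how many real-time 30 fps streams a platform
// sustains, estimate how many streams fit at another frame rate.
// Assumption: throughput scales linearly with frame rate.
int realtime_streams_at(int streams_at_30fps, int target_fps) {
    assert(streams_at_30fps >= 0 && target_fps > 0);
    return (streams_at_30fps * 30) / target_fps;  // integer division rounds down
}
```

For example, `realtime_streams_at(15, 60)` gives the 7 listed for 1080p AVC-to-HEVC at 60fps.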

E3-1500 v5 HEVC is fully accelerated targeting 4K60 capability

Specific hardware technical specifications apply. See performance benchmarks and Media Server Studio site for details.


Graphics Technology Highlights

Glossary
• Execution Units (EUs) = general purpose cores
• EUs, samplers, caches, etc. are in “slices”
• Fixed function (VDBox, VEBox) is in the “unslice”
• eDRAM adds cache, increases bandwidth

Naming Convention

Name                        Adds               Other names           Summary
Intel® HD Graphics          -                  GT2 “4+2”             Good
Intel® Iris™ Graphics       +slices, +eDRAM    GT3 “2+3e”            Better
Intel® Iris™ Pro Graphics   +slices, +eDRAM    GT3e, GT4e “4+4e”     Best

Just look for Intel® Quick Sync Video at ark.intel.com


Intel Processor Graphics/GPU Overview

• GT2: Intel® HD Graphics, 24 EUs, 1 MFX
• GT3: Intel® Iris™ Graphics, 48 EUs, 2 MFX
• GT4: Intel® Iris™ Pro Graphics, 72 EUs, 2 MFX


Codecs + Frame Processing use Fixed Function + EUs

[Block diagram: slices of EUs with samplers and caches, the 3D fixed function (FF), and the media fixed function (VDBOX, VEBOX)]

Video Decoding
• BSD = VDBox decode

Video Encoding
• ENC = EU + VDBox VME (MB type, motion vectors, bit budget/BRC)
• PAK = VDBox (residue packing & entropy coding)
• VDENC = low power encode (6th Generation Intel® Core™ & forward)

VPP (VPHal: Video Processing Hardware Acceleration Layer; VEBox)
• Deinterlacing
• Denoise (Luma/Chroma)
• Frame Rate Conversion
• Color space conversions
• Composition/alpha blending
• Scaling


Optimizing Media Solutions & Applications with the Intel® Media SDK & Intel® SDK for OpenCL™ Applications

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.


Better Together

Heterogeneous Tools access the other side of the chip

Intel® Media Server Studio

Intel® Media SDK**

Intel® SDK for OpenCL™ Applications**

Heterogeneous Hardware

**Also available as standalone tools.


Intel® Media SDK / Intel® Media Server Studio

Initialization
• Decode init: parameters (from header)
• VPP init: parameters (in & out)
• Encode init: parameters

Main loop
stream → Decode → frame → VPP → frame → Encode → stream

• Media accelerator framework
• Codec based
• High level/parameter interface
• 3 operations

Good option for: Accelerated video encode, decode (and short list of frame processing)

Links to More Information: Media Server Studio, Media SDK, Intel Media Code Samples


Intel® Media SDK 2017 Supported Codecs

Standard               Encode              Decode
HEVC (main profile)    HW                  HW
AVC                    SW/HW/low power     SW/HW
MPEG-2                 SW/HW               SW/HW
MJPEG                  SW/HW               SW/HW
MVC                    SW/HW               SW/HW
VC-1                   -                   SW/HW

green=new in Intel® Media Server Studio for Gen9


Intel® Media SDK 2017 Supported Video Processing Features

• N:1 Frame Composition
• Resizing
• Color Conversion
• Deinterlacing
• Denoising
• Frame Rate Conversion
• Brightness/Contrast/Saturation
• Sharpening


Media Software Scope Diagram

Legend: Intel Media SDK/Intel® Media Server Studio focus; Limited support; Out of scope/external component

Transcode pipeline
• Container file input → Demuxer/Splitter → ES (video and audio)
• Intel® Media SDK (Video): ES → Decode → Process → Encode → ES
• Intel® Media SDK Audio (AAC enc/dec, MPEG dec): ES → Decode → Process → Encode → ES
• ES (video and audio) → Muxer → Container file output

ES = Elementary stream


Containers & Timestamps

• Presentation Time Stamp (PTS): Time when unit should be presented (viewed/heard)
• Decoding Time Stamp (DTS): Secondary; required when decode + buffer must happen before presentation due to complex reference structure (e.g. modern video formats)

Container contents
• Audio: packets, timestamps
• Video: packets, DTS, PTS

Pipeline
container fmt bitstream → Demux → elementary video bitstream → Decode → surface → Frame Proc → surface → Encode → elementary video bitstream → mux (video + audio)

Timestamp fields
• mfxBitstream: DecodeTimeStamp (DTS), TimeStamp (PTS)
• mfxFrameSurface1->mfxFrameData: TimeStamp (PTS)

Takeaways
• Intel® Media SDK forwards timestamps (PTS) through the pipeline
• Except for FRC and deinterlace VPP, Media SDK does not touch timestamps
• New DTS added to output bitstream based on encoder GOP settings
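Since the SDK forwards the PTS it is given, an application that feeds frames without container timestamps (raw input, for instance) may need to generate them itself. Media SDK timestamps are expressed in 90 kHz ticks. The sketch below computes a constant-frame-rate PTS under that convention; the helper name is hypothetical, and the rate is passed as a numerator/denominator pair in the spirit of `FrameRateExtN`/`FrameRateExtD`.

```cpp
#include <cassert>
#include <cstdint>

// Media SDK timestamps count ticks of a 90 kHz clock.
constexpr std::uint64_t kTicksPerSecond = 90000;

// Hypothetical helper: PTS of frame `index` for a constant frame rate
// expressed as rate_num / rate_den frames per second.
std::uint64_t frame_pts(std::uint64_t index, std::uint32_t rate_num, std::uint32_t rate_den) {
    assert(rate_num > 0 && rate_den > 0);
    return index * kTicksPerSecond * rate_den / rate_num;
}
```

At 30 fps each frame advances the PTS by 3000 ticks; at 30000/1001 fps the step is 3003 ticks.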


Memory Usage – Memory Surfaces

Two surface types
• System surfaces
• Video surfaces

Video memory is tiled & can’t be efficiently accessed by the CPU

Every Media SDK component supports both memory types; an internal copy is used if necessary

Internal copy may lead to HUGE performance degradation

[Diagram: the CPU (application, software) works in system memory; the GPU (hardware) works in video memory]


Opaque Memory

Problem: Different allocation pathways for software & hardware implementations increase complexity & code maintenance

Solution: Let Intel® Media SDK allocate surfaces & handle them internally

Before (allocator callbacks)
• Hardware: Allocate DirectX* surfaces → DECODE/VPP/ENCODE initialization
• Software: Allocate system memory buffers → DECODE/VPP/ENCODE initialization

After
• Software/Hardware: Allocate frames with NULL buffer pointers → DECODE/VPP/ENCODE initialization


Basic Structure of an Intel® Media SDK-optimized Application

Application
• Initialize session, set parameters
• Query + allocate
• Main loop (repeats):
  • Find free surface
  • Queue stages: decode, VPP, encode
  • Sync
  • Retrieve output
• Drain loop: same as above
• Clean up, exit
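The "find free surface" step in the main loop can be sketched as a scan over the surface pool for an entry that no stage is still using. This is a minimal illustration with a hypothetical stand-in type; in the real SDK the lock counter lives in the frame data of `mfxFrameSurface1`, and the helper name here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the part of a frame surface this sketch needs.
struct Surface {
    int locked = 0;  // non-zero while a pipeline stage still uses the surface
};

// Returns the index of the first unlocked surface, or -1 if all are busy.
int find_free_surface(const std::vector<Surface>& pool) {
    for (std::size_t i = 0; i < pool.size(); ++i) {
        if (pool[i].locked == 0) return static_cast<int>(i);
    }
    return -1;
}
```

A return value of -1 means every surface is still in flight, so the caller would typically sync on outstanding work before trying again.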


Intel® Media SDK “Hello world”


All Stages are Initialized with mfxVideoParam

typedef struct {
    mfxU32 AllocId;
    mfxU32 reserved[2];
    mfxU16 reserved3;
    mfxU16 AsyncDepth;

    union {
        mfxInfoMFX mfx;
        mfxInfoVPP vpp;
    };
    mfxU16 Protected;
    mfxU16 IOPattern;
    mfxExtBuffer** ExtParam;
    mfxU16 NumExtParam;
    mfxU16 reserved2;
} mfxVideoParam;

(in mfxstructures.h)

mfxInfoMFX (decode/encode)
• Codec, profile/level
• Decode: mostly read from stream header
• Encode: params covered in encode section

mfxInfoVPP
• In/Out frame parameters (covered in VPP section)

Extended Parameter sets (next slide)


Extended Parameter Sets

mfxExtCodingOption2 extendedCodingOptions2;
memset(&extendedCodingOptions2, 0, sizeof(extendedCodingOptions2));
extendedCodingOptions2.Header.BufferId = MFX_EXTBUFF_CODING_OPTION2;
extendedCodingOptions2.Header.BufferSz = sizeof(extendedCodingOptions2);
extendedCodingOptions2.BRefType = MFX_B_REF_PYRAMID;

mfxExtCodingOption3 extendedCodingOptions3;
memset(&extendedCodingOptions3, 0, sizeof(extendedCodingOptions3));
extendedCodingOptions3.Header.BufferId = MFX_EXTBUFF_CODING_OPTION3;
extendedCodingOptions3.Header.BufferSz = sizeof(extendedCodingOptions3);
extendedCodingOptions3.EnableMBQP = MFX_CODINGOPTION_ON;

mfxExtBuffer* extendedBuffers[2];
extendedBuffers[0] = (mfxExtBuffer*)&extendedCodingOptions2;
extendedBuffers[1] = (mfxExtBuffer*)&extendedCodingOptions3;

mfxEncParams.ExtParam = extendedBuffers;
mfxEncParams.NumExtParam = 2;

Problem: How to future proof AND allow the SDK to grow?

Solution: Design in the ability to extend while retaining mfxVideoParam original size

(Now most apps require extended parameter sets)

Common pattern
• Configure ID and size in header
• 0 = default/unused
• Supply values for params used
• Attach an array of param buffers
• Tell Intel® Media SDK how big the array is
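The pattern works because every extended buffer starts with the same header, so a consumer can walk the attached array and identify each entry by its ID without knowing its full type. The sketch below illustrates that idea with hypothetical stand-in types and IDs (`ExtHeader`, `OptionA`, `OptionB`, `kIdA`, `kIdB`, `find_ext`); it is not the SDK's own lookup code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins mirroring the header-first layout described above.
struct ExtHeader { std::uint32_t id; std::uint32_t size; };
struct OptionA { ExtHeader header; std::uint16_t value; };
struct OptionB { ExtHeader header; std::uint16_t flag; };

constexpr std::uint32_t kIdA = 1;  // illustrative IDs, not SDK constants
constexpr std::uint32_t kIdB = 2;

// Walk the attached array and return the buffer whose header carries `id`,
// or nullptr when no such buffer was attached.
ExtHeader* find_ext(ExtHeader** params, std::uint16_t count, std::uint32_t id) {
    for (std::uint16_t i = 0; i < count; ++i) {
        if (params[i] != nullptr && params[i]->id == id) return params[i];
    }
    return nullptr;
}
```

Because the header is the first member, a pointer returned by `find_ext` can be cast back to the concrete option type once the ID has been checked. New option types can be added later without changing the size of the structure that carries the array.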


Enabling Low Power H.264 Encode

mfxEncParams.mfx.LowPower=MFX_CODINGOPTION_ON;


3 Flavors of HEVC

Hardware
• Closest to hardware AVC performance
• Fastest

Software
• Best quality, most options
• Slowest

Software + GPU Acceleration
• Close to software quality
• Boost for software performance

[Diagram labels: Plugins, Driver]

Availability
• In Intel® Media Server Studio 2017 & Intel® Media SDK (Windows) when run on supported hardware with Gen9/Skylake graphics
• In Intel® Media Server Studio Professional Edition. More hardware options.


Encode Setup

//load HEVC plugin here

sts = Initialize(impl, ver, &session, NULL);

MFXVideoENCODE mfxENC(session);// Create Media SDK encoder

// Set required video parameters for encode...

// Query number of required surfaces for encoder

sts = mfxENC.QueryIOSurf(&mfxEncParams, &EncRequest);

sts = mfxENC.Init(&mfxEncParams); // Initialize the Media SDK encoder

// Main loop



Basic Encode Flow

• Initialize
• Main loop: EncodeFrameAsync(surface in); MFX_ERR_MORE_DATA means more input is needed
• Input finished
• Drain loop: EncodeFrameAsync(null in)
• Finish (MFX_ERR_MORE_DATA indicates all surfaces drained)

Expected Return Codes for EncodeFrameAsync

MFX_ERR_MORE_DATA

• More input surface data is required to proceed. Encode may request several input surfaces before producing its first output.

MFX_WRN_DEVICE_BUSY

• Hardware device is unable to respond. This is an expected output for normal operation & should clear after a very short wait. However, if this state persists more than a few milliseconds this may indicate a problem.

MFX_ERR_NOT_ENOUGH_BUFFER

• Bitstream output buffer is not big enough to contain output frame. Output buffer size must be increased.

Other

• Other error codes may be bugs. Contact an Intel support representative for more information.
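The guidance for the busy status (wait briefly, retry, and treat a long-lasting busy state as a problem) can be sketched as a bounded retry helper. The `Status` enum and `retry_while_busy` are hypothetical stand-ins for illustration; a real application would test the SDK's own status values around its asynchronous calls.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

enum class Status { Ok, DeviceBusy, Error };  // hypothetical stand-ins for SDK status codes

// Re-issue `op` while it reports a busy device, sleeping 1 ms between
// attempts. Gives up after `max_attempts` so a stuck device is surfaced
// to the caller instead of spinning forever.
Status retry_while_busy(const std::function<Status()>& op, int max_attempts) {
    assert(max_attempts > 0);
    Status sts = Status::DeviceBusy;
    for (int attempt = 0; attempt < max_attempts && sts == Status::DeviceBusy; ++attempt) {
        sts = op();
        if (sts == Status::DeviceBusy) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    return sts;
}
```

If the helper still returns `Status::DeviceBusy` after the attempt budget is spent, that is the "persists more than a few milliseconds" case and is worth logging or escalating.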


Encode

do {
    if (still_reading_file) {
        // main loop
        sts = mfxENC.EncodeFrameAsync(NULL, pEncSurfaces[nEncSurfIdx], &mfxBS, &syncp);
    } else {
        // drain loop
        sts = mfxENC.EncodeFrameAsync(NULL, NULL, &mfxBS, &syncp);
        if (sts == MFX_ERR_MORE_DATA) break;
    }

    switch (sts) {
    case MFX_WRN_DEVICE_BUSY:
        MSDK_SLEEP(1);
        break;
    case MFX_ERR_MORE_DATA:
        nEncSurfIdx = GetFreeSurfaceIndex(pEncSurfaces, nEncSurfNum); // Find free surface
        readsts = LoadRawFrame(pEncSurfaces[nEncSurfIdx], fSource);
        if (readsts != MFX_ERR_NONE) still_reading_file = 0;
        break;
    }

    if (sts != MFX_ERR_NONE) continue;

    sts = session.SyncOperation(syncp, 60000); // Synchronize.
    // bitstream data can be used here
} while (true);
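To make the main-loop/drain-loop split easier to follow without the SDK installed, here is a self-contained sketch using a toy encoder. Everything in it (`Status`, `MockEncoder`, `encode_all`, and the fixed buffering delay) is a hypothetical stand-in chosen for illustration; it only mimics the property that an encoder may consume several inputs before emitting output and must be drained with null input at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

enum class Status { Ok, MoreData };  // hypothetical stand-ins for the SDK status codes

// Toy encoder that holds back `delay` frames before producing output.
class MockEncoder {
public:
    explicit MockEncoder(std::size_t delay) : delay_(delay) {}

    // `frame == nullptr` means "no more input, drain what is buffered".
    Status encode(const int* frame, int* out) {
        if (frame != nullptr) {
            pending_.push_back(*frame);
            if (pending_.size() <= delay_) return Status::MoreData;  // still filling
        } else if (pending_.empty()) {
            return Status::MoreData;  // fully drained
        }
        *out = pending_.front();
        pending_.pop_front();
        return Status::Ok;
    }

private:
    std::size_t delay_;
    std::deque<int> pending_;
};

// Main loop feeds every input frame; the drain loop passes nullptr until
// the encoder reports MoreData, which signals that nothing is left.
std::vector<int> encode_all(MockEncoder& enc, const std::vector<int>& frames) {
    std::vector<int> output;
    int out = 0;
    for (const int& f : frames) {                      // main loop
        if (enc.encode(&f, &out) == Status::Ok) output.push_back(out);
    }
    while (enc.encode(nullptr, &out) == Status::Ok) {  // drain loop
        output.push_back(out);
    }
    return output;
}
```

Skipping the drain loop in this model would silently drop the last `delay` frames, which is the failure the two-phase structure exists to prevent.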


Application Design Fundamentals

Heuristics
• Use video memory/NV12 color format
• Avoid CPU<->GPU raw frame copies
• Run asynchronously
• Minimize waits for non-GPU tasks

Asynchronous: Each stage can have multiple frames “in flight.” Frames are locked while a session is working on them.

Session/pipeline-based: Not accelerating individual operations

Based on video memory (NV12 color format, GPU allocated): Arrange pipelines to minimize conversion steps.

Designed to minimize copies: As with NV12 conversions, arrange pipeline steps to reuse surfaces in the same location instead of copying them between CPU and GPU.

Minimize waits: Enqueue as many operations as possible without blocking on the CPU.

Intel® Media SDK (Video): Decode → Process → Encode
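For readers unfamiliar with NV12: it stores a full-resolution luma (Y) plane followed by a half-height plane of interleaved U/V samples, which works out to 12 bits per pixel. The sketch below computes the resulting buffer layout for a system-memory frame. The struct and function names are hypothetical, and the 16-pixel alignment is an assumption for illustration; for video memory the driver decides the actual pitch.

```cpp
#include <cassert>
#include <cstddef>

struct Nv12Layout {
    std::size_t pitch;      // bytes per row (aligned width)
    std::size_t uv_offset;  // byte offset of the interleaved UV plane
    std::size_t size;       // total bytes for one frame
};

constexpr std::size_t align_up(std::size_t v, std::size_t a) {
    return (v + a - 1) / a * a;
}

// Y plane: pitch * rows bytes. UV plane: half as many rows, same pitch.
// `alignment` is an illustrative assumption, not a value from the slides.
Nv12Layout nv12_layout(std::size_t width, std::size_t height, std::size_t alignment = 16) {
    assert(width > 0 && height > 0 && alignment > 0);
    const std::size_t pitch = align_up(width, alignment);
    const std::size_t rows = align_up(height, alignment);
    return {pitch, pitch * rows, pitch * rows * 3 / 2};
}
```

A 1920x1080 frame under these assumptions pads to 1088 rows and occupies roughly 3 MB, which gives a sense of why avoiding per-frame CPU<->GPU copies matters at 30 or 60 frames per second.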


Why OpenCL + Intel® Media SDK?

What Intel® Media SDK Covers (high level): Decode → Process → Encode

OpenCL Covers: Fuller range/lower level

Media SDK provides optimized implementations for:
• Codecs
• Frame Processing Operations

For video processing tasks not in Media SDK’s scope, extend with OpenCL
• Make use of growing GPU capabilities
• Keep pipelines on GPU

Example uses: color conversions, custom bit rate control

Fixed Function Performance

Add your Innovation via GPGPU

Build Something Awesome!


Advanced Analysis can be Yours

Use Intel® VTune™ Amplifier** to analyze Intel® Media SDK & OpenCL™ optimized Media Applications
• GEN GPU engines utilization
• GPU hardware metrics over time
• Memory Bandwidth
• CPU Software Threads

**Available in Intel® Media Server Studio or as a standalone tool.


How to get the Intel® Media SDK

Intel® Media SDK - FREE

Platform / Device Targets
• Intel® Core™ or Core™ M processors
• Select SKUs of Intel® Celeron™, Pentium™ & Atom™ processors with Intel® HD Graphics supporting Intel® Quick Sync Video
• Client devices – Desktop/mobile applications

Download: software.intel.com/media-sdk

Intel® Media Server Studio – 3 Editions (includes Free Community)

Platform / Device Targets
• Select SKUs of Intel® Xeon® & Core™ processor-based platforms
• Applications for media, communications infrastructure, video processing/conferencing, digital surveillance, video cloud & data center
• For HEVC, AVC, MPEG-2, MPEG-Audio

Download: software.intel.com/intel-media-server-studio


More Resources

• Intel® Media SDK: software.intel.com/media-sdk

• Intel® Media Server Studio: software.intel.com/intel-media-server-studio

• Learn from Samples & Tutorials: github.com/Intel-Media-SDK/samples

• Ask questions at the forum: software.intel.com/forums/intel-media-sdk

• Webinar Replays


Legal Notices, Disclaimers & Optimization Notice

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

All information provided here is subject to change without notice. Contact your Intel representative, sales office or distributor to obtain the latest Intel product specifications and roadmaps.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

The cost reduction scenarios described in this document are intended to enable you to get a better understanding of how the purchase of a given Intel product, combined with a number of situation-specific variables, might affect your future cost and savings. Nothing in this document should be interpreted as either a promise of or contract for a given level of costs.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Intel, the Intel logo, Xeon, Core, Iris Pro, and VTune are trademarks of Intel Corporation in the U.S. and other countries. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804


Intel® Media Server Studio Editions – At a Glance

Feature/Component Community Edition** Essentials Edition Professional Edition

Intel® Media SDK

Graphics Drivers

Code Samples

OpenCL™ Code Builder and Runtime

Metrics Monitor (Linux* only)

Intel® Premier Support

HEVC Decoder & Encoder, GPU Assist APIs

Audio Decoder & Encoder

Video Quality Caliper

Intel® VTune™ Amplifier

Premium Telecine Interlace Reverser

Premium Components
