Altera SDK for OpenCL


  • © 2012 Altera Corporation—Public

    Altera SDK for OpenCL

    A novel SDK that opens up the world of FPGAs to today’s developers

    Altera Technology Roadshow 2013

  • Today’s News

    - Altera today announces its SDK for OpenCL
    - OpenCL allows software developers to boost system performance by using an FPGA’s massively parallel architecture
    - Increases designer productivity by raising the level of design abstraction

    Timeline (slide labels: 2010, 2011, 2012, 2012):
    - Altera joins Khronos Group
    - Altera works with proof-of-concept customers
    - Altera opens OpenCL Early Access Program
    - Altera announces SDK for OpenCL

  • Performance Challenges

    CPUs: single core, multiple cores, 100s of cores

    Performance wanted:
    - Multimedia: HD video processing, image processing
    - High-performance computing: financial modeling, big data analytics, scientific computing
    - Military: radar image processing, persistent surveillance
    - Medical: medical imaging, bioinformatics

    Accelerators: FPGAs, general-purpose GPUs, DSPs

  • Modern Altera FPGA: Massively Parallel

    - >1M logic elements
    - >3.9 billion transistors
    - >50 Mb of integrated memory
    - Variable-precision floating-point DSP blocks
    - High-speed serial transceivers

  • Altera Is Driving Silicon Convergence

    General Processors
    - Software programmable
    - Great flexibility
    - Poor power efficiency

    Application-Specific
    - Not programmable, hard wired
    - Inflexible
    - Great power efficiency
    - Many contain embedded processors

    FPGAs
    - Hardware and software programmable
    - Great flexibility
    - Good power efficiency

    = Microprocessor + DSP + Application-Specific IP + Programmable Fabric

    « Need for Flexibility: µP, DSP
    Need for Efficiency »: ASIC, ASSP
    FPGA combines the best of all four

  • Improving FPGA Development Productivity

    [Figure: development productivity (low to high) versus system performance (low to high) for standard CPU, multi-core CPU, GPU + CPU, FPGA + CPU, and custom ASIC]

    Need to design at higher levels

  • OpenCL for Heterogeneous Solutions

    - C-based language with extensions:
      - Standard C language
    - Altera OpenCL C extensions (adds parallelism to C)
    - API (open standard for different devices)
    - Programming model supports parallelism in heterogeneous systems
      - CPU/GPU/FPGA
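The data-parallel programming model named in the bullets above can be illustrated without any OpenCL toolchain. The following is a minimal Python sketch, not the OpenCL API or Altera's SDK: `vector_add_kernel` and `enqueue_nd_range` are hypothetical names chosen for illustration. They mimic how a host program launches one kernel instance per work-item, each identified by a global ID.

```python
from typing import Callable, List


def vector_add_kernel(gid: int, a: List[float], b: List[float], out: List[float]) -> None:
    """Body of one work-item: it handles only the element at index gid."""
    out[gid] = a[gid] + b[gid]


def enqueue_nd_range(kernel: Callable[..., None], global_size: int, *args) -> None:
    """Hypothetical stand-in for a kernel launch: run the kernel once per global ID.

    A real device may execute these work-items in parallel; this loop is
    sequential and only models the indexing scheme.
    """
    for gid in range(global_size):
        kernel(gid, *args)


# "Host program": prepare buffers, launch the kernel, read back the result.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
result = [0.0] * len(a)
enqueue_nd_range(vector_add_kernel, len(a), a, b, result)
print(result)
```

Because each work-item touches only its own index, the iterations are independent, which is the property that lets an accelerator run them concurrently.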

    [Figure: OpenCL host program + kernels pass through a compiler and target a processor and a hardware accelerator; code fragments show main(…) with a for( … ) loop. OpenCL logo: http://en.wikipedia.org/wiki/File:OpenCL_Logo.png]

  • Introducing Altera SDK for OpenCL

    [Figure: the OpenCL host program + kernels feed two flows: the SDK for OpenCL produces a binary programming file, and a standard C compiler produces an x86 executable file]

  • OpenCL Implementation on Altera FPGAs

    [Figure: block diagram. The OpenCL host program + kernels feed a standard C compiler (host program on the host processor) and the SDK for OpenCL (Altera FPGA). FPGA blocks shown: PCIe, memory interface controllers to external DDR memory, off-chip interconnect, on-chip memory interconnect, arbitration network, memory blocks, and three accelerator/datapath computation units. Annotation: "Placement & Routing Locked, Timing Closed".]

  • Accelerating Performance with SoC FPGAs

    Single-Chip OpenCL Solution:
    - SoC = ARM + FPGA

    Integration Enables:
    - Higher bandwidth and lower latency between FPGA and processor (>125 Gbps interconnect)
    - Processor integration reduces system cost

    [Figure: FPGA connected to the hard processor system (ARM Cortex-A9) by a >125 Gbps interconnect]

  • OpenCL Example #1: Faster Time-to-Market

    Video Camera Requiring Intense Video Processing
    - Proprietary video codec algorithms
    - Captures frames with different exposure levels
      - Retains highlight and shadow details

    Let Customer Implement Code In An FPGA In

  • OpenCL Example #2: Higher Performance

    Financial Marketplace: Monte Carlo Black-Scholes Simulations
    - Calculate the value of trading options with multiple sources of uncertainty
    - FPGA delivers higher performance at a fraction of the power
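To make the workload concrete, here is a minimal Python sketch of a Monte Carlo Black-Scholes estimate. It is not the benchmark code behind the slide, which is not shown: it prices a single European call with one source of uncertainty, and every parameter value (spot 100, strike 100, rate 5%, volatility 20%, one year, 100,000 paths) is an illustrative assumption. The closed-form price is included only as a sanity check.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call under geometric Brownian motion."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)               # one standard normal draw per path
        s_t = s0 * math.exp(drift + vol * z)  # simulated terminal stock price
        payoff_sum += max(s_t - strike, 0.0)  # call payoff at maturity
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def bs_european_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes call price, used here only as a reference."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


# Illustrative parameters only (assumptions, not taken from the slides).
estimate = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000)
reference = bs_european_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"Monte Carlo: {estimate:.3f}, closed form: {reference:.3f}")
```

Every simulated path is independent of the others, so the loop body maps naturally onto parallel hardware, whether that is many GPU threads or a deep FPGA pipeline.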

    >9X Higher Performance vs CPU Alone

    [Figure: stock price (8.0 to 12.0) versus time (0.00 to 1.00)]

    OpenCL MCBS               Quad-core µP   Comparable GPU   Stratix IV FPGA
    Number of Cores           8              448              N/A
    Simulations per Second    240M           2,100M           2,200M
    Power (Watts)             130            215              21
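The ">9X" headline can be checked against the table with simple arithmetic. The short Python sketch below uses only the simulation rates and power figures from the table; the dictionary keys and the derived metric name `msims_per_joule` (millions of simulations per joule) are labels introduced here for illustration.

```python
# Figures copied from the table: (simulations per second, power in watts).
platforms = {
    "cpu": (240e6, 130.0),    # quad-core microprocessor
    "gpu": (2100e6, 215.0),   # comparable GPU
    "fpga": (2200e6, 21.0),   # Stratix IV FPGA
}

cpu_rate = platforms["cpu"][0]
summary = {}
for name, (sims_per_s, watts) in platforms.items():
    summary[name] = {
        # Throughput relative to the CPU-only baseline.
        "speedup_vs_cpu": sims_per_s / cpu_rate,
        # Work per unit energy: (simulations/s) / (joules/s), in millions.
        "msims_per_joule": sims_per_s / watts / 1e6,
    }
    print(f"{name}: {summary[name]['speedup_vs_cpu']:.2f}x CPU rate, "
          f"{summary[name]['msims_per_joule']:.1f} Msim/J")
```

The FPGA-to-CPU throughput ratio is 2,200M / 240M, a little over 9, which matches the slide's claim; dividing by power shows how much wider the gap becomes per joule.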


  • OpenCL Example #3: Power Efficiency

    Document Search / Filtering Algorithm
    - Review incoming stream (documents) and return best match
    - E.g., monitors news feeds and recommends other documents
    - Power savings = cost savings (huge issue for server farms)

    >5X Performance/Watt vs. GPU

    Platform                    Quad-core µP   Comparable GPU   Stratix IV FPGA
    Number of Cores             6              448              N/A
    Performance/Watt (MT/J)     15.9           15.1             83.6
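The ">5X" figure follows directly from the performance-per-watt row. A short Python check, using only the three table values (the variable names are illustrative):

```python
# Performance per watt in MT/J, copied from the table.
perf_per_watt = {"cpu": 15.9, "gpu": 15.1, "fpga": 83.6}

fpga_vs_gpu = perf_per_watt["fpga"] / perf_per_watt["gpu"]
fpga_vs_cpu = perf_per_watt["fpga"] / perf_per_watt["cpu"]

# For a fixed amount of work, energy scales as the inverse of work per joule.
fpga_energy_fraction_of_gpu = perf_per_watt["gpu"] / perf_per_watt["fpga"]

print(f"FPGA vs GPU: {fpga_vs_gpu:.2f}x, FPGA vs CPU: {fpga_vs_cpu:.2f}x")
print(f"FPGA energy for the same work: {fpga_energy_fraction_of_gpu:.0%} of the GPU's")
```

Both ratios come out above 5, so on these numbers the FPGA would need less than a fifth of the GPU's energy for the same amount of work.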


  • Benefits of Altera OpenCL for FPGA

    - Superior Design Productivity
      - Quick and easy evaluation of different solutions
      - Fast development / debug / optimization cycles
      - Faster time-to-market
    - Higher Performance
      - >9X greater performance vs. CPU alone running a Monte Carlo Black-Scholes simulation
    - Improved Power Savings
      - >5X performance/Watt vs. GPU-based heterogeneous systems running a document search algorithm
    - Greater Portability
      - Reuse across multiple platforms, multiple generations

    [Figure: SDK for OpenCL flow (binary programming file, standard C compiler)]