Intel Tools for High Performance Parallel...

Software & Services Group, Developer Products Division Copyright© 2012 Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. Intel® Software Development Products for High Performance Computing and Parallel Programming Multicore development tools with extensions to many-core


Page 1: Intel Tools for High Performance Parallel Programmingcommunity.hartree.stfc.ac.uk/access/content/group... · Performance, Energy Efficiency, Reliability, TCO SOLID STATE DISK Optimize


Intel® Software Development Products for High Performance Computing and Parallel Programming

Multicore development tools with extensions to many-core


Notices

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products in the design phase of development. All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.

Intel, VTune, Cilk, Xeon and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

*Other names and brands may be claimed as the property of others

Copyright© 2012 Intel Corporation. All rights reserved.


Table of Contents

•Intel in HPC

•Intel HPC Software Development Products

•High Performance Parallel Programming with Intel’s architectures

•Features and benefits

– Investment protection

– Better performance & efficiency

• Call to Action and Summary


Intel in High-Performance Computing

A long term commitment to the HPC market segment

• Large Scale Clusters for Test & Optimization
• Tera-Scale Research
• Leading Performance, Energy Efficient
• Platform Building Blocks
• Dedicated, Renowned Applications Expertise
• Broad Software Tools Portfolio
• Defined HPC Application Platform
• Many Integrated Core Architecture
• Manufacturing Process Technologies
• Exa-Scale Labs


Intel Technology is Changing HPC Performance, Energy Efficiency, Reliability, TCO

A platform approach to high performance

• SOLID STATE DISK: Optimize Performance for I/O Intensive Apps and Boot Drive Replacement
• 10GbE: Bridging the Gap Between 1GbE and InfiniBand*, with RDMA, Unified Networking
• PROCESSORS (Xeon®, MIC): Scalable Performance and Energy Efficiency, Multi- and Many-Core


The Majority of all HPC-Systems are Clusters (Source: IDC)

[Figure: cluster nodes (P, M, I/O, cores, co-processor) connected by an Interconnect]

• Message Passing between/inside Nodes
• Multi-Threading within each (SMP) Node
• Vectorization (SIMD) within each Core


Software Development and System Environment

• Intel® Xeon® Processor and Intel® Many Integrated Core Architecture (die sizes not to scale)
• Linux*: Established HPC Operating System
• Same Comprehensive Set of SW Tools: Application Source Code Builds with a Compiler Switch


Scaling Performance Forward: Software Tools Vision

Employ versatile and common development tools across all IA architectures

• Single Portable Software Stack
• Flexible Programmability
• Scalable Performance
• Data-Parallelism
• Thread-Parallelism
• Messaging


High Performance Parallel Programming

Features and Benefits: Details


Enabling & Advancing Parallelism: High Performance Parallel Programming

Develop & Parallelize Today for Maximum Performance

Use One Software Architecture Today. Scale Forward Tomorrow.

[Figure: Code, built with Compiler, Libraries and Parallel Models, targets Multicore (Multicore CPU), Many-core (Intel® MIC Architecture Co-processor), Multicore Cluster, and Multicore & Many-core Cluster]


More cores. Wider vectors. Co-Processors. Tools need to access all three dimensions to deliver performance.

Processor                                           Core(s)  Threads  SIMD Width  Instruction set
Intel® Xeon® processor 64-bit                       1        2        128         SSE2
Intel® Xeon® processor 5100 series                  2        2        128         SSSE3
Intel® Xeon® processor 5500 series                  4        8        128         SSE4.2
Intel® Xeon® processor 5600 series                  6        12       128         SSE4.2
Intel® Xeon® processor code-named Sandy Bridge      8        16       256         AVX
Intel® Xeon® processor code-named Ivy Bridge        -        -        256         AVX
Intel® Xeon® processor code-named Haswell           -        -        256         AVX2, FMA3
Intel® MIC co-processor code-named Knights Ferry    32       128      512         -
Intel® MIC co-processor code-named Knights Corner   >50      >200     512         -

(- = not given on the slide. Images do not reflect actual die sizes.)

Software challenge: Develop scalable software
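The SIMD widths listed for these processors translate directly into the number of data elements a single vector instruction can process. A minimal sketch of that arithmetic in C follows; the helper name `simd_lanes` is hypothetical and not part of any Intel tool.

```c
#include <stddef.h>

/* Hypothetical helper: how many elements of the given size (in bytes)
   fit into one SIMD register of the given width (in bits). */
size_t simd_lanes(size_t simd_width_bits, size_t element_size_bytes)
{
    return simd_width_bits / (8 * element_size_bytes);
}
```

Assuming 4-byte `float` and 8-byte `double`, a 128-bit register holds 4 floats, a 256-bit register holds 8 floats, and a 512-bit register holds 16 floats or 8 doubles, which is why code has to vectorize well to benefit from the wider units.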


High Performance Software Products Supporting Multicore and Many-core Development

Intel® Cluster Studio XE* Distributed Performance

Intel® Parallel Studio XE* Advanced Performance

Intel® Trace Analyzer and Collector

Intel® MPI Library

Intel® Inspector XE, Intel® VTune™ Amplifier XE, Intel® Advisor

Intel® C/C++ and Fortran Compilers w/OpenMP

Intel® MKL, Intel® Cilk Plus, Intel® TBB Library, Intel® ArBB Library, Intel® IPP Library

Intel® Parallel Studio XE

Performance. Scale Forward. Proven


Invest in Common Tools and Programming Models

Multicore
• Intel® Xeon® processors are designed for intelligent performance and smart energy efficiency
• Continuing to advance Intel® Xeon® processor family and instruction set (e.g., Intel® AVX, etc.)

+

Many-core
• Intel® MIC Architecture co-processors are ideal for highly parallel computing applications
• Software development platforms ramping now

Use One Software Architecture Today. Scale Forward Tomorrow.


Optimized Intel Libraries

Intel® MKL (Math Kernel Library)
• Science, Engineering and Financial applications oriented
• Incl. BLAS, LAPACK, ScaLAPACK, Sparse Solvers, Fast Fourier Transforms, Vector Math

Intel® IPP (Integrated Performance Primitives)
• Multimedia, Data Processing, and Communications applications oriented
• Cryptography and String Processing


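Go Parallel with High Performance Math Kernel Library: Intel® Math Kernel Library (Intel® MKL)

Intel® Xeon® processor, Intel® MIC co-processor

    #include <mkl.h>

    /* Intel® Math Kernel Library: C = alpha*A*B + beta*C for N-by-N matrices */
    void foo(float *A, float *B, float *C, MKL_INT N)
    {
        char transa = 'N', transb = 'N';   /* example values: no transposition */
        float alpha = 1.0f, beta = 0.0f;   /* example values */
        sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
    }

Implicit automatic offloading requires no code changes: simply link with the offload MKL Library.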
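To make clear what the `sgemm` call computes, here is a plain-C sketch of the same operation for square matrices without transposition. It is a reference loop only, not how MKL implements it; the name `sgemm_reference` is hypothetical, and column-major storage is assumed because `sgemm` follows the Fortran BLAS convention.

```c
#include <stddef.h>

/* Hypothetical reference routine (not an MKL function): computes
   C = alpha*A*B + beta*C for n-by-n matrices stored column-major,
   i.e. element (i,j) lives at index i + j*n. */
void sgemm_reference(size_t n, float alpha, const float *A,
                     const float *B, float beta, float *C)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < n; i++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += A[i + k * n] * B[k + j * n];
            C[i + j * n] = alpha * acc + beta * C[i + j * n];
        }
    }
}
```

A tuned library replaces this triple loop with blocked, vectorized and multi-threaded code, which is why the single library call is the preferred form.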


Go Parallel with Intel® Cilk™ Plus


• Proven Cilk parallel model, teachable in one minute

– Parallelism in Three Key Words: • cilk_spawn • cilk_sync • cilk_for

• Cilk™ Plus: an open specification

– Recently placed into open source by Intel for the advancement of parallel programming

Learn more at http://cilkplus.org

    // Parallel function invocation, in C (requires #include <cilk/cilk.h>)
    cilk_for (int i = 0; i < n; ++i) {
        Foo(a[i]);
    }

    // Parallel spawn in a recursive Fibonacci computation, in C
    int fib(int n)
    {
        if (n < 2)
            return 1;
        int x, y;
        x = cilk_spawn fib(n - 1);   // may run in parallel with the next call
        y = fib(n - 2);
        cilk_sync;                   // wait for the spawned call before reading x
        return x + y;
    }
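A useful property of the Cilk keywords is that removing them leaves an ordinary serial C program that computes the same result. The sketch below illustrates this by defining `cilk_spawn` and `cilk_sync` as empty macros, an assumption made here only so the code builds with any C compiler; with a Cilk Plus compiler you would include `<cilk/cilk.h>` instead.

```c
/* Assumption for illustration: define the Cilk keywords away so this
   builds as plain serial C. A Cilk Plus build would drop these two
   macros and include <cilk/cilk.h>. */
#define cilk_spawn
#define cilk_sync

int fib(int n)
{
    if (n < 2)
        return 1;                    /* base case used on the slide */
    int x, y;
    x = cilk_spawn fib(n - 1);       /* serial here; parallel under Cilk Plus */
    y = fib(n - 2);
    cilk_sync;                       /* expands to an empty statement here */
    return x + y;
}
```

Because the serial version is valid on its own, it can be debugged and tested first and then compiled with the keywords enabled.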


Go Parallel with Intel® Cilk™ Plus

• Data and Task Parallelism as first class citizens in C and C++
– Vectorization via intuitive notations that automatically span MMX, SSE, AVX, and wider widths in the future, including those in MIC co-processors
  • array notations
  • #pragma simd controls
  • elemental functions

    // #pragma simd: user-mandated vectorization
    #pragma simd
    for (i = 0; i < n; i++) {
        A[i] = A[i] + B[i] + C[i];
    }

    // Simplify operation using array notations in C/C++:
    a[:] = b[:] + c[:];

    // Elemental function, in C, using Cilk Plus:
    __declspec (vector) float saxpy(float a, float x, float y)
    {
        return a * x + y;   // computed per element when called on array sections
    }

Learn more at http://cilkplus.org
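For readers without a Cilk Plus compiler, the array-notation statement `a[:] = b[:] + c[:]` has the same meaning as the element-wise loop sketched below in standard C99. The function name `add_arrays` is hypothetical; `restrict` is used on the assumption that the three arrays do not overlap, which is the kind of guarantee a compiler needs to vectorize the loop.

```c
#include <stddef.h>

/* Scalar equivalent of a[:] = b[:] + c[:] for arrays of n floats.
   `restrict` (C99) promises the arrays do not overlap. */
void add_arrays(float *restrict a, const float *restrict b,
                const float *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

The array notation states the same intent in one line and leaves the choice of vector width to the compiler.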


Go Parallel with Intel® Threading Building Blocks (Intel® TBB)

• A popular parallel abstraction for C++ developers

– A C++ template library

– Scalable memory allocation

– Load-balancing

– Work-stealing task scheduling

– Thread-safe pipeline

– Concurrent containers

– High-level parallel algorithms

– Numerous synchronization primitives

• Intel remains a leading participant and contributor in the TBB open source project as well as a leading supplier of TBB support and supporting tools

    // Parallel function invocation example, in C++, using TBB:
    #include <tbb/parallel_for.h>

    tbb::parallel_for(0, n, [=](int i) {
        Foo(a[i]);
    });

Learn more at http://threadingbuildingblocks.org


Go Parallel with Message Passing Interface: Intel® Message Passing Interface (Intel® MPI)

• Extend your cluster solutions to the Intel® MIC Architecture
– E.g., Intel MIC in every node of the cluster using Intel® MPI and Intel® Parallel Building Blocks on nodes
– Same model as an Intel® Xeon® processor based cluster
• Intel is a leading vendor of MPI implementations and tools

[Figure: Multicore Cluster; Clusters with Multicore and Many-core]

Learn more at http://intel.com/go/mpi


Go Parallel with Coarray Fortran: Intel® Fortran Compiler

• A standard, explicit notation for data decomposition, such as that often used in message-passing models, expressed in a natural Fortran-like syntax.

• For parallel programming on both shared memory and distributed memory systems

    ! Sum in Fortran, using the coarray feature:
    REAL :: SUM[*]

    SYNC ALL
    DO IMG = 2, NUM_IMAGES()
       IF (IMG == THIS_IMAGE()) THEN
          SUM = SUM + SUM[IMG-1]   ! add the running total of the previous image
       END IF
       SYNC ALL                    ! image IMG must finish before IMG+1 reads it
    END DO

Learn more at http://intel.com/software/products


Go Parallel with OpenMP* Intel® C/C++ and Fortran Compilers

• A flexible interface for developing parallel applications

– An abstraction for multi-threaded solutions

• OpenMP* is a standard used by many parallel applications

– Supported by every major compiler for Fortran, C, and C++

// C/C++ OpenMP* Pragma

    #pragma omp parallel for reduction(+:pi)
    for (i = 0; i < count; i++) {
        float t = (float)((i + 0.5) / count);
        pi += 4.0 / (1.0 + t * t);
    }
    pi /= count;

! Fortran OpenMP*

    !$omp parallel do
    do i = 1, 10
       A(i) = B(i) * C(i)
    enddo
    !$omp end parallel do

Learn more at http://openmp.org
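The C fragment on this slide omits its declarations. A self-contained sketch of the same reduction is shown below; the function name `compute_pi` is hypothetical, and `double` is used throughout for accuracy. The loop is a midpoint-rule estimate of the integral of 4/(1+x²) over [0,1], which equals pi.

```c
/* Midpoint-rule estimate of pi. Each thread accumulates a private
   partial sum; reduction(+:pi) combines them at the end of the loop.
   Built without OpenMP support, the pragma is ignored and the loop
   runs serially with the same result. */
double compute_pi(long count)
{
    double pi = 0.0;
    long i;

    #pragma omp parallel for reduction(+:pi)
    for (i = 0; i < count; i++) {
        double t = (i + 0.5) / count;   /* midpoint of the i-th interval */
        pi += 4.0 / (1.0 + t * t);
    }
    return pi / count;
}
```

Without the `reduction` clause, all threads would update the shared `pi` at once and the result would be unreliable, so the clause is what makes the one-line parallelization correct.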


Go Parallel with OpenMP*: Intel® C/C++ and Fortran Compilers

Intel® Xeon® processor, Intel® MIC co-processor

One Line Change to Offload to MIC Co-Processor

    #include <stdio.h>
    #define N 1000000   /* example value; N is not defined on the slide */

    int main(void)
    {
        double pi = 0.0;
        long i;

        #pragma offload target (mic)              /* the one added line */
        #pragma omp parallel for reduction(+:pi)
        for (i = 0; i < N; i++) {
            double t = (double)((i + 0.5) / N);
            pi += 4.0 / (1.0 + t * t);
        }
        printf("pi = %f\n", pi / N);
        return 0;
    }


Go Parallel with C/C++ Language Extensions

• Simple Keyword Language Extensions to control offloading to MIC co-processor

C/C++ Language Extensions

    class _Shared common {
    public:                      // process() is called from outside the class
        int data1;
        char *data2;
        class common *next;
        void process();
    };

    _Shared class common obj1, obj2;

    …
    _Cilk_spawn _Offload obj1.process();
    _Cilk_spawn obj2.process();


Use the Same Code for Execution on Intel® MIC Architecture by Offloading

C/C++ Offload Pragma

    #pragma offload target (mic)
    #pragma omp parallel for reduction(+:pi)
    for (i = 0; i < count; i++) {
        float t = (float)((i + 0.5) / count);
        pi += 4.0 / (1.0 + t * t);
    }
    pi /= count;

MKL Implicit Offload

//MKL implicit offload requires no source code changes, simply link with the offload MKL Library.

MKL Explicit Offload

    #pragma offload target (mic) \
        in(transa, transb, N, alpha, beta) \
        in(A:length(matrix_elements)) \
        in(B:length(matrix_elements)) \
        in(C:length(matrix_elements)) \
        out(C:length(matrix_elements) alloc_if(0))
    sgemm(&transa, &transb, &N, &N, &N, &alpha,
          A, &N, B, &N, &beta, C, &N);

Fortran Offload Directive

    !dir$ omp offload target(mic)
    !$omp parallel do
    do i = 1, 10
       A(i) = B(i) * C(i)
    enddo
    !$omp end parallel do

C/C++ Language Extensions

    class _Shared common {
    public:                      // process() is called from outside the class
        int data1;
        char *data2;
        class common *next;
        void process();
    };

    _Shared class common obj1, obj2;

    _Cilk_spawn _Offload obj1.process();
    _Cilk_spawn obj2.process();


Parallelism with OpenCL*: Intel® OpenCL SDK

• OpenCL* is a framework for writing programs that execute across heterogeneous platforms (e.g., CPUs, GPUs, many-core)

• Intel is a leading participant in the OpenCL* standard efforts, and a vendor of solutions and related tools with early implementations available today.

• OpenCL* addresses the needs of customers in specific segments

    // Simple per-element multiplication using OpenCL*:
    kernel void dotprod(global const float *a,
                        global const float *b,
                        global float *c)
    {
        int myid = get_global_id(0);
        c[myid] = a[myid] * b[myid];
    }

Learn more at http://intel.com/go/opencl
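An OpenCL* kernel body runs once per work-item, and `get_global_id(0)` tells each work-item which element it owns. As a rough mental model only, launching the kernel over a one-dimensional range of `n` work-items produces the same values as the serial C loop sketched below. The function name `multiply_elements` is hypothetical, and the sketch says nothing about how an OpenCL runtime schedules the work.

```c
#include <stddef.h>

/* Serial model of the kernel: iteration `id` plays the role of the
   work-item whose get_global_id(0) returns `id`. */
void multiply_elements(const float *a, const float *b, float *c, size_t n)
{
    for (size_t id = 0; id < n; id++)
        c[id] = a[id] * b[id];
}
```

Because no iteration depends on another, the runtime is free to execute the work-items in any order and on any available device.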


Running your Application: Execution on the host and Intel® MIC Co-processor(s)

[Figure: Host Program with Host Offload Library and Message Library; Target Program with Target Offload Library and Message Library]

Without: Intel® MIC Co-processor(s) are absent
• Application starts and executes on host
• At each offload, the construct runs on host cores/threads
• Normal program termination on host

With: Intel® MIC Co-processor(s) are present
• Application starts on host and executes portions on Intel® MIC Co-processor(s)
• At runtime, if Intel® MIC Co-processor(s) are available, the target binary is loaded
• At each offload, the construct runs on the Intel® MIC Co-processor(s)
• At program termination, target binary is unloaded


Using the Intel® Debugger: Overview

• Debugging of host and target simultaneously

• If host application is being debugged, target application is also debugged automatically

• Debugger runs on host for both host and target program

• Debugger halts and resumes both host and target program synchronously

• Full C, C++ and Fortran support on both sides

• Future: debugger presents view of one virtual application inside a single GUI

• Extensible to cover more than one offload card

[Figure: Intel® Debugger attached to the Host Program and, through an Intel® Debug Server each, to the Target Programs]


Analyzing your Application: Performance Analysis Tools

• Intel® VTune™ Amplifier XE performance profiler

– Analyze your multicore and many-core performance

• Analyze performance of the application in offload mode

• Support for Intel® MIC Co-processors includes:

– A Linux* hosted command line tool that collects events

– The VTune™ Amplifier XE graphical user interface, which displays the results collected in the previous step and highlights bottlenecks, time spent, and other performance details


Preserve Your Development Investment: Common Tools and Programming Models for Parallelism

Targets: Multicore, Many-core, Heterogeneous Computing

• C/C++ (Intel® C/C++ Compiler): Intel® Cilk Plus, Intel® TBB, Offload Pragmas, OpenCL*, OpenMP*
• Fortran (Intel® Fortran Compiler): OpenMP*, Coarray, Offload Directives
• Intel® MPI
• Intel® MKL

Develop Using Parallel Models that Support Heterogeneous Computing


Call to Action

• Evaluate the Intel® Software Development Products, including the family of Parallel Programming Models, for your High Performance needs:

http://www.intel.com/software/products/eval

• For product information see:

http://www.intel.com/software/products




Optimization Notice
