S9881 Using Industry Standard Benchmark Tools to Size Graphics Accelerated Applications

Mike Brennan, Product Manager, Virtual Client Computing and Graphics

Vadim Lebedev, Technical Marketing Engineer

21 March 2019

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

What we will cover

• The basics of virtualizing pro graphics apps

• How do you measure performance?

• Key NVIDIA cards

• Sample benchmark performance

• Server/GPU performance

• Where do I start with sizing?

• Cisco lineup

• Key takeaways

• Q&A

Let's talk basics

© 2019 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Can you virtualize Catia and SolidWorks?

Yes, you can!

Cisco has 14 hardware/software combinations certified

Dassault VDI Certifications

Why replace physical graphics workstations?

• OPEX

• CAPEX

Key performance requirements for Virtual Workstations

• User requirements
• Software requirements
• Display resolution
• Monitors
• CPU and memory performance
• Graphics card oversubscription
• Frame rate (FPS)
• Multi-user graphics card scheduling engine

User requirements

• User role
• Concurrent applications open
• Complexity of graphic application
• Collaboration requirements
• Working hours

User roles

Light user type
• Primarily read only: documentation, project managers
• Small subsets of entire entity

Medium user type
• Read only and design
• Small and medium sub-assemblies

Heavy user type
• Design and render
• Large sub-assemblies and full model

Software requirements

Dassault minimum requirements

Dassault support for virtualization

Dassault hardware qualification

Dassault delivery partners

Dassault support model

Display resolution and monitors

Maximum display resolution supported

Display resolution expected by user type

Number of monitors per user supported

Number of monitors expected by user type


CPU and memory performance

• Balance

• Frequency

• Total frequency in MHz/CPU and server

• Core count

• Planned user count

CPU selection criteria

Memory selection criteria
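The "total frequency in MHz/CPU and server" criterion is simple arithmetic. The sketch below assumes that total frequency means cores times base clock, summed over sockets. The core counts and base clocks come from Intel's public specifications for the Xeon Gold parts named elsewhere in this deck, not from the slides, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int     # physical cores per socket (ASSUMPTION: public Intel spec)
    base_mhz: int  # base clock in MHz (ASSUMPTION: public Intel spec)


def total_mhz_per_cpu(cpu: Cpu) -> int:
    """Total frequency of one socket: cores x base clock."""
    return cpu.cores * cpu.base_mhz


def total_mhz_per_server(cpu: Cpu, sockets: int = 2) -> int:
    """Total frequency of a server with identical sockets."""
    return total_mhz_per_cpu(cpu) * sockets


def mhz_per_user(cpu: Cpu, users: int, sockets: int = 2) -> float:
    """Average CPU frequency budget per planned user."""
    if users <= 0:
        raise ValueError("users must be positive")
    return total_mhz_per_server(cpu, sockets) / users


XEON_6140 = Cpu("Xeon Gold 6140", cores=18, base_mhz=2300)
XEON_6136 = Cpu("Xeon Gold 6136", cores=12, base_mhz=3000)
XEON_6128 = Cpu("Xeon Gold 6128", cores=6, base_mhz=3400)

if __name__ == "__main__":
    for cpu in (XEON_6140, XEON_6136, XEON_6128):
        print(cpu.name, total_mhz_per_cpu(cpu), total_mhz_per_server(cpu))
```

A high core count raises the total, while a high base clock raises what a single lightly threaded graphics application sees, so both numbers are worth looking at against the planned user count.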

Graphics card oversubscription

User count per graphics card
• Fixed at GPU frame buffer divided by vGPU profile
• For an NVIDIA P4 card with a 2Q profile: 8 GB frame buffer / 2 GB frame buffer per user = 4 users per card

GPU oversubscription
• NVIDIA concept
• Based on scheduler chosen
• For the T4 card, a light user could get more than 12.5% of GPU resources
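The user-count rule on this slide can be sketched in a few lines. The P4 figures (8 GB card, 2 GB profile) are from the slide. The 16 GB frame buffer used for the T4 is an assumption taken from NVIDIA's public specification, and it is what makes a 2Q user's equal share come out at 12.5%. Function names are hypothetical.

```python
def users_per_card(frame_buffer_gb: int, profile_gb: int) -> int:
    """Users per card = GPU frame buffer divided by the vGPU profile size."""
    if profile_gb <= 0 or frame_buffer_gb < profile_gb:
        raise ValueError("profile must be positive and fit on the card")
    return frame_buffer_gb // profile_gb


def fair_share_percent(frame_buffer_gb: int, profile_gb: int) -> float:
    """Share of GPU resources per user if the card is split evenly."""
    return 100.0 / users_per_card(frame_buffer_gb, profile_gb)


if __name__ == "__main__":
    # P4 example from the slide: 8 GB card, 2Q (2 GB) profile.
    print(users_per_card(8, 2))
    # T4 with a 2Q profile; 16 GB is an ASSUMPTION from the public spec.
    print(users_per_card(16, 2), fair_share_percent(16, 2))
```

The frame buffer split is fixed, but the share of GPU compute is not: depending on the scheduler chosen, a user can receive more than this even share when other users are idle.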

Frame rates

FPS
• For computer video displays, frame rate = number of frames (images) displayed per second

The great equalizer for performance
• For a given application, frame rate:
  • Provides a mechanism to compare system performance
  • Describes a mechanism by which system requirements can be stated

Virtual Graphics Workstation insights
• Frame rate can be controlled, or not
• Frame rate can be set in the NVIDIA and desktop broker software policy
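As a minimal illustration of the frame rate definition, the sketch below computes an average FPS value from a list of frame timestamps. The function name and the timestamp input are hypothetical; benchmark tools report this figure themselves.

```python
from typing import Sequence


def average_fps(frame_times_s: Sequence[float]) -> float:
    """Average frames per second from increasing frame timestamps (seconds).

    N timestamps delimit N - 1 frame intervals, so the rate is
    (N - 1) / (last timestamp - first timestamp).
    """
    if len(frame_times_s) < 2:
        raise ValueError("need at least two timestamps")
    elapsed = frame_times_s[-1] - frame_times_s[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return (len(frame_times_s) - 1) / elapsed


if __name__ == "__main__":
    # Three frames half a second apart: two intervals in one second.
    print(average_fps([0.0, 0.5, 1.0]))
```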

Multi-user graphics cards scheduling engines

NVIDIA supports 3 models

Best effort (default)
• User gets GPU resources based on current availability
• At any given point in time a user might get more than their fair share of GPU

Equal share
• Each VM gets an equal share of the GPU resources

Fixed share
• Each user gets the same dedicated performance at all times
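The difference between the three models can be shown with a deliberately simplified model. It is a sketch under stated assumptions, not NVIDIA's scheduler: it assumes fixed share divides the GPU by the maximum number of vGPUs the profile allows on the card, equal share divides it by the vGPUs currently running, and best effort divides it among the VMs that are actually busy. All names are hypothetical.

```python
def gpu_share(policy: str, max_vgpus: int, running: int, active: int) -> float:
    """Approximate fraction of GPU time available to one busy VM.

    max_vgpus: most vGPUs of this profile the card can host
    running:   vGPUs currently powered on
    active:    running VMs that are submitting GPU work right now
    """
    if not 1 <= active <= running <= max_vgpus:
        raise ValueError("expected 1 <= active <= running <= max_vgpus")
    if policy == "fixed_share":
        # Same dedicated slice at all times, regardless of neighbours.
        return 1.0 / max_vgpus
    if policy == "equal_share":
        # Equal slice for every running VM, busy or idle.
        return 1.0 / running
    if policy == "best_effort":
        # Idle VMs give up their time, so a busy VM may exceed its fair share.
        return 1.0 / active
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    for policy in ("fixed_share", "equal_share", "best_effort"):
        print(policy, gpu_share(policy, max_vgpus=8, running=4, active=2))
```

With eight possible vGPUs, four running and two busy, the model gives a busy VM 12.5%, 25% and 50% of the GPU respectively, which is why best effort tends to look best in lightly loaded tests.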

How do you measure performance?

Performance measurement

Industry graphic benchmark examples
• SPECviewperf 13
• PassMark Software
• Unigine Heaven, Valley, etc.
• Others

SPECviewperf 13
• Supports nine virtual professional graphics applications
• Provides a composite benchmark score across all nine applications
• Provides the capability to score individual applications
• Provides the ability to measure performance across various graphics card, CPU, memory, scheduling and frame rate scenarios

SPECviewperf 13 has the following minimum requirements:

Microsoft Windows 10 64-bit RS3 or later VM

OpenGL 4.0

DirectX 12 support

8GB of installed system memory

80GB available disk space

1920x1080 screen resolution for submissions published on the SPEC website
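A test VM can be checked against these minimums before a run. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and mapping "RS3" to Windows 10 version 1709 is an assumption not stated on the slide.

```python
from typing import Dict, List

MIN_WINDOWS_RELEASE = 1709  # ASSUMPTION: "RS3" corresponds to version 1709
MIN_OPENGL = (4, 0)
MIN_DIRECTX = 12
MIN_MEMORY_GB = 8
MIN_FREE_DISK_GB = 80
SUBMISSION_RESOLUTION = (1920, 1080)


def unmet_requirements(vm: Dict) -> List[str]:
    """Return the SPECviewperf 13 minimums that the described VM misses."""
    problems = []
    if vm["windows_release"] < MIN_WINDOWS_RELEASE:
        problems.append("Windows 10 64-bit RS3 or later")
    if tuple(vm["opengl"]) < MIN_OPENGL:
        problems.append("OpenGL 4.0")
    if vm["directx"] < MIN_DIRECTX:
        problems.append("DirectX 12 support")
    if vm["memory_gb"] < MIN_MEMORY_GB:
        problems.append("8GB of installed system memory")
    if vm["free_disk_gb"] < MIN_FREE_DISK_GB:
        problems.append("80GB available disk space")
    if tuple(vm["resolution"]) != SUBMISSION_RESOLUTION:
        problems.append("1920x1080 resolution (for published submissions)")
    return problems


if __name__ == "__main__":
    vm = {"windows_release": 1809, "opengl": (4, 6), "directx": 12,
          "memory_gb": 16, "free_disk_gb": 120, "resolution": (1920, 1080)}
    print(unmet_requirements(vm))
```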


Applications driving large TAM, Verticals

Engineering 2D/3D design, video

Oil exploration

Energy (OpendTect)

Medical

ImageVis3D

Manufacturing, Auto

Oil & Gas

Utilities, SLED, Arch/Design/Constr

SPECviewperf 13 Test Console

SPECviewperf 13 Test Results - Composite

SPECviewperf 13 Test Results - Configuration

SPECviewperf 13 Test Results – Viewset Catia

NVIDIA Tesla T4 and P40


Tesla T4 Key Specifications

Tesla P6 Key Specifications


Tesla P40 Key Specifications


Example Benchmark Insights


SPECviewperf 13 benchmark results

Compare three cards, two profiles, 1 VM, best effort, FRL On, Xeon 6140

1 VM 4Q on XenServer Host

Viewset        P4-4Q    P40-4Q   P6-4Q
3dsmax-06      25.74    26.66    24.66
catia-05       62.93    62.68    37.89
creo-02        49.69    49.97    27.46
energy-02      10.78    9.55     7.42
maya-05        60.85    63.04    34.74
medical-02     31.21    42.94    32.19
showcase-02    33.57    33.02    28.79
snx-03         64.1     64.59    51.1
sw-04          44.12    44.52    34.61

1 VM 8Q on XenServer Host

Viewset        P4-8Q    P40-8Q   P6-8Q
3dsmax-06      26.06    26.74    25.96
catia-05       63.1     63.01    62.99
creo-02        49.86    52.76    48.61
energy-02      10.95    20.72    11.82
maya-05        60.92    62.65    61.37
medical-02     31.46    42.84    34.09
showcase-02    33.16    32.36    32.63
snx-03         63.24    63.92    64.11
sw-04          44.77    44.31    41.46

SPECviewperf 13 benchmark results

Compare one card, two profiles, 1 VM & max VMs, best effort and FRL On, Xeon 6140

P4 4Q profile testing and P4 8Q profile testing

Viewset        P4-4Q-1vm   P4-4Q-12vm   P4-8Q-1vm   P4-8Q-6vm
3dsmax-06      25.74       23.86        26.06       25.13
catia-05       62.93       57.92        63.1        61.42
creo-02        49.69       45.24        49.86       49.24
energy-02      10.78       2.85         10.95       10.56
maya-05        60.85       47.8         60.92       59.19
medical-02     31.21       16.21        31.46       30.33
showcase-02    33.57       24.59        33.16       34.16
snx-03         64.1        62.26        63.24       64.29
sw-04          44.12       40.37        44.77       40.78

SPECviewperf 13 non-benchmark results - Catia

Compare two cards, 1 VM & max VMs, best effort and FRL On, Xeon 6140

T4 2Q, 4Q profile testing

Profile    1 VM test    Multiple VM test
T4-2Q      63.87        24.28 (16 VMs)
T4-4Q      62.95        46.51 (8 VMs)

P40 8Q, 12Q profile testing

Profile    1 VM test    Multiple VM test
P40-8Q     63.76        56.77 (6 VMs)
P40-12Q    64.01        61.31 (4 VMs)

SPECviewperf 13 non-benchmark results - SolidWorks

Compare two cards, 1 VM & max VMs, best effort and FRL On, Xeon 6140

T4 2Q, 4Q profile testing

Profile    1 VM test    Multiple VM test
T4-2Q      44.06        35.17 (16 VMs)
T4-4Q      44.07        38.8 (8 VMs)

P40 8Q, 12Q profile testing

Profile    1 VM test    Multiple VM test
P40-8Q     44.01        40.92 (6 VMs)
P40-12Q    44.09        41.33 (4 VMs)

SPECviewperf 13 non-benchmark results - Catia

Compare two cards, max VMs, best effort and FRL On, Xeon 6136, 6128

T4 2Q, 4Q profiles, Xeon 6136

T4-2Q-16VM     22.71
T4-4Q-8VM      44.88

P40 8Q, 12Q profiles, Xeon 6128

P40-8Q-6VM     58.51
P40-12Q-4VM    62.69
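One way to read the max-VM results is to multiply each score by the number of VMs sharing the host. The sketch below assumes each value above is a per-VM Catia viewset score measured while all VMs ran at once, and that the trailing number in each label is the VM count. Both are interpretations of the chart labels, and the function names are hypothetical.

```python
import re

# Scores as labeled on the slide (ASSUMPTION: per-VM scores under full load).
RESULTS = {
    "T4-2Q-16VM": 22.71,
    "T4-4Q-8VM": 44.88,
    "P40-8Q-6VM": 58.51,
    "P40-12Q-4VM": 62.69,
}


def vm_count(label: str) -> int:
    """Extract the VM count from a label such as 'T4-2Q-16VM'."""
    match = re.fullmatch(r".*-(\d+)VM", label)
    if match is None:
        raise ValueError(f"unexpected label: {label}")
    return int(match.group(1))


def aggregate_score(label: str) -> float:
    """Per-VM score multiplied by the number of concurrent VMs."""
    return RESULTS[label] * vm_count(label)


if __name__ == "__main__":
    for label in RESULTS:
        print(f"{label}: {aggregate_score(label):.2f}")
```

Under these assumptions the two T4 configurations deliver nearly the same aggregate, which would be consistent with the GPU, rather than the profile size, being the limit in the multi-VM test.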

Tying the Benchmarks to CPUs and GPUs


Intel Scalable Family 6140 and NVIDIA Tesla T4*

[Charts: Intel Xeon 6140 utilization and NVIDIA Tesla T4 utilization, scale 0 to 100, series T4-2Q-16VM and T4-4Q-8VM]

* SPECviewperf 13 Catia test, ESXi host data

Windows 10 (1607) VM with Tesla T4 vGPU*

[Charts: Perfmon VM processor utilization and Perfmon VM GPU utilization, scale 0 to 100, series T4-2Q-16VM and T4-4Q-8VM]

* SPECviewperf 13 Catia test, single VM in multiple VM test, Perfmon data with Xeon 6140 host processor

Intel Scalable Family 6136 and NVIDIA Tesla T4*

[Charts: Intel Xeon 6136 utilization and NVIDIA Tesla T4 utilization, scale 0 to 100, series T4-2Q-16VM and T4-4Q-8VM]

* SPECviewperf 13 Catia test, ESXi host data

Windows 10 (1607) VM with Tesla T4 vGPU*

[Charts: Perfmon VM processor utilization and Perfmon VM GPU utilization, scale 0 to 100, series T4-2Q-16VM and T4-4Q-8VM]

* SPECviewperf 13 Catia test, single VM in multiple VM test, Perfmon data with Xeon 6136 host processor

Intel Scalable Family 6140 and NVIDIA Tesla P40*

[Charts: Intel Xeon 6140 utilization and NVIDIA Tesla P40 utilization, scale 0 to 100, series P40-8Q-6VM and P40-12Q-4VM]

* SPECviewperf 13 Catia test, ESXi host data

Windows 10 (1607) VM with Tesla P40 vGPU*

[Charts: Perfmon VM processor utilization and Perfmon VM GPU utilization, scale 0 to 100, series P40-8Q-6VM and P40-12Q-4VM]

* SPECviewperf 13 Catia test, single VM in multiple VM test, Perfmon data with Xeon 6140 host processor

Intel Scalable Family 6128 and NVIDIA Tesla P40*

[Charts: Intel Xeon 6128 utilization and NVIDIA Tesla P40 utilization, scale 0 to 100, series P40-8Q-6VM and P40-12Q-4VM]

* SPECviewperf 13 Catia test, ESXi host data

Windows 10 (1607) VM with Tesla P40 vGPU*

[Charts: Perfmon VM processor utilization and Perfmon VM GPU utilization, scale 0 to 100, series P40-8Q-6VM and P40-12Q-4VM]

* SPECviewperf 13 Catia test, single VM in multiple VM test, Perfmon data with Xeon 6128 host processor

Sizing for Dassault Apps


Dassault Systemes 3DEXPERIENCE UCS C240 M5 Rack Server starting points*

User type                 Light              Medium             Heavy
Equivalent performance    Quadro P1000       Quadro P2000       Quadro P5000
Users/server              32                 16                 4-6
vCPU/user                 4                  4-6                8-12
Memory/user (GB)          12-16              16-32              96+
Server CPU                Intel Xeon 6136    Intel Xeon 6134    Intel Xeon 6128
Server memory (GB)        768                768                768
NVIDIA GPU                Tesla T4 (4)       Tesla T4 (4)       Tesla P40 (2)
Quadro profile            T4-2Q              T4-4Q              P40-8Q, P40-12Q
Storage type              Flash              Flash              Flash
Network                   10Gb+              10Gb+              10Gb+

*The recommendations above reflect starting points. Customers should perform PoCs to determine optimal configurations for their specific environments. Cisco can help.

Dassault Systemes 3DEXPERIENCE UCS B200 M5 Blade Server Rack Dense starting points*

User type                 Light              Medium             Heavy
Equivalent performance    Quadro P1000       Quadro P2000       Quadro P5000
Users/server              12                 6                  2-4
vCPU/user                 4                  4-6                8-12
Memory/user (GB)          12-16              16-32              96+
Server CPU                Intel Xeon 6128    Intel Xeon 6128    Intel Xeon 6128
Server memory (GB)        192                192                192
NVIDIA GPU                Tesla P6 (2)       Tesla P6 (2)       Tesla P6 (2)
Quadro profile            P6-2Q              P6-4Q              P6-8Q, P6-16Q
Storage type              Flash              Flash              Flash
Network                   10Gb+              10Gb+              10Gb+

*The recommendations above reflect starting points. Customers should perform PoCs to determine optimal configurations for their specific environments. Cisco can help.
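The rack and blade starting points above can be sanity-checked with a short calculation: GPU seats from frame buffer divided by profile size, total memory from users times memory per user, and a vCPU-to-core ratio. This is a sketch, not a Cisco sizing tool. The GPU frame buffer sizes (T4 16 GB, P40 24 GB, P6 16 GB), the core counts and the two-socket assumption come from public vendor specifications rather than the slides, physical cores are counted without hyper-threading, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartingPoint:
    name: str
    users: int             # users/server (upper end of the table range)
    vcpu_per_user: int     # lower end of the vCPU/user range
    mem_per_user_gb: int   # lower end of the memory/user range
    cores_per_socket: int  # ASSUMPTION: public Intel spec
    server_mem_gb: int
    gpus: int
    gpu_fb_gb: int         # ASSUMPTION: public NVIDIA spec
    profile_gb: int        # frame buffer per user implied by the profile name


def gpu_user_capacity(sp: StartingPoint) -> int:
    """Users the GPUs can host: cards x (frame buffer // profile size)."""
    return sp.gpus * (sp.gpu_fb_gb // sp.profile_gb)


def memory_needed_gb(sp: StartingPoint) -> int:
    return sp.users * sp.mem_per_user_gb


def vcpu_per_core(sp: StartingPoint, sockets: int = 2) -> float:
    """vCPU oversubscription against physical cores (2 sockets assumed)."""
    return sp.users * sp.vcpu_per_user / (sp.cores_per_socket * sockets)


def fits(sp: StartingPoint) -> bool:
    return (sp.users <= gpu_user_capacity(sp)
            and memory_needed_gb(sp) <= sp.server_mem_gb)


RACK_LIGHT = StartingPoint("C240 M5 light, T4-2Q", users=32, vcpu_per_user=4,
                           mem_per_user_gb=12, cores_per_socket=12,
                           server_mem_gb=768, gpus=4, gpu_fb_gb=16, profile_gb=2)
RACK_MEDIUM = StartingPoint("C240 M5 medium, T4-4Q", users=16, vcpu_per_user=4,
                            mem_per_user_gb=16, cores_per_socket=8,
                            server_mem_gb=768, gpus=4, gpu_fb_gb=16, profile_gb=4)
RACK_HEAVY = StartingPoint("C240 M5 heavy, P40-8Q", users=6, vcpu_per_user=8,
                           mem_per_user_gb=96, cores_per_socket=6,
                           server_mem_gb=768, gpus=2, gpu_fb_gb=24, profile_gb=8)
BLADE_LIGHT = StartingPoint("B200 M5 light, P6-2Q", users=12, vcpu_per_user=4,
                            mem_per_user_gb=12, cores_per_socket=6,
                            server_mem_gb=192, gpus=2, gpu_fb_gb=16, profile_gb=2)

if __name__ == "__main__":
    for sp in (RACK_LIGHT, RACK_MEDIUM, RACK_HEAVY, BLADE_LIGHT):
        print(sp.name, gpu_user_capacity(sp), memory_needed_gb(sp),
              round(vcpu_per_core(sp), 2), fits(sp))
```

With these inputs the rack user counts line up with the GPU seat count, which suggests the frame buffer is the limiting resource in those starting points. A proof of concept is still needed to confirm CPU and memory headroom.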

The Cisco Lineup


Cisco graphics accelerated Data Center with NVIDIA

Racks
• C220 M5: 2x NVIDIA P4, T4
• C240 M5: 2x NVIDIA V100, P100, P40; 2x M10, M60; 6x P4, 4x T4
• C480 M5: 6x NVIDIA V100, P100, P40, M60; 3x M10
• C480 ML M5: 8x NVIDIA V100 32GB, NVLink interconnect

Blades
• B200 M5: 2x NVIDIA P6 GPU/blade, up to 16x per chassis
• B480 M5: 4x P6 GPU/blade, up to 16x per chassis

Hyperconverged
• HyperFlex 240C M5: 2x NVIDIA V100, P40; 2x M10, M60; 6x P4

Key takeaways

Keep these things in mind

• Understanding the different types of users
• There are three key GPU settings:
  • GPU scheduler
  • NVIDIA profile selection
  • Frame rate control
    • NVIDIA Tesla card
    • Desktop Broker
• CPU selection is critical
  • CPU and GPU work synergistically
  • High frequency
  • CPU core count
• High frequency memory

Resources

VCC Design Navigator: your source for VDI content

http://cisco.com/go/vdi-cvd

Q & A
