HANDS ON: AI/ML plus HPC


Craig Gardner, Technical Training Architect
Member of OpenHPC Technical Steering Committee

Alessandro Festa, Senior Product Manager
Technical Specialist, AI+ML


Perfect Marriage

Nothing works as well together as:

Machine Learning

Artificial General Intelligence

High Performance Enterprise Computing


Demonstration Objectives

● We can’t make you data scientists

● We can expose you to industry standards and movement

● We will give you some interesting experience

● You will have a glimpse into what SUSE can do for you


Demonstration Agenda

1) Short Introduction

2) Set up SUSE HPC environment (in an AWS Cloud)

3) Explore MPICH for distributed workloads in HPC

4) Set up JupyterHub in the HPC Cloud

5) Scale the Jupyter Notebook workload with SUSE HPC


Short Introduction


Conflict in the Enterprise

DevOps and SysAdmins

– Know their systems and infrastructure

– Know nothing about Data Science

Data Scientists

– Know all about their data, models, and tools

– Know just enough about the systems to be dangerous


Narrow the Gap

Reduce the headaches for DevOps

Provide an easy environment for Data Scientists

This Demonstration:

– HPC and AI/ML are far bigger than can be fully demonstrated

– This demo is just a glimpse into the Perfect Marriage


Set Up HPC


Set up SUSE HPC environment (in an AWS Cloud)

● Show AWS servers

– 1 head + 4 compute, all running SLES 15 SP1

● Install HPC Module for SLES

● Set up NIS/NFS

● Create a data scientist user


Explore MPICH


Explore MPICH for distributed workloads in HPC

● MPICH: an implementation of the Message Passing Interface (MPI) for HPC

● There are other options for workloads

– Open MPI and MVAPICH (other MPI implementations), Slurm (a workload manager)

● In this demo, simply give compute nodes some work via MPI
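
As a minimal sketch of giving compute nodes some work via MPI, the following Python script splits a sum of squares across MPI ranks and combines the partial results on rank 0. It assumes the mpi4py package is installed on top of MPICH, which the slides do not state; the function names, the problem size, and the serial fallback are illustrative only.

```python
"""Split a sum of squares across MPI ranks (illustrative sketch)."""

N_ITEMS = 100_000  # hypothetical problem size


def block_range(n_items, n_ranks, rank):
    """Return the half-open [start, stop) range of items owned by `rank`."""
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def partial_sum(n_items, n_ranks, rank):
    """Sum i*i over the block of items assigned to `rank`."""
    start, stop = block_range(n_items, n_ranks, rank)
    return sum(i * i for i in range(start, stop))


def main():
    try:
        from mpi4py import MPI  # assumption: mpi4py is available on all nodes
    except ImportError:
        # No MPI bindings: emulate four ranks serially so the logic can be checked.
        total = sum(partial_sum(N_ITEMS, 4, r) for r in range(4))
        print(f"serial fallback total = {total}")
        return

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    local = partial_sum(N_ITEMS, size, rank)
    print(f"rank {rank}/{size} on {MPI.Get_processor_name()}: partial = {local}")

    # Combine the partial sums on rank 0.
    total = comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"total = {total}")


if __name__ == "__main__":
    main()
```

Under MPICH, a script like this would typically be launched with something like `mpiexec -n 4 python3 sum_squares.py`, where the file name is hypothetical.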


JupyterHub


Set up JupyterHub in the HPC Cloud

● JupyterHub: a multi-user collaboration tool for Jupyter Notebooks

– Jupyter Notebooks: documents for reproducible data science that combine live code, equations, and visualizations with inline instructions

● Install and configure JupyterHub

– Runs on the HPC head node

– Largely based on Python, but other tools expand its functionality, too

– IPython provides the means of distributing work
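
A JupyterHub configuration file is itself Python. As a rough sketch of what the head-node configuration might contain, assuming the default PAM authenticator and a local user named scientist (the slides do not show the actual file, so every value here is an assumption):

```python
# jupyterhub_config.py (illustrative; values are assumptions, not the demo's settings)
c = get_config()  # noqa: F821  (provided by JupyterHub when it loads this file)

# Listen on all interfaces of the head node, on JupyterHub's default port.
c.JupyterHub.bind_url = "http://:8000"

# Restrict logins to the data scientist account.
# JupyterHub releases before 1.2 call this option `whitelist`.
c.Authenticator.allowed_users = {"scientist"}

# Start each notebook server in the user's home directory.
c.Spawner.notebook_dir = "~"
```

Such a file would be passed to the hub with `jupyterhub -f jupyterhub_config.py`.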


Jupyter Notebook in HPC


Scale the Jupyter Notebook workload with SUSE HPC

● Set up a standard user called scientist

● As scientist, run a Jupyter Notebook

● As scientist, run a Jupyter Notebook with distributed MPICH and HPC resources
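
One simple way to reach the HPC resources from a notebook is to launch an MPI job from a cell. The sketch below builds the command and, in `run_on_cluster`, runs it only if `mpiexec` is found. The script name, host file path, and process count are hypothetical, and the demo itself may use a different mechanism, such as IPython's parallel engines.

```python
import shutil
import subprocess


def build_mpiexec_command(script, n_procs, hostfile=None):
    """Build an MPICH-style mpiexec command line as a list of arguments."""
    cmd = ["mpiexec", "-n", str(n_procs)]
    if hostfile is not None:
        # MPICH's Hydra launcher reads the list of hosts from the file given with -f.
        cmd += ["-f", hostfile]
    cmd += ["python3", script]
    return cmd


def run_on_cluster(script, n_procs, hostfile=None):
    """Run the job and return its standard output, or None if mpiexec is missing."""
    if shutil.which("mpiexec") is None:
        return None
    cmd = build_mpiexec_command(script, n_procs, hostfile)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical values: one process per compute node in the demo cluster.
command = build_mpiexec_command("sum_squares.py", 4, hostfile="hosts.txt")
print(" ".join(command))
```

In a notebook cell, `print(run_on_cluster("sum_squares.py", 4, hostfile="hosts.txt"))` would then show the combined output of all ranks, assuming the script and host file exist on the shared file system.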


