Malleable Metal as a Service - IEEE Computer Society - IC2E 2018.pdf · Increasing Bare Metal Cloud...



M2: Malleable Metal as a Service

Apoorve Mohan*, Ata Turk†, Ravi S. Gudimetla‡, Sahil Tikale†, Jason Hennessey†, Ugur Kaynar†, Gene Cooperman*, Peter Desnoyers*, and Orran Krieger†

*Northeastern University, †Boston University, and ‡Red Hat Inc. (Massachusetts Open Cloud)

Increasing Bare Metal Cloud Offerings

❏ Performance- and security-sensitive applications

❏ Applications that require accelerators:
● FPGAs, InfiniBand, GPUs, etc.

❏ Setting up frameworks to provide different services:
● OpenStack, Hadoop, Slurm, etc.

❏ AWS Bare Metal, IBM SoftLayer/Bluemix, Rackspace, Internap, etc.

Existing Bare Metal Offerings Provision to Local Disk - Stateful

● Over the network from an ISO or a pre-installed image

[Diagram label: COPY]

Stateful Provisioning Problems

❏ Slow Provisioning
● Up to tens of minutes to provision

❏ Boot Storms
● Heavy network traffic

❏ Single Point of Failure
● Loss of both OS and application

❏ Poor Reusability
● Saving and restoring disk state

Poor Reusability

[Diagram, animated across three slides; labels: Tenant 1, Tenant 2]


What’s the Solution?


Why Not Provision Bare Metal like Virtual Machines?

● Over the network from a pre-installed virtual disk (boot drive)

[Diagram labels: NETBOOT, Distributed Storage]

How Can Netboot Solve These Problems?

❏ Slow Provisioning
● Up to tens of minutes to provision

❏ Boot Storms
● Heavy network traffic

❏ Single Point of Failure
● Loss of both OS and application

❏ Poor Reusability
● Saving and restoring disk state is non-trivial

Netboot answers these with:

● Only copy what you need

● Multiple NICs and a distributed file system

● Reboot from a saved image
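The "only copy what you need" and "reboot from a saved image" points can be illustrated with a minimal copy-on-write sketch. The class names `GoldenImage` and `ClonedImage` are hypothetical and are not taken from M2 or Ceph. The sketch only assumes that a clone shares unmodified blocks with a read-only golden image and keeps its own writes separately.

```python
class GoldenImage:
    """Read-only base image, modeled as a list of blocks."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def read(self, block_no):
        return self._blocks[block_no]

    def __len__(self):
        return len(self._blocks)


class ClonedImage:
    """Copy-on-write clone: creating it copies nothing from the parent."""

    def __init__(self, parent):
        self._parent = parent
        self._written = {}    # blocks this clone has overwritten
        self.fetched = set()  # parent blocks that were actually read

    def read(self, block_no):
        if block_no in self._written:
            return self._written[block_no]
        self.fetched.add(block_no)
        return self._parent.read(block_no)

    def write(self, block_no, data):
        # Writes stay in the clone; the golden image is never modified.
        self._written[block_no] = data

    def snapshot(self):
        """Freeze the current state as a new image that later clones can boot."""
        merged = [
            self._written.get(n, self._parent.read(n))
            for n in range(len(self._parent))
        ]
        return GoldenImage(merged)


golden = GoldenImage([b"boot", b"kernel", b"libs", b"docs"])
node_disk = ClonedImage(golden)
node_disk.read(0)
node_disk.read(1)                     # only two of four blocks are fetched
node_disk.write(3, b"hadoop-config")  # tenant state lives in the clone
saved = node_disk.snapshot()          # a later boot can start from here
```

In this model a boot touches only the blocks it reads, and a saved snapshot can be cloned again for another node without reinstalling anything.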

❏ Wait, but what about application performance?
● Won't there be overhead due to constant access to the boot drive over the network?

❏ With 10 Gigabit Ethernet and fast, reliable distributed storage, is this really a problem?
● Separate communication/data and provisioning networks.

❏ Also, how big of a performance issue is it to have remote boot drives?
● In the cloud, data is already coming over the network.


M2: Malleable Metal as a Service

A Multi-tenant Bare Metal Cloud Service

M2 Architecture Overview

[Diagram, built up over five slides from the components below]

HIL
❏ Previously developed
❏ Bare metal allocation
❏ Network allocation (layer 2)

Ceph
❏ Data store
❏ Pre-installed images

iSCSI Gateway
❏ Software iSCSI server
❏ TGT software iSCSI

DHCP, iPXE, TFTP
❏ Diskless booting from iSCSI target

REST Service
❏ Orchestration engine
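As a rough illustration of diskless booting from an iSCSI target, the sketch below renders a small iPXE script. It relies on iPXE's `sanboot` command and its SAN URI form `iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>`, where empty fields fall back to defaults. The function name, the gateway address, and the target name are hypothetical, and this is not M2's actual boot template.

```python
def render_ipxe_script(gateway_ip, target_iqn, port="", lun=""):
    """Return an iPXE script that boots a node from an iSCSI target.

    Empty protocol, port, and LUN fields are left blank so that iPXE
    applies its defaults.
    """
    san_uri = f"iscsi:{gateway_ip}::{port}:{lun}:{target_iqn}"
    return f"#!ipxe\nsanboot {san_uri}\n"


# Hypothetical gateway address and target name, for illustration only.
script = render_ipxe_script("10.0.0.1", "iqn.2018-04.org.example:node1")
print(script)
```

A script like this would typically be served to the node after the DHCP and TFTP stages have chain-loaded iPXE.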

M2 Architecture Overview (provisioning workflow)

[Diagram, built up over eight slides; components: USER, REST Service, HIL Interface, Ceph Interface, HIL, Ceph (cloned images), iSCSI Gateway, DHCP, iPXE, TFTP, Reserved Servers]

1. Reserve Nodes

2. Provision Reserved Node

3. Clone Golden Image

4. Expose Cloned Image as iSCSI Target

5. Configure BMI to PXE boot

6. Attach Nodes to Provisioning Network

7. PXE Boot Reserved Nodes
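The seven numbered steps can be sketched as a single orchestration function. Everything here is an assumption made for illustration: the component stand-ins only record calls, and the method names (`reserve_node`, `clone_image`, `expose_target`, and so on) are hypothetical rather than the HIL, Ceph, or M2 APIs. Which component triggers the final PXE boot is also assumed.

```python
class RecordingComponent:
    """Stand-in for HIL, Ceph, the iSCSI gateway, etc.: it only logs calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def call(self, action, *args):
        self.log.append((self.name, action, *args))


def provision_node(node, golden_image, log):
    hil = RecordingComponent("HIL", log)
    ceph = RecordingComponent("Ceph", log)
    iscsi = RecordingComponent("iSCSI gateway", log)
    netboot = RecordingComponent("DHCP/TFTP/iPXE", log)

    hil.call("reserve_node", node)                            # 1. reserve nodes
    log.append(("REST service", "provision_request", node))   # 2. provision request
    clone = f"{golden_image}-clone-{node}"
    ceph.call("clone_image", golden_image, clone)             # 3. clone golden image
    iscsi.call("expose_target", clone)                        # 4. expose as iSCSI target
    netboot.call("configure_pxe_boot", node, clone)           # 5. configure PXE boot
    hil.call("attach_to_provisioning_network", node)          # 6. attach to network
    hil.call("power_cycle", node)                             # 7. PXE boot (assumed via HIL)
    return clone


log = []
clone_name = provision_node("node-01", "hadoop-golden", log)
actions = [entry[1] for entry in log]
```

Keeping the steps in one function makes the ordering explicit: the clone must exist and be exposed before the node is attached to the provisioning network and rebooted.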


M2 Interfaces

❏ Provision, Deprovision

❏ Create/Remove Snapshot

❏ List/Delete Images and Snapshots
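To make the listed operations concrete, here is a minimal in-memory model of such an interface. The class name `InMemoryM2`, its method signatures, and the image names are hypothetical. The real service exposes these operations through its REST service, whose exact endpoints are not given in the slides.

```python
class InMemoryM2:
    """Toy model of the listed operations; it keeps all state in memory."""

    def __init__(self, images):
        self.images = set(images)
        self.snapshots = set()
        self.provisioned = {}  # node name -> image or snapshot it boots from

    def provision(self, node, image):
        if image not in self.images and image not in self.snapshots:
            raise KeyError(image)
        if node in self.provisioned:
            raise ValueError(f"{node} is already provisioned")
        self.provisioned[node] = image

    def deprovision(self, node):
        return self.provisioned.pop(node)

    def create_snapshot(self, node, name):
        if node not in self.provisioned:
            raise KeyError(node)
        self.snapshots.add(name)

    def remove_snapshot(self, name):
        self.snapshots.remove(name)

    def list_images(self):
        return sorted(self.images)

    def list_snapshots(self):
        return sorted(self.snapshots)

    def delete_image(self, image):
        if image in self.provisioned.values():
            raise ValueError(f"{image} is in use")
        self.images.remove(image)


m2 = InMemoryM2(["hadoop-golden"])
m2.provision("node-01", "hadoop-golden")
m2.create_snapshot("node-01", "hadoop-configured")  # save the tenant's state
m2.deprovision("node-01")                           # release the node
m2.provision("node-02", "hadoop-configured")        # resume on another node
```

The usage lines show the reusability case from the earlier slides: state saved from one node is used to bring up a different node later.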

Evaluation Environment

❏ 8 nodes with
● 128 GB RAM
● 24 HT-cores
● Single 10 Gbps NIC (communication and boot drive)

❏ Ceph cluster
● 10 nodes (InfiniBand interconnect)
● 90 OSDs
● 40 Gbps outlink

❏ Software iSCSI and M2 all running in a VM

Provisioning/Re-Provisioning Times Comparison (Single Hadoop Node)

[Figure: timeline of provisioning phases, built up one phase per slide]

Foreman Provision (~25 minutes):

1. Node Power Cycle

2. Power-on Self-test (POST)

3. PXE Request

4. Kernel Download & Local Disk Installation

5. Node Power Cycle

6. Power-on Self-test (POST) & PXE Request

7. Booting OS from Local Disk

8. Hadoop Package Installation

9. Hadoop Node Configuration

M2 Provision (~11 minutes):

1. Node Power Cycle

2. Power-on Self-test (POST) & PXE Request

3. OS Chain Booting (iPXE)

4. Hadoop Package Installation and Configuration


Provisioning/Re-Provisioning Times Comparison (Single Hadoop Node)

[Figure: bar chart of provisioning and re-provisioning times for Foreman and M2]

● Foreman Provision / Foreman Re-Provision: ~25 minutes

● M2 Provision: ~11 minutes

● M2 Re-Provision: ~5 minutes 30 seconds (Hadoop package installation overhead removed)

● M2 at ~5 minutes 30 seconds vs. Foreman at ~25 minutes: ~5X

● BMI reduces provisioning and re-provisioning times.

● POST dominates BMI provisioning time.

● Without POST, re-provision time is ~2 minutes.

● It takes ~2 minutes to spin up virtual machines in AWS.
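The ratios behind these figures can be checked with a few lines of arithmetic. The inputs are the approximate times reported above; the variable names are illustrative, and treating the gap between ~5 minutes 30 seconds and ~2 minutes as POST time is an assumption based on the "without POST" figure.

```python
FOREMAN_MIN = 25.0                 # Foreman provision or re-provision
M2_PROVISION_MIN = 11.0            # M2 first-time provision
M2_REPROVISION_MIN = 5.5           # M2 re-provision (5 minutes 30 seconds)
M2_REPROVISION_NO_POST_MIN = 2.0   # M2 re-provision with POST excluded

# How many times faster M2 is than Foreman in each case.
provision_speedup = FOREMAN_MIN / M2_PROVISION_MIN
reprovision_speedup = FOREMAN_MIN / M2_REPROVISION_MIN

# Share of M2 re-provisioning time attributed to POST (assumed to be
# the whole difference between the two re-provisioning figures).
post_share = (M2_REPROVISION_MIN - M2_REPROVISION_NO_POST_MIN) / M2_REPROVISION_MIN

print(f"provision speedup:    {provision_speedup:.2f}x")
print(f"re-provision speedup: {reprovision_speedup:.2f}x")
print(f"POST share of re-provisioning: {post_share:.0%}")
```

The re-provisioning ratio comes out at roughly 4.5, which the slide rounds to ~5X, and under the stated assumption POST accounts for well over half of the M2 re-provisioning time.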

Cumulative Read Traffic Analysis (24 Hours - Hadoop)

[Figure: cumulative iSCSI reads per node (MB), y-axis ticks at 100, 200, 300; phases shown in order: Initial Provisioning, Data Generation, Sort, Data Generation, Sort, across Hadoop operation over 24 hours]

Case (i): Data == RAM
● RAM: 128 GB
● Data: 128 GB
● Initial provisioning reads ~170 MB of the 10 GB image (< 2% of the boot drive is accessed)
● ~3 KB/s after initial provisioning

Case (ii): Data > RAM
● RAM: 128 GB
● Data: 256 GB

Cumulative Write Traffic Analysis (24 Hours - Hadoop)

[Figure: cumulative iSCSI traffic per node (MB), y-axis ticks from 100 to 700; the axis label on these slides reads "Cumulative iSCSI Reads per Node (MB)"; phases: Initial Provisioning, Hadoop operation over 24 hours]

Case (i): Data == RAM
● RAM: 128 GB
● Data: 128 GB
● Mostly Hadoop logs (~14 KB/s)

Case (ii): Data > RAM
● RAM: 128 GB
● Data: 256 GB

❏ Network traffic to the boot drive is irrelevant!

● Resilient to boot storms
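A back-of-the-envelope calculation shows why low boot-drive traffic helps with boot storms. It uses the ~170 MB read during initial provisioning and the 10 GB image size reported earlier, and the 24 concurrently provisioned servers from the stress test. The comparison assumes that every node reads about the same amount and that a stateful provisioner would copy the whole image to each local disk; both are simplifications, and 10 GB is taken as 10,000 MB.

```python
IMAGE_MB = 10_000    # 10 GB boot image, decimal megabytes (assumed)
BOOT_READ_MB = 170   # data actually read during initial provisioning
NODES = 24           # largest concurrent batch in the stress test


def total_traffic_mb(nodes, per_node_mb):
    """Aggregate traffic when every node transfers the same amount."""
    return nodes * per_node_mb


fraction_read = BOOT_READ_MB / IMAGE_MB
netboot_total_mb = total_traffic_mb(NODES, BOOT_READ_MB)
full_copy_total_mb = total_traffic_mb(NODES, IMAGE_MB)

print(f"fraction of image read: {fraction_read:.1%}")
print(f"netboot, {NODES} nodes:   {netboot_total_mb} MB")
print(f"full copy, {NODES} nodes: {full_copy_total_mb} MB")
```

Under these assumptions, 24 nodes booting at once pull about 4 GB in total instead of 240 GB.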

NAS Parallel Benchmark Performance (HPC)

[Figure: runtime (secs) vs. number of MPI ranks]

● Negligible performance impact

Hadoop Performance (Big Data)

[Figure: performance vs. data set size]

● Negligible performance impact


● Performance Impact Due to Remote Boot Drive is Negligible!

M2 Stress Test - Concurrent Provisioning

❏ M2 VM Configuration
● 4 vCPUs
● 4 GB RAM
● TGT software iSCSI
● No iSCSI caching

[Figure: provisioning time (secs, 0 to 250) vs. concurrently provisioned servers (1, 2, 4, 6, 8, 12, 16, 24); series: Total Provisioning Time, M2 API Overhead]

More About M2

❏ M2 core is ~3500 lines of new Python code (i.e., excluding HIL)

❏ Used at Massachusetts Open Cloud since Fall 2016

❏ Used in production on a dedicated 16-node cluster (with 6-hour leases)
● OS research groups
● Security researchers who need access to the TPM
● Experimental installations of OpenStack
● Research groups that need deterministic performance (no virtualization)

❏ Re-provisioning performance is critical for these use cases

Conclusion: M2 is a new bare-metal cloud service

❏ Overcomes the problems of stateful provisioning systems

❏ Rapid provisioning

❏ Resilient to boot storms

❏ No single points of failure

❏ Rapid re-provisioning (reusability)

❏ Negligible performance impact

In Progress

❏ User-transparent checkpointing of the node memory state.

❏ Rapid attestation of bare metal systems before re-assigning them.

❏ Using rapid multiplexing frameworks to improve datacenter utilization.

Acknowledgements

❏ Current and previous core M2 team members: Naved Ansari, Sourabh Bollapragada, Daniel Finn, Paul Grosu, Sirushti Murugesan and Pranay Surana.

❏ Other Mass Open Cloud (MOC) team members: Chris Hill, Laura Kamfonik, Rajul Kumar, Radoslav Nikiforov Milanov, Piyanai Saowarattitada, and Rahul Sharma.

❏ MassTech Collaborative Research Matching Grant Program, National Science Foundation awards ACI-1440788, 1347525, 1149232 and 1414119.

❏ Massachusetts Open Cloud commercial partners Brocade, Cisco, Intel, Lenovo, Red Hat, and Two Sigma.

M2 is an open source project.

We welcome you to use it, contribute to it, and send us feedback and suggestions to improve it.

https://github.com/CCI-MOC/ims

https://info.massopencloud.org/blog/bare-metal-imaging/

Thank You


Questions?