CS 744: MESOS
Shivaram Venkataraman, Fall 2021
HI
ADMINISTRIVIA
- Assignment 1: Due tonight!
- Assignment 2 out soon
- Project details
- Create project groups
- Bid for projects / propose your own
- Work on Introduction → finalize the project title, 2 pages
- Final report / poster presentation
Course stack:
- Applications: Machine Learning, SQL, Streaming, Graph
- Computational Engines
- Resource Management
- Scalable Storage Systems
- Datacenter Architecture
[Diagram: a resource manager allocates resources across MapReduce, Spark, and GFS]
BACKGROUND: OS SCHEDULING
[Diagram: multiple processes, each with code, static data, heap, and stack, sharing one CPU]
How do we share the CPU between processes?
- Mechanism: pre-empt the running process
- Policy: which process is chosen to run next
- Time sharing: processes P1, P2, P3 take turns on the CPU (e.g., P1 / P2 / P3 / P1 over time)
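To make the time-sharing idea concrete, here is a minimal round-robin sketch in Python; the process names, run times, and the 10-unit quantum are made up for illustration, not taken from the lecture.

```python
from collections import deque

def round_robin(procs, quantum=10):
    """Time-share one CPU: the policy picks the next process,
    the mechanism pre-empts it after each quantum."""
    ready = deque(procs)                    # entries are (name, remaining_time)
    t = 0
    while ready:
        name, remaining = ready.popleft()   # policy: pick the head of the queue
        run = min(quantum, remaining)
        print(f"t={t}: run {name} for {run}")
        t += run
        if remaining > run:                 # mechanism: pre-empt and requeue
            ready.append((name, remaining - run))

round_robin([("P1", 25), ("P2", 10), ("P3", 15)])
```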
CLUSTER SCHEDULING
- Time sharing is more expensive at cluster scale; instead, partition resources across jobs
- Must be aware of data locality and of failures / fault tolerance
TARGET ENVIRONMENT
Multiple MapReduce versions
Mix of frameworks: MPI, Spark, MR
Data sharing across frameworks
Avoid per-framework clusters
→ Sharing data and machines across frameworks improves cluster utilization
DESIGN
Two-level scheduling:
- Centralized master allocates resources across frameworks (focus on simplicity)
- Framework scheduler decides which offered resources to accept and what to run on them
- Agent on every machine
RESOURCE OFFERS
- Agents send heartbeats to the master with resource information
- Master decides which framework it should offer free resources to
- Framework scheduler (e.g., MPI's) accepts or rejects the resource offer
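A rough sketch of this two-level offer flow, with invented class and method names rather than the real Mesos API: the master picks which framework to offer an agent's free resources to, and that framework's scheduler decides what tasks, if any, to launch on the offer.

```python
# Hypothetical sketch of Mesos-style two-level scheduling (not the real API).

class FrameworkScheduler:
    """Second level: framework-specific logic decides what to do with an offer."""
    def __init__(self, name, cpus_per_task):
        self.name = name
        self.cpus_per_task = cpus_per_task

    def resource_offer(self, offer):
        # Launch as many tasks as fit in the offered resources; the rest is declined.
        n_tasks = offer["cpus"] // self.cpus_per_task
        return [{"framework": self.name, "cpus": self.cpus_per_task}
                for _ in range(n_tasks)]

class Master:
    """First level: decides WHICH framework to offer resources to (kept simple)."""
    def __init__(self, frameworks):
        self.frameworks = frameworks

    def offer(self, agent_resources):
        # Agents report free resources via heartbeats; offer them round-robin here.
        fw = self.frameworks.pop(0)
        self.frameworks.append(fw)
        return fw.name, fw.resource_offer(agent_resources)

master = Master([FrameworkScheduler("MPI", 2), FrameworkScheduler("Spark", 1)])
print(master.offer({"cpus": 4}))   # MPI launches 2 tasks of 2 CPUs each
print(master.offer({"cpus": 4}))   # Spark launches 4 tasks of 1 CPU each
```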
CONSTRAINTS
Examples of constraints:
- Data locality → soft constraint: prefer to run the task at that location
- GPU machines → hard constraint: cannot start the task if the constraint is not satisfied
Constraints in Mesos:
- Applications can reject offers
- Optimization: filters reduce the number of rejected offers
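As a sketch of how a framework might apply hard vs. soft constraints, and how a filter lets the master skip offers that would be rejected anyway, with hypothetical helper functions rather than Mesos code:

```python
def wants_offer(offer, preferred_hosts, needs_gpu):
    """Hypothetical framework-side check of an offer against its constraints."""
    # Hard constraint: the task cannot start at all if this is not satisfied.
    if needs_gpu and not offer.get("gpu", False):
        return False
    # Soft constraint: prefer data-local hosts, but accept others if none preferred.
    return offer["host"] in preferred_hosts or not preferred_hosts

# A filter registered with the master skips offers the framework would reject,
# reducing the number of rejected offers.
gpu_filter = lambda offer: offer.get("gpu", False)

offers = [{"host": "a", "gpu": False}, {"host": "b", "gpu": True}]
print([o for o in offers if gpu_filter(o) and wants_offer(o, ["b"], needs_gpu=True)])
```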
DESIGN DETAILS
Allocation: tasks are short, so resources are re-allocated as tasks finish
Long tasks? Revocation of resources beyond a framework's guaranteed allocation
Isolation: containers (e.g., cgroups, Docker)
Example: an MPI framework using 8 machines — its guaranteed resources will not be revoked, but the 4 machines beyond its guarantee can be revoked
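A tiny sketch of the revocation rule, assuming (per the example above) that the MPI framework's guarantee covers 4 of the 8 machines it is using; the function is illustrative, not Mesos code.

```python
def revocable(used, guaranteed):
    """Resources beyond a framework's guarantee can be revoked; the guarantee cannot."""
    return max(0, used - guaranteed)

# MPI framework using 8 machines with a guarantee of 4:
print(revocable(used=8, guaranteed=4))   # 4 machines can be revoked
```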
FAULT TOLERANCE
- Master keeps only "soft" state that can be reconstructed from agents and framework schedulers after failover
HANDLING PLACEMENT PREFERENCES
What is the problem? More frameworks prefer a node than there are resources available on it — who gets the offers?
How do we do allocations? Lottery scheduling: offers are weighted by each framework's allocation, i.e., a framework receives offers with probability proportional to its priority / weight
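A minimal lottery-scheduling sketch: the framework receiving the next offer is drawn with probability proportional to its weight. The framework names and weights are made up.

```python
import random

def pick_framework(weights):
    """Lottery scheduling: pick who gets the next offer, weighted by priority/allocation."""
    frameworks = list(weights)
    return random.choices(frameworks, weights=[weights[f] for f in frameworks])[0]

weights = {"Spark": 3, "MPI": 1}          # hypothetical weights
picks = [pick_framework(weights) for _ in range(10_000)]
print(picks.count("Spark") / len(picks))  # roughly 0.75
```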
CENTRALIZED vs DISTRIBUTED
- Framework complexity → every framework developer needs to implement a scheduler
- Fragmentation, starvation → especially if a job has large resource needs (e.g., 8 GB), small offers may never satisfy it; a minimum offer size helps (see the sketch below)
- Inter-dependent frameworks → e.g., two frameworks that cannot be co-located
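To make the fragmentation point concrete, a toy example with invented numbers: a job with large resource needs can starve if it only ever sees small offers, which a minimum offer size mitigates.

```python
def can_use(offer_gb, job_needs_gb):
    """A job with large resource needs can only use offers at least that big."""
    return offer_gb >= job_needs_gb

offers = [2, 2, 4, 2]                        # fragmented free memory per offer, in GB
print(any(can_use(o, 8) for o in offers))    # False: the 8 GB job starves
# A minimum offer size (e.g., 8 GB) forces the master to accumulate enough
# free resources on one machine before making an offer.
```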
COMPARISON: YARN
Per-job scheduler: each job's Application Master (AM) asks the Resource Manager (RM) for resources, and the RM replies with allocations
Centralized scheduler in Apache Hadoop, built in part to support multiple MapReduce versions
COMPARISON: BORG
Single centralized scheduler
Requests for memory and CPU are declared in a config; priority per user / service
Support for quotas / reservations
Scheduler bin-packs tasks onto machines (see the sketch below)
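A greatly simplified first-fit bin-packing sketch of the kind of centralized placement a Borg-style scheduler performs; the machine capacities and task requests are invented.

```python
def first_fit(tasks, machines):
    """Place each task on the first machine with enough free CPU (first-fit bin packing)."""
    placement = {}
    free = dict(machines)                    # machine -> free CPUs
    for task, cpus in tasks:
        for m in free:
            if free[m] >= cpus:
                free[m] -= cpus
                placement[task] = m
                break
    return placement

print(first_fit([("t1", 4), ("t2", 2), ("t3", 3)], {"m1": 6, "m2": 4}))
# {'t1': 'm1', 't2': 'm1', 't3': 'm2'}
```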
SUMMARY
• Mesos: scheduler to share a cluster between Spark, MR, etc.
• Two-level scheduling with app-specific schedulers
• Provides scalable, decentralized scheduling
• Pluggable policy? Next class!
DISCUSSION
https://forms.gle/FSkKVbu94nLA4g3v9
What are some problems that could come up if we scale from 10 frameworks to 1000 frameworks in Mesos?
- Overhead of resource offers goes up linearly
- Allocation: time to allocate can go up; the allocation module might be slower
- Developer complexity: 1000 frameworks means implementing 1000 schedulers
- Failures: time to recover from a master failure goes up
NEXT STEPS
Next class: Scheduling Policy
Further reading
• https://www.umbrant.com/2015/05/27/mesos-omega-borg-a-survey/
• https://queue.acm.org/detail.cfm?id=3173558