Download - Mesos and YARN - Amir H. Payberah › files › download › dic14 › mesos.pdf · I Aframework(e.g., Hadoop, Spark) manages and runs one or more jobs. I Ajobconsists of one or moretasks.

Mesos and YARN

Amir H. Payberah

Amirkabir University of Technology (Tehran Polytechnic)

Amir H. Payberah (Tehran Polytechnic) Mesos and YARN 1393/9/15 1 / 49

Motivation

▸ Rapid innovation in cloud computing.

▸ No single framework is optimal for all applications.

▸ Running each framework on its own dedicated cluster:
  • Expensive
  • Hard to share data

Proposed Solution

Running multiple frameworks on a single cluster

  • Maximize utilization
  • Share data between frameworks

Two Resource Management Systems ...

▸ Mesos

▸ YARN

Mesos

A common resource sharing layer, over which diverse frameworks can run

Mesos Goals

▸ High utilization of resources

▸ Support for diverse frameworks (current and future)

▸ Scalability to tens of thousands of nodes

▸ Reliability in the face of failures

Computation Model

▸ A framework (e.g., Hadoop, Spark) manages and runs one or more jobs.

▸ A job consists of one or more tasks.

▸ A task (e.g., map, reduce) consists of one or more processes running on the same machine.
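
The framework / job / task hierarchy above can be pictured as three nested record types. The following is only an illustrative sketch: the class names, fields, and the sample "wordcount" job are assumptions made for this example and do not come from Mesos or any framework API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str            # e.g. "map-0" (hypothetical label)
    node: str            # all processes of one task run on this single machine
    processes: int = 1   # a task consists of one or more processes


@dataclass
class Job:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Framework:
    name: str            # e.g. "Hadoop" or "Spark"
    jobs: List[Job] = field(default_factory=list)

    def task_count(self) -> int:
        """Total number of tasks across all jobs of this framework."""
        return sum(len(job.tasks) for job in self.jobs)


wordcount = Job("wordcount", [
    Task("map-0", "node1"),
    Task("map-1", "node2"),
    Task("reduce-0", "node1", processes=2),
])
hadoop = Framework("Hadoop", [wordcount])
```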

Mesos Design Elements

▸ Fine-grained sharing

▸ Resource offers

Fine-Grained Sharing

▸ Allocation at the level of tasks within a job.

▸ Improves utilization, latency, and data locality.

[Figure: coarse-grained sharing vs. fine-grained sharing]

Resource Offer

▸ Offer available resources to frameworks and let them pick which resources to use and which tasks to launch.

▸ Keeps Mesos simple and lets it support future frameworks.

Question?

How should resource offers be scheduled among frameworks?

Schedule Frameworks

▸ Global scheduler

▸ Distributed scheduler

Global Scheduler (1/2)

▸ Job requirements
  • Response time
  • Throughput
  • Availability

▸ Job execution plan
  • Task DAG
  • Inputs/outputs

▸ Estimates
  • Task duration
  • Input sizes
  • Transfer sizes

Global Scheduler (2/2)

▸ Advantages
  • Can achieve an optimal schedule.

▸ Disadvantages
  • Complexity: hard to scale and to ensure resilience.
  • Hard to anticipate the requirements of future frameworks.
  • Need to refactor existing frameworks.

Distributed Scheduler (1/3)

Distributed Scheduler (2/3)

▸ Unit of allocation: resource offer
  • Vector of available resources on a node
  • For example, node1: 〈1CPU, 1GB〉, node2: 〈4CPU, 16GB〉

▸ Master sends resource offers to frameworks.

▸ Frameworks select which offers to accept and which tasks to run.
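
The offer cycle described above can be sketched as a toy simulation. This is not the Mesos API: the function names `make_offers` and `framework_select`, the `(CPUs, GB)` tuple encoding, and the task labels are assumptions chosen for illustration. The master turns each node's free resources into an offer, and the framework alone decides which offers to use and which tasks to place on them.

```python
from typing import Dict, List, Tuple

Resources = Tuple[int, int]  # (CPUs, GB of memory), a hypothetical encoding


def make_offers(free: Dict[str, Resources]) -> List[Tuple[str, Resources]]:
    """Master side: one offer per node that still has free resources."""
    return [(node, res) for node, res in free.items() if res != (0, 0)]


def framework_select(offers: List[Tuple[str, Resources]],
                     pending_tasks: List[str],
                     task_need: Resources) -> List[Tuple[str, str]]:
    """Framework side: place pending tasks on offers that fit; offers that
    are too small are implicitly declined."""
    launched = []
    for node, (cpus, mem) in offers:
        while pending_tasks and cpus >= task_need[0] and mem >= task_need[1]:
            launched.append((pending_tasks.pop(0), node))
            cpus -= task_need[0]
            mem -= task_need[1]
    return launched


# The two nodes from the example above; each task is assumed to need <2CPU, 4GB>.
free = {"node1": (1, 1), "node2": (4, 16)}
launched = framework_select(make_offers(free), ["t1", "t2", "t3"], task_need=(2, 4))
```

Here node1's offer is too small for any task, and node2 runs out of CPUs after two tasks, so the third task waits for a later offer.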

Distributed Scheduler (3/3)

▸ Advantages
  • Simple: easier to scale and make resilient.
  • Easy to port existing frameworks and to support new ones.

▸ Disadvantages
  • Distributed scheduling decisions: not optimal.

Mesos Architecture (1/4)

▸ Slaves continuously send status updates about resources to the Master.

Mesos Architecture (2/4)

▸ A pluggable scheduler picks the framework to send an offer to.

Mesos Architecture (3/4)

▸ The framework scheduler selects resources and provides tasks.

Mesos Architecture (4/4)

▸ Framework executors launch tasks.

Question?

How to allocate resources of different types?

Single Resource: Fair Sharing

▸ n users want to share a resource, e.g., CPU.
  • Solution: allocate each user 1/n of the shared resource.

[Figure: CPU allocation]

▸ Generalized by max-min fairness.
  • Handles the case where a user wants less than its fair share.
  • E.g., user 1 wants no more than 20%.

▸ Generalized by weighted max-min fairness.
  • Give weights to users according to importance.
  • E.g., user 1 gets weight 1, user 2 weight 2.
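
One common way to compute a (weighted) max-min fair allocation is progressive filling: repeatedly split what is left in proportion to the weights, fully satisfy every user whose demand fits within that split, and redistribute the remainder. The sketch below assumes a single divisible resource; the function name `max_min_fair` and its parameters are choices made for this example, not taken from the slides.

```python
from typing import List, Optional


def max_min_fair(capacity: float, demands: List[float],
                 weights: Optional[List[float]] = None) -> List[float]:
    """Weighted max-min fair split of one divisible resource."""
    n = len(demands)
    weights = weights or [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))      # users whose demand is not yet satisfied
    remaining = capacity
    while active and remaining > 1e-12:
        total_w = sum(weights[i] for i in active)
        # Users whose whole demand fits inside their weighted share this round.
        satisfied = {i for i in active
                     if demands[i] <= remaining * weights[i] / total_w}
        if not satisfied:
            # Everyone wants more than their share: split what is left by weight.
            for i in active:
                alloc[i] = remaining * weights[i] / total_w
            break
        for i in satisfied:
            alloc[i] = demands[i]
            remaining -= demands[i]
        active -= satisfied
    return alloc


# User 1 wants no more than 20% of the CPU; users 2 and 3 want all of it.
capped = max_min_fair(100, [20, 100, 100])
# User 1 has weight 1, user 2 has weight 2, and both want everything.
weighted = max_min_fair(90, [100, 100], weights=[1, 2])
```

In the first call the 13% that user 1 leaves unused is shared by the other two users; in the second the split follows the 1:2 weights.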

Max-Min Fairness

▸ 1 resource: CPU

▸ Total resources: 20 CPU

▸ User 1 has x tasks and wants 〈1CPU〉 per task

▸ User 2 has y tasks and wants 〈2CPU〉 per task

max (x, y)          (maximize allocation)
subject to
  x + 2y ≤ 20       (CPU constraint)
  x = 2y            (equal CPU share for both users)
so
  x = 10
  y = 5

Why is Fair Sharing Useful?

▸ Proportional allocation: user 1 gets weight 2, user 2 weight 1.

▸ Priorities: give user 1 weight 1000, user 2 weight 1.

▸ Reservations: to ensure user 1 gets 10% of a resource, give user 1 weight 10 and keep the sum of weights ≤ 100.

▸ Isolation policy: users cannot affect others beyond their fair share.

Properties of Max-Min Fairness

▸ Share guarantee
  • Each user can get at least 1/n of the resource.
  • But will get less if her demand is less.

▸ Strategy-proof
  • Users are not better off by asking for more than they need.
  • Users have no reason to lie.

▸ Max-min fairness is the only reasonable mechanism with these two properties.

▸ Widely used: OS, networking, datacenters, ...

Question?

When is Max-Min Fairness NOT Enough?

Need to schedule multiple, heterogeneous resources, e.g., CPU, memory, etc.

Problem

▸ Single resource example
  • 1 resource: CPU
  • User 1 wants 〈1CPU〉 per task
  • User 2 wants 〈2CPU〉 per task

▸ Multi-resource example
  • 2 resources: CPUs and memory
  • User 1 wants 〈1CPU, 4GB〉 per task
  • User 2 wants 〈2CPU, 1GB〉 per task
  • What is a fair allocation?

A Natural Policy (1/2)

▸ Asset fairness: give weights to resources (e.g., 1 CPU = 1 GB) and equalize the total value given to each user.

▸ Total resources: 28 CPU and 56GB RAM (e.g., 1 CPU = 2 GB)
  • User 1 has x tasks and wants 〈1CPU, 2GB〉 per task
  • User 2 has y tasks and wants 〈1CPU, 4GB〉 per task

▸ Asset fairness yields:

max (x, y)
subject to
  x + y ≤ 28        (CPU constraint)
  2x + 4y ≤ 56      (RAM constraint)
  4x = 6y           (equal asset value per user, with 1 CPU = 2 GB)

User 1: x = 12: 〈43%CPU, 43%GB〉 (∑ = 86%)
User 2: y = 8: 〈29%CPU, 57%GB〉 (∑ = 86%)
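
The asset-fairness numbers can be checked with exact arithmetic. The sketch below assumes the weighting 1 CPU = 2 GB from the example; the dictionaries `price`, `need_1`, `need_2` and the helper `task_value` are names invented for this illustration. It fixes the task ratio that equalizes asset value and then scales both users up until the first resource is exhausted.

```python
from fractions import Fraction

total = {"cpu": 28, "mem": 56}
price = {"cpu": 2, "mem": 1}          # assumed weights: 1 CPU is worth 2 GB
need_1 = {"cpu": 1, "mem": 2}         # User 1, per task
need_2 = {"cpu": 1, "mem": 4}         # User 2, per task


def task_value(need):
    """Asset value of one task under the assumed resource weights."""
    return sum(price[r] * need[r] for r in need)


# Equal asset value means 4x = 6y, i.e. x = ratio * y with ratio = 3/2.
ratio = Fraction(task_value(need_2), task_value(need_1))

# Grow y until some resource is exhausted (here memory binds first).
y = min(total[r] / (ratio * need_1[r] + need_2[r]) for r in total)
x = ratio * y

share_1 = {r: x * need_1[r] / total[r] for r in total}
share_2 = {r: y * need_2[r] / total[r] for r in total}
```

User 1 ends up with 3/7 of each resource, which is below one half of either.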

A Natural Policy (2/2)

▸ Problem: violates the share guarantee.

▸ User 1 gets less than 50% of both CPU and RAM.

▸ User 1 would be better off in a separate cluster with half the resources.

Challenge

▸ Can we find a fair sharing policy that provides:
  • Share guarantee
  • Strategy-proofness

▸ Can we generalize max-min fairness to multiple resources?

Proposed Solution

Dominant Resource Fairness (DRF)

Dominant Resource Fairness (DRF) (1/2)

▸ Dominant resource of a user: the resource that the user has the biggest share of.
  • Total resources: 〈8CPU, 5GB〉
  • User 1 allocation: 〈2CPU, 1GB〉: 2/8 = 25% CPU and 1/5 = 20% RAM
  • Dominant resource of User 1 is CPU (25% > 20%)

▸ Dominant share of a user: the fraction of the dominant resource she is allocated.
  • User 1 dominant share is 25%.
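
The two definitions above translate directly into a few lines of code. This is a minimal sketch; the function name `dominant_share` and the dictionary layout are assumptions made for this example.

```python
def dominant_share(allocation, total):
    """Return (dominant resource, dominant share) for one user's allocation."""
    shares = {r: allocation[r] / total[r] for r in total}
    resource = max(shares, key=shares.get)   # resource with the largest share
    return resource, shares[resource]


# The example above: total <8CPU, 5GB>, User 1 holds <2CPU, 1GB>.
resource, share = dominant_share({"cpu": 2, "mem": 1}, {"cpu": 8, "mem": 5})
```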

Dominant Resource Fairness (DRF) (2/2)

▸ Apply max-min fairness to dominant shares: give every user an equal share of her dominant resource.

▸ Equalize the dominant share of the users.
  • Total resources: 〈9CPU, 18GB〉
  • User 1 wants 〈1CPU, 4GB〉; dominant resource: RAM (1/9 < 4/18)
  • User 2 wants 〈3CPU, 1GB〉; dominant resource: CPU (3/9 > 1/18)

▸ max (x, y)
  subject to
    x + 3y ≤ 9        (CPU constraint)
    4x + y ≤ 18       (RAM constraint)
    4x/18 = 3y/9      (equal dominant shares)

User 1: x = 3: 〈33%CPU, 67%GB〉
User 2: y = 2: 〈67%CPU, 11%GB〉
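
For readers who want the intermediate steps, the solution of this small program can be worked out by substitution. The derivation below uses only the constraints stated above, with x and y the task counts of User 1 and User 2.

```latex
\begin{align*}
\frac{4x}{18} = \frac{3y}{9}
  &\;\Rightarrow\; x = \tfrac{3}{2}\,y \\
\text{CPU: } x + 3y \le 9
  &\;\Rightarrow\; \tfrac{9}{2}\,y \le 9
   \;\Rightarrow\; y \le 2 \\
\text{RAM: } 4x + y \le 18
  &\;\Rightarrow\; 7y \le 18
   \;\Rightarrow\; y \le \tfrac{18}{7} \\
\text{CPU binds first, so } y = 2,\quad x = 3
  &\;\Rightarrow\; \frac{4 \cdot 3}{18} = \frac{3 \cdot 2}{9} = \frac{2}{3}
\end{align*}
```

Both users end with a dominant share of 2/3, and the CPU is fully used while 4 GB of RAM remain free.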

Online DRF Scheduler

▸ Whenever there are available resources and tasks to run: schedule a task to the user with the smallest dominant share.
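
The rule above can be sketched as a loop that hands out one task at a time. This is a simplified illustration, not the Mesos allocator: the function name `online_drf`, the dictionary layout, and the choice to drop a user once her next task no longer fits are assumptions of this sketch, and every user is assumed to have unlimited pending tasks.

```python
def online_drf(total, demands):
    """Give one task at a time to the user with the smallest dominant share."""
    used = {r: 0 for r in total}
    tasks = {u: 0 for u in demands}
    dom = {u: 0.0 for u in demands}      # current dominant share per user
    active = set(demands)
    while active:
        # sorted() makes tie-breaking deterministic.
        user = min(sorted(active), key=lambda u: dom[u])
        need = demands[user]
        if any(used[r] + need[r] > total[r] for r in total):
            active.remove(user)          # this user's next task no longer fits
            continue
        for r in total:
            used[r] += need[r]
        tasks[user] += 1
        dom[user] = max(tasks[user] * need[r] / total[r] for r in total)
    return tasks


# Cluster <9CPU, 18GB>; User 1 needs <1CPU, 4GB> and User 2 <3CPU, 1GB> per task.
result = online_drf(
    {"cpu": 9, "mem": 18},
    {"user1": {"cpu": 1, "mem": 4}, "user2": {"cpu": 3, "mem": 1}},
)
```

On this input the loop alternates between the two users and stops once the CPUs are exhausted, with three tasks for User 1 and two for User 2.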

Two Resource Management Systems ...

▸ Mesos

▸ YARN

YARN

Yet Another Resource Negotiator

YARN Architecture

▸ Resource Manager (RM)

▸ Application Master (AM)

▸ Node Manager (NM)

YARN Architecture - Resource Manager (1/2)

▸ One per cluster
  • Central: global view
  • Enables global properties: fairness, capacity, locality

▸ Job requests are submitted to the RM.
  • To start a job (application), the RM finds a container in which to spawn the AM.

▸ Container
  • Logical bundle of resources (CPU/memory).

▸ No static resource partitioning.

YARN Architecture - Resource Manager (2/2)

▸ Only handles an overall resource profile for each application.
  • Local optimization is up to the application.

▸ Preemption
  • Request resources back from an application.
  • The application can checkpoint a snapshot or migrate computation to other containers, instead of having its jobs explicitly killed.

YARN Architecture - Application Master (1/2)

▸ The head of a job.

▸ Runs as a container.

▸ Requests resources from the RM.
  • Number of containers, resources per container, locality, ...

▸ Dynamically changes its resource consumption, based on the containers it receives from the RM.

YARN Architecture - Application Master (2/2)

▸ Requests are late-binding.
  • The process spawned is not bound to the request, but to the lease.
  • The conditions that caused the AM to issue the request may not remain true when it receives its resources.

▸ Can run any user code, e.g., MapReduce, Spark, etc.

▸ The AM determines the semantics of the success or failure of the container.

YARN Architecture - Node Manager (1/2)

▸ The worker daemon.

▸ Registers with the RM.

▸ One per node.

▸ Reports resources to the RM: memory, CPU, ...

▸ Containers are described by a Container Launch Context (CLC).
  • The command necessary to create the process
  • Environment variables
  • Security tokens
  • ...

YARN Architecture - Node Manager (2/2)

▸ Configures the environment for task execution.

▸ Garbage collection.

▸ Auxiliary services.
  • A process may produce data that persist beyond the life of the container.
  • For example, intermediate output passed between map and reduce tasks.

YARN Framework (1/2)

▸ Submitting the application: passing a CLC for the AM to the RM.

▸ When the RM starts the AM, the AM should register with the RM.
  • It periodically advertises its liveness and requirements over the heartbeat protocol.

YARN Framework (2/2)

▸ Once the RM allocates a container, the AM can construct a CLC to launch the container on the corresponding NM.
  • It monitors the status of the running container and stops it when the resource should be reclaimed.

▸ Once the AM is done with its work, it should unregister from the RM and exit cleanly.
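
The request-based life cycle (register, request containers, launch, unregister) can be mimicked with a toy model. Nothing below is the Hadoop YARN API: `ToyResourceManager`, `Container`, and their methods are hypothetical stand-ins written for this sketch, and Node Manager capacity is reduced to a per-node list `[cpus, mem_gb]`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Container:
    node: str
    cpus: int
    mem_gb: int


class ToyResourceManager:
    """Hypothetical stand-in for the RM: tracks free capacity reported per node."""

    def __init__(self, free: Dict[str, List[int]]):
        self.free = free                  # node -> [cpus, mem_gb]
        self.registered = set()

    def register(self, am: str) -> None:
        self.registered.add(am)

    def unregister(self, am: str) -> None:
        self.registered.discard(am)

    def allocate(self, am: str, count: int, cpus: int, mem_gb: int) -> List[Container]:
        """Answer a request with as many containers as currently fit."""
        if am not in self.registered:
            raise RuntimeError("the AM must register before requesting resources")
        granted: List[Container] = []
        for node, cap in self.free.items():
            while len(granted) < count and cap[0] >= cpus and cap[1] >= mem_gb:
                cap[0] -= cpus
                cap[1] -= mem_gb
                granted.append(Container(node, cpus, mem_gb))
        return granted


rm = ToyResourceManager({"node1": [2, 4], "node2": [4, 16]})
rm.register("am-1")
containers = rm.allocate("am-1", count=3, cpus=2, mem_gb=4)
# A real AM would now build one CLC per granted container and hand it to that node's NM.
placements = [c.node for c in containers]
rm.unregister("am-1")
```

Unlike the offer-based model, the application states what it needs and the central manager decides where the containers come from.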

Mesos vs. YARN

▸ Similarities:
  • Both have schedulers at two levels.

▸ Differences:
  • Mesos is an offer-based resource manager, whereas YARN has a request-based approach.
  • Mesos uses framework schedulers for inter-job scheduling, whereas YARN uses per-job optimization through the AM (however, a per-job AM has higher overhead compared to Mesos).

Summary

▸ Resource management: Mesos and YARN

▸ Mesos
  • Offer-based
  • Max-min fairness: DRF

▸ YARN
  • Request-based
  • RM, AM, NM

Questions?

Acknowledgements

Some slides were derived from the slides of Ion Stoica and Ali Ghodsi (UC Berkeley) and of Wei-Chiu Chuang (Purdue University).
