Mesos Con 2016 Optimistic Offer

Optimistic Offer - What does it mean to Mesos Frameworks? Yong Feng, MesosCon North America 2016, 6/01/2016

Transcript of Mesos Con 2016 Optimistic Offer

Page 1: Mesos Con 2016 Optimistic Offer

Optimistic Offer - What does it mean to Mesos Frameworks?

Yong Feng

MesosCon North America 2016, 6/01/2016

Page 2: Mesos Con 2016 Optimistic Offer

Overview

• The Pessimistic Offer Programming Model

  • Implementing frameworks to use Pessimistic Offer

  • Observed limitations and proposed mitigations

• The Optimistic Offer Programming Model

  • Benefits

  • Design Plan

  • Development status

• Optimistic vs Pessimistic Offer

• Discussion and Additional Resources

Page 3: Mesos Con 2016 Optimistic Offer

The Pessimistic Offer Programming Model

• Offer: allocate available resources in a host to a single framework. Allocated resources are locked (cached) for a period of time.

• Resources are allocated or partitioned among frameworks without any knowledge of a framework’s requirements

• Resources in an offer are either consumed by launching a task, or freed when the offer is rejected or rescinded

• Frameworks cannot consume or reject parts of an offer while still keeping the left over resources of that offer

• Resources are only available for re-allocation after the offer is rejected/rescinded or the task/executor is finished

• Inverse Offer: deallocate (or reclaim) resources from a framework

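[Figure: resource state diagram. States: Free, Allocated, Used. Transitions: offer to framework (Free → Allocated); framework launches task (Allocated → Used); reject or rescind (Allocated → Free); task or executor finished (Used → Free)]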
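
The resource lifecycle on this slide can be read as a small state machine. The sketch below is only an illustration of that reading, not Mesos code: the `State` enum, the event names and the `step`/`run` helpers are all hypothetical.

```python
from enum import Enum


class State(Enum):
    FREE = "free"
    ALLOCATED = "allocated"
    USED = "used"


# Allowed transitions, keyed by (current state, event).
# Event names are illustrative labels, not Mesos API calls.
TRANSITIONS = {
    (State.FREE, "offer"): State.ALLOCATED,
    (State.ALLOCATED, "launch_task"): State.USED,
    (State.ALLOCATED, "reject"): State.FREE,
    (State.ALLOCATED, "rescind"): State.FREE,
    (State.USED, "task_finished"): State.FREE,
}


def step(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


def run(events):
    """Apply a sequence of events to a resource that starts out FREE."""
    state = State.FREE
    for event in events:
        state = step(state, event)
    return state
```

Note that there is no path from FREE directly to USED: a resource has to be offered to exactly one framework first, which is the "pessimistic" lock the later slides discuss.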

Page 4: Mesos Con 2016 Optimistic Offer

Framework Implementation: Cache Offers

• What if a framework has no suitable demand when an offer is received?

  • They cache it!

• Caching resources can lead to:

  • Better performance for future demand

  • Better scheduling decisions with more candidate hosts, for example affinity …

  • Stockpiled resources for future tasks that have higher demands

• Still want to be a good citizen?

  • Allow for TTL (expiration) of cached offers

• Frameworks that support TTL of cached offers:

  • Swarm

  • Kubernetes
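
A framework-side offer cache with a TTL might look roughly like the sketch below. It is a minimal illustration under assumptions: `CachedOffer`, `OfferCache` and their fields are hypothetical names, and it is not how Swarm or Kubernetes actually implement their caches.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedOffer:
    offer_id: str
    cpus: float
    mem_mb: float
    received_at: float


class OfferCache:
    """Hypothetical framework-side cache that holds offers for at most ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the cache can be tested
        self._offers = {}

    def add(self, offer_id, cpus, mem_mb):
        self._offers[offer_id] = CachedOffer(offer_id, cpus, mem_mb, self.clock())

    def expire(self):
        """Remove and return offers older than the TTL so the caller can decline them."""
        now = self.clock()
        expired = [o for o in self._offers.values() if now - o.received_at >= self.ttl]
        for offer in expired:
            del self._offers[offer.offer_id]
        return expired

    def take(self, cpus, mem_mb):
        """Remove and return the first cached offer that fits the request, or None."""
        for offer in list(self._offers.values()):
            if offer.cpus >= cpus and offer.mem_mb >= mem_mb:
                del self._offers[offer.offer_id]
                return offer
        return None
```

A scheduler loop would call `expire()` periodically and decline whatever it returns, so that cached resources go back to Mesos instead of being held indefinitely.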

Page 5: Mesos Con 2016 Optimistic Offer

Framework Implementation: Revive Offers

• What if there is no demand to consume the offers in a framework?

  • Reject it! (or send a suppress request)

• Rejecting or suppressing unwanted resources allows for:

  • Better resource utilization by allocating resources to frameworks that have a demand

• What if there is a demand in a framework, but no offer?

  • Revive the offer

  • Send a request to Mesos to reallocate the offer

• Frameworks that support Suppress/Revive offer:

  • Marathon
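
The reject/suppress/revive behaviour above can be sketched as follows. The method names `declineOffer`, `suppressOffers` and `reviveOffers` mirror calls on the Mesos scheduler driver, but `RecordingDriver` is a hypothetical stub with simplified signatures (the real decline call also takes filters), and `DemandAwareScheduler` is an illustrative class, not Marathon's implementation.

```python
class RecordingDriver:
    """Hypothetical stand-in for a Mesos scheduler driver; it only records calls."""

    def __init__(self):
        self.calls = []

    def declineOffer(self, offer_id):
        self.calls.append(("decline", offer_id))

    def suppressOffers(self):
        self.calls.append(("suppress",))

    def reviveOffers(self):
        self.calls.append(("revive",))


class DemandAwareScheduler:
    def __init__(self, driver):
        self.driver = driver
        self.pending = []        # tasks waiting for resources
        self.suppressed = False

    def submit(self, task):
        self.pending.append(task)
        if self.suppressed:
            # Demand appeared while offers were suppressed: ask for offers again.
            self.driver.reviveOffers()
            self.suppressed = False

    def on_offer(self, offer_id):
        if self.pending:
            # The caller would launch this task on the offer.
            return self.pending.pop(0)
        # No demand: give the resources back and stop further offers.
        self.driver.declineOffer(offer_id)
        if not self.suppressed:
            self.driver.suppressOffers()
            self.suppressed = True
        return None
```

Tracking the `suppressed` flag keeps the framework from sending a revive on every submission; it only revives when it had previously told Mesos to stop sending offers.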

Page 6: Mesos Con 2016 Optimistic Offer

Pessimistic Offer: Limitations

Inefficient resource allocation: Cached resources are not released to other frameworks in time.

Mitigation: Over a period of time, each offer only includes small pieces of each resource in a host. To fix this, offer TTL must be enforced by Mesos and Mesos frameworks.

[Figure: average time to receive offers (ms) for multiple client frameworks, Pessimistic Offer (PO) vs Optimistic Offer (OO)]

Page 7: Mesos Con 2016 Optimistic Offer

Pessimistic Offer: Limitations (Cont’d)

Poor scheduling quality: Offered resources may not be the most suitable resources.

Mitigation: Frameworks should delay the scheduling decision until they get suitable resources.

[Figure: job duration time (ms) for multiple client frameworks, Pessimistic Offer (PO) vs Optimistic Offer (OO)]

Page 8: Mesos Con 2016 Optimistic Offer

Pessimistic Offer: Limitations (Cont’d)

Low QoS guarantee: Long tasks prevent resource re-allocation. It is hard to preempt resources among frameworks.

Mitigation: Implement quotas, reservations, oversubscription …

https://www.youtube.com/watch?v=jC8lhGQN2Sc

Page 9: Mesos Con 2016 Optimistic Offer

Can We Squeeze More Out of Pessimistic Offers?

• Adjust the size of an offer (Mesos-3765)

  • Smaller offers improve resource utilization and fairness

• Oversubscription for reservation (Mesos-4967)

  • Lend reserved resources to other frameworks when they are not being used. This improves resource utilization without impacting QoS

• Reuse the recovered resource without delay (Mesos-3078 or Mesos-4811)

  • Accelerate scheduling so that resources can be used as soon as they become available

• More hints in filter when rejecting offer

  • Avoid ping-pong of resources between Mesos and framework to improve performance and avoid starvation

Page 10: Mesos Con 2016 Optimistic Offer

The Optimistic Offer Model

Offer: displays all available resources to multiple frameworks.
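
To make the contrast with the pessimistic model concrete, here is a toy sketch in which every framework sees the same free resources and a claim is only validated when a task is launched. `OptimisticMaster` and its methods are hypothetical; the real design and its conflict resolution are described on the later design-plan slides.

```python
class OptimisticMaster:
    """Toy master: every framework sees all free resources; claims are checked at launch."""

    def __init__(self, free_cpus):
        self.free = dict(free_cpus)   # host -> free CPUs

    def offer(self):
        # All frameworks get the same snapshot; nothing is locked by offering.
        return dict(self.free)

    def launch(self, host, cpus):
        """Commit a claim. Returns False on a conflict (resources already taken)."""
        if self.free.get(host, 0) < cpus:
            return False
        self.free[host] -= cpus
        return True
```

If two frameworks both try to launch a 3-CPU task on a 4-CPU host from the same snapshot, the first `launch` succeeds and the second returns `False`; the losing framework then has to retry against a refreshed offer.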

Page 11: Mesos Con 2016 Optimistic Offer

Benefits of the Optimistic Offer Model

• Better resource utilization

  • Available resources are always visible to all frameworks

• Improved scheduling performance

  • Resources recovered from finished tasks are visible to all frameworks immediately

• Enhanced quality of scheduling decisions

  • Increased availability of resources helps frameworks to make better scheduling decisions

• QoS guarantee

  • QoS of workload is enforced by pre-emption

Page 12: Mesos Con 2016 Optimistic Offer

The Optimistic Offer - Design Plan

• Track both offered resources and consumed resources

  • Offered resources are not allocated resources. Offered resources are viewed as available resources to all frameworks

  • Consumed resources are equivalent to allocated resources in the Pessimistic Offer model; however, they might still be viewed as available resources to some frameworks under special conditions.

• Offered resources are based on resource plan or usage instead of allocation

  • Available resources are visible to frameworks via Offer

  • Available resources are decided by resource plan and resource usage

• Refresh resource and quota availability in a timely manner

  • Update resource availability and status to frameworks by either rejecting the offer from the framework or rescinding the offer from the master

  • Update resource plan availability with quota or restrictions to each framework

• Pre-empt tasks to enforce the QoS

  • Pre-empt tasks from lower-priority or overusing frameworks to enforce QoS

  • Export the resource status to frameworks to achieve “smart” pre-emption
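
One possible reading of "pre-empt tasks from lower-priority frameworks" is sketched below. The slides do not specify a selection policy, so the function name `pick_victims`, the tuple layout and the lowest-priority-first rule are all assumptions made for illustration.

```python
def pick_victims(tasks, needed_cpus, requester_priority):
    """Choose running tasks to pre-empt, lowest-priority frameworks first.

    tasks: list of (task_id, framework_priority, cpus); a larger number
    means a higher priority. Returns task ids, or [] if the request
    cannot be satisfied by pre-empting lower-priority tasks only.
    """
    candidates = sorted(
        (t for t in tasks if t[1] < requester_priority),
        key=lambda t: t[1],
    )
    victims, freed = [], 0.0
    for task_id, _priority, cpus in candidates:
        if freed >= needed_cpus:
            break
        victims.append(task_id)
        freed += cpus
    return victims if freed >= needed_cpus else []
```

Tasks owned by frameworks at the same or a higher priority than the requester are never selected, so a request that would need them yields an empty list rather than a partial pre-emption.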

Page 13: Mesos Con 2016 Optimistic Offer

The Optimistic Offer - Design Plan (Cont’d)

• Programming model:

  • Allow frameworks to use offers without the need for rejection of the offer from a prior framework

  • Refresh the status of a resource by handling rescind messages or rejected offers

  • Use non-revocable resources first, then revocable resources

  • Handle the inverse offer smartly for pre-emption
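
The "non-revocable first, then revocable" rule could be applied on the framework side roughly as follows. This is a sketch under assumptions: `plan_usage` and the `(cpus, revocable)` pair layout are hypothetical and only model CPUs.

```python
def plan_usage(offered, needed_cpus):
    """Split a CPU request across offered resources, non-revocable first.

    offered: list of (cpus, revocable) pairs. Returns the chosen pieces
    as (cpus_taken, revocable), or None if the offer is too small.
    """
    # False sorts before True, so non-revocable entries come first.
    ordered = sorted(offered, key=lambda r: r[1])
    plan, remaining = [], needed_cpus
    for cpus, revocable in ordered:
        if remaining <= 0:
            break
        take = min(cpus, remaining)
        plan.append((take, revocable))
        remaining -= take
    return plan if remaining <= 0 else None
```

Preferring non-revocable resources keeps as little of a task as possible exposed to pre-emption; revocable resources are only used to cover the remainder.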

Page 14: Mesos Con 2016 Optimistic Offer

The Optimistic Offer - Development Status

• Mesos-1607 Phase 1 (renamed as “Oversubscription for reservation” in Mesos-4967)

• Target: ~4Q 2016

• Status: reviewable; ~30 patches are ready for review. Try it after applying the patches

[Figure: resource diagram with the labels Total Resources, Reserved, Allocated, Used, Revocable]

Page 15: Mesos Con 2016 Optimistic Offer

The Optimistic Offer - Development Status (Cont’d)

• Mesos-1607 Prototype

• Target: IBM Mesos Connector

• Design: adjust allocator API to track consumed resources and resolve resource races

• Status: In Bluemix Container Cloud Service, and planned in Mesos roadmap.

Page 16: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: number of tasks per ms, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks]

Page 17: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: number of tasks per ms, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks and small TTL]

Page 18: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: number of tasks per ms, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks when placement constraint is added]

Page 19: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: job duration (ms) for multiple clients, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks]

Page 20: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: job duration (ms) for multiple clients, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks and smaller TTL]

Page 21: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: slave utilization, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks]

Page 22: Mesos Con 2016 Optimistic Offer

Optimistic Offer vs Pessimistic Offer Model

[Figure: slave utilization, Pessimistic Offer (PO) vs Optimistic Offer (OO). Panels: idle frameworks; idle frameworks with smaller tasks and smaller TTL]

Page 23: Mesos Con 2016 Optimistic Offer

Evaluation – Conflicts

• 4 Swarm frameworks

• 10 nodes, 1000 containers (tasks)

• Task request rejected after 10 placement retries

[Figure (a): system load = 0.8 and varied request resource size. Conflicted tasks (%) and rejected tasks (%) by task resource request size (small, medium, large, varied). Y axis: percentage of tasks, 0 to 25]

[Figure (b): system load = 0.4, 0.8, 0.99 and random request resource size. Conflicted tasks (%) and rejected tasks (%) by system load (40%, 80%, 99%). Y axis: percentage of tasks, 0 to 16]
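
The retry rule used in this evaluation can be sketched as a bounded loop. The sketch assumes that "10 placement retries" means at most 10 placement attempts in total, which the slide does not state precisely; `place_with_retries` and the `try_place` callback are hypothetical names.

```python
MAX_RETRIES = 10  # from the slide: a request is rejected after 10 placement retries


def place_with_retries(try_place, max_retries=MAX_RETRIES):
    """Call try_place() until it succeeds or the retry budget is used up.

    try_place is a caller-supplied function returning True on success and
    False on a conflict. Returns (placed, conflicts).
    """
    conflicts = 0
    for _ in range(max_retries):
        if try_place():
            return True, conflicts
        conflicts += 1
    return False, conflicts
```

Under this reading, a task counts as "conflicted" if at least one attempt failed and as "rejected" if every attempt within the budget failed.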

Page 24: Mesos Con 2016 Optimistic Offer

Optimistic Offer in IBM Bluemix Container Cloud

[Figure: architecture diagram. Components: Docker CLI/API, Swarm scheduler (framework), Kubernetes CLI/API, Kubernetes scheduler (framework), Mesos master, Mesos agents, network agents. Labeled flows: offers to each framework; tasks to Mesos]

Page 25: Mesos Con 2016 Optimistic Offer

Open Questions

• How to export more policy status and configuration to frameworks so that frameworks can make better decisions?

• How to reduce scheduling overheads, such as resource conflicts?

• …

Page 26: Mesos Con 2016 Optimistic Offer

IBM Booth

Welcome to S5

• Spark benchmark with IBM Session Scheduler

• Bluemix Container Service

• GPU in Power

• Mesos Connector

Page 27: Mesos Con 2016 Optimistic Offer

Mesos Community Activities

• Active development with Mesos community – 11 IBM Developers.

• 100+ JIRAs delivered or in progress

• Leading or participating in several work streams: POWER Support, Optimistic Offers, Container Support, GPU Support, Swarm and Kubernetes integration

• Relationship with Mesosphere – weekly calls, on-site developer presence

• Attendance at MesosCon 2016 with sponsorship and booth

• Aligning with IBM Container Service to leverage common OSS technologies

• Technical Preview of Mesos with IBM Value-Add (ASC) on Docker Hub – Both x86 and POWER images

Page 28: Mesos Con 2016 Optimistic Offer

IBM Committed Mesos Patches

For the period June 2015 to May 2016

Page 29: Mesos Con 2016 Optimistic Offer

Thank you